
Nick, are you an insomniac? This sounds like my attempts to construct rhopalic sentences at 3 a.m. From your post I assume you are referring to liquid water--otherwise the Himalayas, the source of the Ganges et al., would defeat your elevation hypothesis.

Shangwen: not (currently) insomniac, but I have gotten sort of obsessive about this question.

Yep, I should have mentioned that I'm assuming all water is liquid: no ice or snow.

The rules seem to be different for mountainous regions vs. flat; Google for

Seekell "A fractal-based approach to lake size distributions"

(PDF at virginia.edu). Not sure it'll help, but that's where I'd start.

Tom: lovely find! A bit tough going (for me) but it seems to be in the right area.

9% IS surprisingly low. So someone with high liquidity preference should move to Finland?

"I am trying to calculate the percentage of a country that would be covered by water under certain simplifying assumptions."

At the Water Survey of Canada, back in the 1970s, we used maps instead of simplifying assumptions. Thank God I'm a geographer.

Wiki says that 25% of Finland's Lakeland region is lakes. But then, with a name like that, I strongly suspect sample selection bias! The article Tom found says it's around 3% across all countries in the world.

BTW, the link to macro, in case it wasn't obvious: recessions are like lakes.

Hmm
Now I'm curious how the Dwarf Fortress game generates its worlds.

If its assumptions are similar to yours, you could just use a few randomly generated worlds and see what the % of lakes would be.

Sandwichman: From the research Tom suggests, it seems that geographers who read journals like "Geophysical Research Letters" are interested in both maps and models, and then comparing the two, and seeing what they can learn from the comparison. Like me.

Simon: neat idea. Solve it by simulation. But (I think) it ought to be solvable analytically. (But I'm bad at math).

Nick, another way of thinking about the problem. Pick a point in Canada at random. The point is, by definition, either lake or not lake. If it's a lake, then it's of some depth. By knowing how the depth of a point is correlated with the depth of all surrounding points, one (one in this case meaning someone other than you or me) could work out the expected size of a randomly selected lake. One could then work out the expected size of a randomly selected area of non-lake. If the two are the same, then 50% of the land is lake.

One obvious reason for the expected size of the lakes to be smaller than the expected size of the not-lakes is that there's not much water. The way you've stated the problem (no evaporation, lots of rainfall, no rivers) that shouldn't be an issue.

Another reason that the two might differ in expected size is that the geological process through which the two are formed is different. E.g. lakes are formed by crashing meteorites, mountains are formed by volcanoes or the movement of tectonic plates.

Don't know if this helps.

Nick: Here's how I'd do it by simulation. A Python coder should be able to do that in 15 minutes. I'd need a few hours.

Generate an NxN grid of cells.
Each cell has 4 properties:
Can flow Up (true/false)
Can flow Down (true/false)
Can flow Left (true/false)
Can flow Right (true/false)
These properties are set randomly, except for the border cells, which have at least one side set to true (the ocean).

From each cell, perform a breadth-first search to the ocean, trying every option.
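A minimal Python sketch of the grid search described above. Two things here are assumptions and not part of the description: each edge between neighbouring cells gets a single random flow direction, and every border cell drains straight into the ocean. The function name `simulate_stuck_fraction` is made up for illustration. Instead of one search per cell it runs a single breadth-first search backwards from the ocean, which should find the same set of cells.

```python
import random
from collections import deque


def simulate_stuck_fraction(n, seed=0):
    """Fraction of cells with no flow path to the ocean on an n x n grid."""
    rng = random.Random(seed)
    # Assumption: one random flow direction per edge between neighbours.
    # flows_right[r][c] is True if water flows from (r, c) to (r, c + 1).
    flows_right = [[rng.random() < 0.5 for _ in range(n - 1)] for _ in range(n)]
    # flows_down[r][c] is True if water flows from (r, c) to (r + 1, c).
    flows_down = [[rng.random() < 0.5 for _ in range(n)] for _ in range(n - 1)]

    def upstream(r, c):
        """Yield the neighbours whose water can flow into cell (r, c)."""
        if c > 0 and flows_right[r][c - 1]:
            yield r, c - 1
        if c < n - 1 and not flows_right[r][c]:
            yield r, c + 1
        if r > 0 and flows_down[r - 1][c]:
            yield r - 1, c
        if r < n - 1 and not flows_down[r][c]:
            yield r + 1, c

    # Assumption: every border cell drains directly into the ocean.
    drained = [[r in (0, n - 1) or c in (0, n - 1) for c in range(n)]
               for r in range(n)]
    queue = deque((r, c) for r in range(n) for c in range(n) if drained[r][c])
    # Breadth-first search backwards from the ocean along the flow directions.
    while queue:
        r, c = queue.popleft()
        for ur, uc in upstream(r, c):
            if not drained[ur][uc]:
                drained[ur][uc] = True
                queue.append((ur, uc))

    stuck = sum(not cell for row in drained for cell in row)
    return stuck / (n * n)
```

Calling `simulate_stuck_fraction(300)` would give the share of cells that cannot reach the ocean under these particular assumptions. Because flow directions are drawn without any underlying altitude, the grid can contain loops that no real terrain would have.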

I'm too old for solving analytically :)

Nick: A map is a model.

I think your argument for P*=.5 works, at least in the 1D case, and probably in general. Model this as a simple random walk, where each point is ±1 from its neighbors with probability .5 each. To be an island you also need the first and last segments to go up, but this won't matter in the limit. This meets your assumptions, but is not fully general.

Now, it's not too hard to believe that any random walk is equally likely as the random walk where each elevation is negated (assuming the starting point is 0). Ignoring first and last pieces and switching the middle gives another equally likely island. The average water coverage of these two is .5 because every valley becomes a hill and every hill a valley (as you said). Averaging every island with its mirror counts all islands and gives P*=.5.

This feels like it should also work in 2D, but since it's not quite a random walk I'm less sure. If you change the probability distribution on the conditional neighbor heights to be some continuous distribution with mean 0 and positive probability everywhere, I think the mirror island should again have the same probability as the original, so everything would go through the same.

Frances: I sort of like that idea. But there's one asymmetry between lakes and non-lakes. There is one big non-lake in Canada, and lots of little non-lakes that are islands in lakes. I think your idea works for islands in lakes, but the big non-lake is an island in the ocean. My brain hurts.

It's like Zaq's idea, which is like turning Canada upside-down, and asking if it looks the same. (Which is how I was thinking about it too, except for the problem that Canada would be under the ocean, except for the extreme coast, because I was thinking if business cycles would look the same upside-down, and how we could measure the difference.) But I still think Zaq's intuition is basically right. I was also trying to figure out if what works in 1D would also work in 2D (or maybe it should be 2D and 3D). There are only 2 places water can flow out of a 2D U, and it will flow out of the lowest side. But there is a whole circle of places water can flow out of a 3D U, and it will flow out the lowest point.

Simon: if it would take you a few hours, it would probably take me a few months!

P* of 0.5 seems far too high for area which is two dimensional since only one route out of an area needs to be lower to allow it to drain. The drain could be almost one dimensional within your correlation distance, so perhaps 1 in 5, 0.2, or 1 in 9, 0.11. Then there is the problem of average elevation when the processes that form it, uplifting and erosion, generate slopes from mountain ranges to sea and generally below sea to the continental shelf. Then even if one area is locally high, it may be surrounded by even higher areas and submerged, while even if one area is locally low, it may be surrounded by even lower areas promoting erosion.

Monte Carlo simulation: Just shoot points at random at a map of Canada. Really fine darts, say. Virtual darts, the centers of randomly selected Google images. They wouldn't even have to be the same scale, just zero in on the center. Add up the total and find the percentage that hits water. With a computer, it should probably be easy to come up with a better than one significant digit answer. Say one thousand shots? Maybe more.

One issue is how far out to sea do you want to count? The interior of Hudson's Bay, etc?
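The dart-throwing idea could be sketched in Python roughly as follows. The `is_water` callback stands in for a real map lookup, which is not shown here; `toy_is_water` is a made-up stand-in (a circular lake in a unit square) so the sketch can run on its own. All names are hypothetical.

```python
import random


def estimate_water_fraction(is_water, bounds, shots=1000, seed=0):
    """Estimate the water-covered share of a rectangle by random sampling.

    is_water(x, y) is a hypothetical lookup returning True for water.
    bounds is (x_min, x_max, y_min, y_max).
    """
    rng = random.Random(seed)
    x_min, x_max, y_min, y_max = bounds
    hits = 0
    for _ in range(shots):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        if is_water(x, y):
            hits += 1
    return hits / shots


def toy_is_water(x, y):
    """Toy map: one circular lake of radius 0.5 centred in the unit square."""
    return (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25
```

For example, `estimate_water_fraction(toy_is_water, (0.0, 1.0, 0.0, 1.0), shots=20000)` should land near pi/4, the true share for the toy map. The sampling error of such an estimate is roughly sqrt(p(1-p)/n), so with a true share near 9% and one thousand shots the answer would be good to about one percentage point. On a real map, uniform latitude and longitude draws are not equal-area, so the sampling would need adjusting.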

Lord: "P* of 0.5 seems far too high for area which is two dimensional since only one route out of an area needs to be lower to allow it to drain."

Yep, if space were cubic, like in my attempted "solution" in the post, there would only be a 1/8 probability that one point were lower than all 4 adjacent points. But if all 4 adjacent points were wet (which has a probability P*/4), our point would also be wet if it were lower than the highest of those 4 adjacent points. So P* must be greater than 1/8.

Rivers erode the lowest point surrounding a lake, which makes elevation non-random (contrary to my assumption). But what this means is that P*-Pi gives us a measure of how much erosion due to rivers there has been.

@Zaq:

I think your argument for P*=.5 works, at least works in the 1D case, and probably in general.

Actually, in the 1D case I think Nick's model produces P* >> .5. *Any* depression either causes everything to its left or everything to its right to become covered in water, so the only place that even can be water-free is the highest point on the island and some of its surrounding area.

Actually, at that point you presumably just call that water "ocean" and get P* = 0. So the 1D case is degenerate. (Sorry for double post but didn't realize this until after.)
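One way to poke at the 1D case is a small Python sketch. It assumes the two end cells are the coast, that water spills off either end, and that elevation is a ±1 random walk; the function names are invented for illustration. A cell is treated as wet when the lower of the highest ground to its left and the highest ground to its right stands above it.

```python
import random
from itertools import accumulate


def wet_cells_1d(heights):
    """Return a list of booleans: True where the cell ends up under water."""
    # Highest ground seen so far, scanning from the left and from the right.
    left_max = list(accumulate(heights, max))
    right_max = list(accumulate(reversed(heights), max))[::-1]
    # Water rises to the lower of the two barriers before spilling out.
    return [min(l, r) > h for h, l, r in zip(heights, left_max, right_max)]


def random_walk_profile(n, seed=0):
    """Elevation profile where each cell is +1 or -1 from its neighbour."""
    rng = random.Random(seed)
    heights = [0]
    for _ in range(n - 1):
        heights.append(heights[-1] + rng.choice((-1, 1)))
    return heights


def wet_fraction_1d(heights):
    """Share of cells that are under water."""
    wet = wet_cells_1d(heights)
    return sum(wet) / len(wet)
```

Under this reading, the only dry cells are those that set a new running maximum from one side or the other, which is in line with the point that only the highest ground and its approaches stay dry. Running `wet_fraction_1d(random_walk_profile(n))` for growing `n` is one way to see how the share behaves.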

I'd recommend not reinventing this wheel.

You're probably going to have to deal with fractals and power law distributions. This stuff shows up a lot - length of coastline, branches of a tree, size of a circulatory system, price movements, surface area of many natural objects...

and Geophysical Research Letters is a good place to look, but the math is a bit specialized.

"Keywords:

size distribution;
lakes;
global limnology;
fractal geometry;
power law;
Pareto distribution

[1] The abundance and size distribution of lakes is critical to assessing the role of lakes in regional and global biogeochemical processes. Lakes are fractal but do not always conform to the power law size-distribution typically associated with fractal geographical features. Here, we evaluate the fractal geometry of lakes with the goal of explaining apparently inconsistent observations of power law and non–power law lake size-distributions. The power law size-distribution is a special case for lakes near the mean elevation. Lakes in flat regions are power law distributed, while lakes in mountainous regions deviate from power law distributions. Empirical analyses of lake size data sets from the Adirondack Mountains in New York and the flat island of Gotland in Sweden support this finding. Our approach provides a unifying framework for lake size-distributions, indicates that small lakes cannot dominate total lake surface area, and underscores the importance of regional hypsometry in influencing lake size-distributions."

This is a good start on the math and some applications:

It is going to be way less than p=0.5

you need to find bowl shaped locations and not just U shaped...

Through any piece of territory, we can cut an east-west cross section, and a north-south cross section.

In order to hold water it must be U shaped both on the north-south cross section and on the east-west cross section. That suggests P=0.25 at best. Then it must not be so steep that the water just slides off and goes somewhere else, nor so flat that it is just a puddle and evaporates away.

Add to that erosion -- over time, debris fills in lakes and turns them into meadows, and cuts rivers into the flatter parts, carrying the water off to the sea.

Lastly, there is going to be some element of politics. Does the border of Canada end at the shore of Lake Superior, or does Canada get half that surface area and the US get the other half? How about the Caspian Sea -- yes, it is called a sea, but technically it is a lake. I don't think that the countries bordering the Caspian make a claim to a significant fraction of the surface area.

In 1991, the Caspian was treated as interior waters. Draw a perpendicular to the shore and go out till you meet another perpendicular halfway from each shore. There are no international waters on the Caspian.

“Evaporation is very small relative to rainfall, and the surface is impermeable to water. . . .” These assumptions look very unrealistic, accounting for the discrepancy between your figure and Wikipedia’s. The combination of evaporation and descending into underground aquifers results in there being many dry regions that are surrounded by higher contour lines.

Doug: in 1991, as before, the Caspian was treated as internal waters. Sectors were drawn thus: a perpendicular is drawn from the shore till it meets the opposing perpendicular midway.

Isn't that (1/2)^4, 1/16? Then there is the beaver effect.

Nick,

Lakes are a temporary geological feature, because they tend to fill in with debris.

Also, tectonic processes have a preference for creating peaks rather than holes, because air is easier to push aside than rock. This means mountain ranges, which means rivers rather than lakes.

And evaporation matters -- Australia has a reasonable amount of lake area, but a lot of them are empty most of the time.

However, for a recent, flat, wet land surface I could believe a 25% lake figure based on geometrical arguments, at least before farmers start digging ditches.

Further north in Canada, although I haven't actually been there, my impression was that there are large areas that transcend the lake/island dichotomy by being ice during the winter and mosquito-infested bog during the summer.

I'm unsure what you mean by "surface area." Do you mean the surface area of a 3-dimensional surface, or the area of the 2D surface defined by the country's border? Are lakes included when you measure surface area?

If you mean the 3D surface area, then you can use fractals to get infinite surface area at any average elevation.

I meant to add: if you mean 2D surface area, then you can get infinite elevation at any surface area. Your assumptions seem to prohibit cliffs, but you can still get arbitrarily close to cliff.

Lord: "Isn't that (1/2)^4, 1/16?"

Damn. I think you are right. I can't even do arithmetic!
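Spelling out the corrected arithmetic, on the assumption (taken from the post's setup) that each of the 4 adjacent points is independently higher or lower with probability 1/2:

$$
\Pr(\text{a point is lower than all 4 adjacent points}) = \left(\tfrac{1}{2}\right)^{4} = \tfrac{1}{16}
$$

So the lower bound from the earlier comment would read P* > 1/16 and not P* > 1/8.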

"Then there is the beaver effect."

Beavers don't exist in a world where elevation is random. The difference between P* and Pi would also tell us something interesting about the prevalence of beavers.

oblivious. I think we can allow almost-cliffs, but overhangs would be tricky to model. I was thinking about "surface area" as a 2D photo from space, or area on a map. But (almost) any way you want to model it to get a tractable problem is OK with me.

Thomas: "However, for a recent, flat, wet land surface I could believe a 25% lake figure based on geometrical arguments,..."

Are you able to explain why you think that 25% might be right? (Not that I can really explain why I think 50% might be right!)

Yep, erosion of rivers and filling in of lakes would tend to reduce it over time. So that P*-Pi would give us a measure of the "age" of a landscape.

Peter N: I thought I might be (trying to) reinvent the wheel. But if I am reinventing the wheel, someone by now would have told me what the answer is, I think.

Do you want an answer for real geography or are you more interested in your random matrix model?

Isolated wet areas are enclosed by continuous contour lines. Each such line defines a potential lake at that level. Areas can also be wet at a particular level by having a monotonic path to a sink (sea or dry lake) at that level. If you have 0 evaporation and 0 absorption, all possible enclosed contour lines will be full. The level of lakes with outlets will depend on the relative flow rates of the outlets, though, of course, at maximum fill, all lakes will have outlets.
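The "maximum fill" picture can be sketched in Python with a priority-flood pass over a grid of altitudes. This is one standard way to fill depressions, not something proposed in the thread, and it assumes that every border cell touches the ocean and that water moves between the four orthogonal neighbours only. The name `flooded_fraction` is made up for illustration.

```python
import heapq


def flooded_fraction(heights):
    """Share of cells under water once every closed basin is filled to its spill point."""
    rows, cols = len(heights), len(heights[0])
    level = [[None] * cols for _ in range(rows)]
    heap = []
    # Assumption: border cells touch the ocean, so their water level is their own height.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                level[r][c] = heights[r][c]
                heapq.heappush(heap, (heights[r][c], r, c))
    # Work inward, always expanding from the lowest known spill level.
    while heap:
        lvl, r, c = heapq.heappop(heap)
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and level[nr][nc] is None:
                # Water stands at least as high as the lowest way out.
                level[nr][nc] = max(heights[nr][nc], lvl)
                heapq.heappush(heap, (level[nr][nc], nr, nc))
    wet = sum(level[r][c] > heights[r][c]
              for r in range(rows) for c in range(cols))
    return wet / (rows * cols)
```

Feeding this a grid of randomly generated altitudes would give a simulated P* for whatever elevation model is chosen. The answer will depend on how the altitudes are correlated between neighbours, so independent random heights and a random-walk surface need not agree.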

"Lakes occupy only a small percentage of Earth’s total surface area. Recent studies estimate the total surface area of lakes at about 4,200,000 km² — only about 2.8% of the planet’s land surface area (or less than 1% of the Earth’s total surface area). It is thought that large lakes such as the Great Lakes of the U.S. and Canada make up a large portion of this total, although small lakes (<0.1 km²) do not commonly appear on maps and may not be counted in typical surveys. Lakes and other freshwaters range in size from 0.001 km² for the smallest lakes and ponds, and the largest, the Caspian Sea, at 378,119 km² (see Facts and Figures section for more lake size facts). Although a few extremely large lakes may dominate area, small lakes dominate the total number of lakes around the world."

"Based on data collected on lake size distribution, it is estimated there are over 304 million lakes in the world."

"Using modeling techniques, limnologists have been able to calculate lake distributions for various regions of the world, using lakes of surface area 1-10 km² as a model. This data had led to the discovery of some continental trends in lake distribution. In North America, lake distribution tends to be highest in the eastern United States and Canada. Central American countries as well as the northwestern regions of Canada and the United States are also relatively dense with lakes, but the central United States and Mexico have the smallest lake density. South America shows trends of high lake densities in its northeastern and central-eastern regions, but the southern and western regions of the continent have few lakes. Europe shows a fairly uniform distribution, ranging from 601-1000 lakes per 1,000,000 km², but shows a high density along the northern and southern regions of western Europe. Africa shows few lakes in its northern and southern regions, but possesses a much higher lake density in its central regions. In Asia, much of the lake density is within Russia, southern China, Japan, India, and other surrounding countries. Finally, Australia shows a low distribution of lakes throughout, with the only exceptions being coastal regions of Australia and Papua New Guinea."

"Basins of attraction on random topography.
Schorghofer N, Rothman DH.
Source
Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.

Abstract
We investigate the consequences of fluid flowing on a continuous surface upon the geometric and statistical distribution of the flow. We find that the ability of a surface to collect water by its mere geometrical shape is proportional to the curvature of the contour line divided by the local slope. Consequently, rivers tend to lie in locations of high curvature and flat slopes. Gaussian surfaces are introduced as a model of random topography. For Gaussian surfaces the relation between convergence and slope is obtained analytically. The convergence of flow lines correlates positively with drainage area, so that lower slopes are associated with larger basins. As a consequence, we explain the observed relation between the local slope of a landscape and the area of the drainage basin geometrically. To some extent, the slope-area relation comes about not because of fluvial erosion of the landscape, but because of the way rivers choose their path. Our results are supported by numerically generated surfaces as well as by real landscapes."

In case you missed it:
If the problem is just trying to find the percentage of water in Canada, the answer is a Monte Carlo simulation: Just shoot points at random at a map of Canada. Really fine darts, say. Virtual darts, the centers of randomly selected Google images. They wouldn't even have to be the same scale, just zero in on the center. Or just pick random coordinates over a map of Canada. Add up the total and find the percentage that hits water. With a computer, it should probably be easy to come up with a better than one significant digit answer. Say one thousand shots for 2 significant digits? Maybe more.

One issue is how far out to sea do you want to count? The interior of Hudson's Bay, etc?

Anyway, there's lots of stuff on Monte Carlo simulations on the web. Extensive literature. Not only can you find the percentage of water, but with enough points, you should be able to make a chart of how much land is at any particular elevation, and with enough points, how much water is at any particular elevation.

If you're trying to do a simulation of a 'real' landscape, much more difficult. To do it right, you probably have to simulate geologic processes, since those are what create the landscape in the first place.

Nick,
I wanted to refresh my Python skills last evening, so here it is:
https://dl.dropboxusercontent.com/u/16967144/LacPy.zip

1) Install python
2) unzip the zip file somewhere, say "c:\temp"
3) type "python c:\temp\lac.py"
4) it outputs csv files with the number of cells stuck in a lake or not. Also, a quick map showing which cells are "holes" (cannot flow anywhere) and one showing which cells are part of a lake (can only flow in a hole)

The output for a 300x300 simulation is included. Took maybe 10-20 minutes to run. 41 465 out of 90 000 cells are part of a lake (46%)
-------------
2 things
A) this is totally unrealistic
B) this is an upper limit, due to simplification.
-----

B) The way I modelled this is more like someone looking for a way to slide down to the ocean than like lakes. This is because my water doesn't "fill lakes", but simply looks for a way to slide down to the ocean. A mountain in the middle of a lake in the middle of the continent will count as part of the lake in my simulation (there is no way to slide from the top of the mountain to the ocean without climbing up to exit the (now empty) lake).

I cannot change this and make the lakes fill under your current "geology spec", because "can flow North" or "cannot flow North" is not enough information. I would need altitude too. Imagine an X,Y grid whose bottom left corner is (0,0). Now (0,0) is lower than (0,1) and (1,0). Now imagine that (1,1) is lower than (0,1), but higher than (1,0). What are their altitudes? Imagine there are 300x300 cells...

A) I would believe that the slope between km2 and km3 is correlated a lot to the slope between km1 and km2, not only to the altitude level.

I thought for a while to make this make sense to me. I kept imagining stalagmite-shaped countries (or countries with a lot of stalagmites), and couldn't see how surface area and average elevation were connected. I'm not very concerned with imitating a realistic geology. I tried to think up additional constraints that didn't reduce the question to something trivial, and eventually came up with this:

You have a drum. You draw a bunch of closed shapes on the drumhead. At time T=0, you hit the drum in n places. Each hit causes an initial depression in the drum surface of m.

Take a snapshot of the drumhead at time T=t. This surface is your map M. Flatten the surface of M outside the shapes you drew, and call this new surface M*.

This doesn't solve your problem but it defines a bunch of constraints that probably help to make the problem tractable, it might help in phrasing the problem, and it puts the problem into a well-understood mathematical framework.

Peter N: "Do you want an answer for real geography or are you more interested in your random matrix model?"

The latter. So we can then compare real geography of real countries with the random matrix model. To figure out how much different real countries are non-random.

greg: I think you are looking at real countries. Yep, throwing darts at a map of Canada, and seeing how many land in water, would be a good way of doing that. But that's not the question here.

Simonc: I was really optimistic when you said "46%". Because that looked like the sort of answer you would get, if my gut was right, but you still had a finite-sized (even though large) country.

I'm not 100% sure I follow the rest. But if you are asking: "from what percentage of points would I be unable to find some way to reach the ocean by going downhill every step of the way?" that would seem to be smaller than the percentage of lakes. For example, a castle on a mound with a moat surrounding it. It would be impossible to get from castle to ocean downhill all the way, but the castle should be dry, even if the moat is wet. So I'm surprised your simulation gave you a number as high as 46%. [Edit: I should have said "as *low* as 46%". Simonc points out my mistake below. NR]

oblivious: at one point my mind was going in exactly the same directions as yours. Get a flat circle of thin metal, hammer random dents in it (except near the edges), *but then turn it over and hammer some more random dents in it* (same number both sides). Then hold it flat around the edges, and spray it with water. Then turn it over and repeat. Wet areas would be dry when you turn it over, and dry areas would be wet. So it should be 50% wet, on average.

The circle of metal, the drumhead, and a globe are all closed systems. A country is not a closed system: I can dig a lake in Vancouver Island and send the dirt to Australia to build a new mountain. Vancouver island would get a pool of water while Australia would get elevation. This finite "closed system" assumption is how elevation and water coverage are connected, which is what I wasn't grasping.

The metal plate and the globe are closed systems for different reasons. The metal plate is closed because every depression on one side is elevation on the other side. The globe is a closed system because every hole in one place is a hill someplace else (conservation of dirt). There's an additional complexity in that we can make a "hill" or "hole" on the ocean floor, which affects the elevation of *everything else* implicitly by changing sea level. Symmetrically, digging a lake would cause sea level to fall unless you dumped the dirt in the ocean (conservation of water).

Nick: I wrote that, then I realized I wasn't clear at all, but you understood correctly. Here's what I meant to say, in 1D.

A person starts from point A.
Goes down to point B.
Then the slope is positive from point B to C.
Then the slope is negative to point D, the ocean.

Point A could very well be above point C and be dry (the castle in a moat, or an island in Lake Superior).
Given the current specs, I cannot know if point A is above point C, so I assume that it isn't, and point A is considered "wet".

Or put another way, I think that the "percentage of points where I am unable to find some way to reach the ocean by going downhill" is HIGHER than the percentage of lake, not lower, and that your castle with a moat example proves it. The castle itself is dry (not a lake) but you cannot reach the ocean without ever going uphill (and my simulation considers it a lake).

That is why I'm saying my 46% (over a finite 300x300 grid) is an overestimation, but I'm not sure by how much.

To be able to tell if point A is above point C (and dry), we need to know more than "higher/lower than immediate neighbour".
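The gap between the two criteria can be shown on a tiny 1D profile once altitudes are available. This Python sketch is only an illustration: the altitudes are invented, both ends of the profile are assumed to be ocean, flat steps are assumed to count as downhill, and the function names are hypothetical.

```python
from itertools import accumulate


def cannot_descend_to_ocean(heights):
    """True where no walk to either end of the profile avoids going uphill."""
    n = len(heights)
    # left_ok[i]: walking from cell i to the left end never climbs.
    left_ok = [True] * n
    for i in range(1, n):
        left_ok[i] = left_ok[i - 1] and heights[i] >= heights[i - 1]
    # right_ok[i]: walking from cell i to the right end never climbs.
    right_ok = [True] * n
    for i in range(n - 2, -1, -1):
        right_ok[i] = right_ok[i + 1] and heights[i] >= heights[i + 1]
    return [not (l or r) for l, r in zip(left_ok, right_ok)]


def under_water(heights):
    """True where the cell is submerged once every basin is filled to its rim."""
    left_max = list(accumulate(heights, max))
    right_max = list(accumulate(reversed(heights), max))[::-1]
    return [min(l, r) > h for h, l, r in zip(heights, left_max, right_max)]


# Made-up castle-and-moat profile: ocean, rim, moat, castle, moat, rim, ocean.
castle_profile = [0, 2, 1, 5, 1, 2, 0]
stuck = cannot_descend_to_ocean(castle_profile)
wet = under_water(castle_profile)
```

On this profile the castle (altitude 5) is counted by the downhill criterion but is not under water, so `sum(stuck)` is 3 while `sum(wet)` is 2. That is the direction of the overestimate described above, though how large it is on a 300x300 grid is a separate question.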

Nick:
oblivious: at one point my mind was going in exactly the same directions as yours. Get a flat circle of thin metal, hammer random dents in it (except near the edges), *but then turn it over and hammer some more random dents in it* (same number both sides). Then hold it flat around the edges, and spray it with water. Then turn it over and repeat. Wet areas would be dry when you turn it over, and dry areas would be wet. So it should be 50% wet, on average.

I think it would be >50% on average. Here's a 1D graph:

If I understand you, a point is wet if it would be under water if maximally flooded. That is every point in the flooded surface would have risen so that there was a monotonic path to the sea from it. Terrain that is maximally flooded has a considerably higher percentage of surface water than average land, if on average 2.8% of the world's land area is covered by lakes.

Your random dent model should have very different statistics from your original and probably from real terrain. It won't be very fractal.

And it's not too surprising that it is 50% wet, if you hammered equally from both sides. The two sides are identical in the statistical sense. If you're handed one of your dented sheets, there's no way to tell which side is the "top". You've built in what you're trying to prove.

Now look at some real 3D contour maps. There's no doubt which side is the top. You can try the experiment of seeing if both sides will retain the same amount of water. I'm betting they won't.

SimonC "That is why I'm saying my 46% (over a finite 300x300 grid) is an overestimation, but I'm not sure by how much."

You are right. I said it wrong on my comment above. I have added an Edit to correct my previous mistake.

SimonC @ 12:44. Hmmm. Dunno. There's a slight mistake in your diagram, because the blue line is too long, but that doesn't affect your point. But when upside down your country would either have a cliff on both coasts or else all the country inside the 2 coasts would be below sea-level. But I think you have a point, and it makes the hammered plate thought-experiment not work quite right. Which I think is Peter N's point too.

This was maybe already raised in the discussion but with your assumptions there would be more than 50% of an area covered by water. The reason being some inverse-U shapes will be covered by water - if they are part of some larger U shape.

J.V.: Hmmm. I was about to say: "But the reverse is also true, that some U shapes will be dry - if they are part of some larger inverse-U shape...." then I realised it was false!

100% covered by lake. If you eliminate ground absorption and evaporation then every flat square stays wet. It will lose excess water to lower squares but only until it reaches a non-zero minimum. But that's silly. So you're probably assuming all squares drain instantly toward the sea unless in a bowl. But that assumption removes all rivers where water flows unobstructed to the sea. The flow may be unobstructed but it still takes time. So distance, river shape (depth, width), precipitation and flow rate matter. Messy.

Another wrinkle is that real countries don't have average elevations of 1. You could figure out what the average elevation is for Canada and then adjust your up/down random elevation function to generate a correct average, but the distribution probably matters too.

Also, given time, flowing water erosion will carve valleys and lower the exits from lakes, reducing the water level and size of those lakes.

I just wrote a long comment and then accidentally hit the back button! Mostly though I just wanted to say that I think a lot of the difference between your postulated 50% and Canada's 9% can be explained by the fact that oceans are not lakes. In the case of a grid of cells with random heights, a number of cells would be below sea level and connected to the ocean. If we instead force all edge cells to be above sea level, then we will be creating lakes that otherwise would have been ocean, introducing a significant "error" in our model. Although that of course depends on what it is we want to model.

Simonc:
I had a look at your code hoping to fix the mountain-in-a-lake issue, but I realized that it's not as trivial as I thought. I'm considering implementing an alternate method, but I'm not sure it's workable.

Jason: Basically, all we'd need is to find a path where all cells have a lower altitude than the starting point. The thing is, we don't define altitude. Good luck :)
S.

For some unknown reason we seem to be getting a lot of spam comments on this post, so I am closing comments. Sorry. Email me at Nick underscore Rowe at Carleton dot ca if you want to leave a comment.

