Openstreetmap and the mapping of natural geographic features
Openstreetmap has become an amazingly successful project for mapping geographic features that have been created or strongly influenced by humans. With respect to those features, Openstreetmap surpasses most other data sources in quality and in how current the data is. When it comes to geographic features that are not primarily a product of human activity but are defined by nature, things look different, though. Here I will try to shed some light on the reasons for this and point out in what way mapping nature differs from mapping human culture.
To be clear: Openstreetmap, as the name implies, is primarily a project for mapping human geography. But it should be obvious that you cannot ignore the natural (physical) geography while doing this, since our everyday life, even in a large town, is much influenced by physical geography. Openstreetmap acknowledges this fact by including physical geography aspects in its keys (Key:natural, Key:waterway). The use of those keys and the quality of the physical geography information in the actual map data is an interesting matter, though.
Take for example the town where I live, Freiburg. Human geography is mapped to an amazing level of detail, down to the individual houses. You can also see that this mapping includes physical geography features like the rivers. But on closer analysis you realize that the accuracy of the data varies. The streets and houses should be fairly accurate (the streets being mapped from GPS and the houses probably from aerial photographs). But it is visible that the rivers are not mapped to the same standard. In the area linked above the Dreisam seems to be a river of significant size while the Brugga is just an insignificant ditch. In reality the Dreisam is hardly twice the size of the Brugga at this point. Furthermore, the Brugga in Openstreetmap ends a few hundred meters upstream before continuing after a significant gap (and with a size specified). If we move further into the countryside and have a look at the land cover mapping, we can see that the forests are mapped with fairly uniform quality but without distinguishing whether a forest is deciduous or evergreen.
What I think these examples illustrate is that physical geography is covered in Openstreetmap to the extent that it is important for practical human activities. The size of a river might be important, but to a much lesser extent than streets and houses. This leads me to:
Thesis 1: On average Openstreetmap mapping is done with priorities according to the importance for human activities.
So regions with a lot of people like large towns are mapped in minute detail and are kept very much up to date. Rural regions are mapped with less detail and wilderness regions are only mapped rudimentarily.
There is nothing bad about this per se, but note that thesis 1 implies a kind of conflict, since we use maps to inform ourselves about the subject covered by the map. The less widely known a geographic feature is, the more important the map tends to be. But the less known features are not the ones that are mapped with priority.
This conflict is mitigated by the fact that there is a saturation of accuracy and detail in the map due to technical limits (and to some extent due to rules and conventions in OSM). As soon as feature sizes reach down to the single-meter or sub-meter range, the primary mapping techniques used by Openstreetmap (GPS and aerial/satellite images) can no longer be used, and alternative techniques like measuring in the field with surveying equipment are usually not feasible. As soon as this level is reached, mapping efforts will shift to those features which are not yet mapped to this level.
The above-mentioned examples also show something else: there are things which are easy to map and things which are more difficult to map. To accurately map a street you only have to follow it with a GPS. With a river you can normally only do this if the river is large enough to be navigated with a boat, and in that case it is usually so large that you would want to map both sides of the river separately (since, unlike a road, it does not have a constant width in most cases). This leads me to:
Thesis 2: What is created or shaped by humans can be much better mapped by humans than what is created by nature.
This also applies to the forest mapping example above. Mapping the forest type is a difficult task: you either have to go into the forest and look at the trees or you need aerial/satellite images from different times of the year, and then you still have the difficulty of drawing the line between a deciduous, an evergreen and a mixed type forest. And do not forget that both the composition of a forest and the course of a river usually change continuously, so this information does not only need to be acquired once, it also needs to be kept up to date. These problems are compounded by the issue described in:
Thesis 3: Earth is vast and the part of earth shaped primarily by humans is still small in comparison to those parts mostly shaped by nature.
This fact is often overlooked since most of us live in regions where there is hardly anything that would qualify as untouched nature. But keep in mind that, for example, the combined length of all permanent rivers on earth probably exceeds that of all roads by far. Mapping the global human geography up to the saturation level defined by the primary mapping techniques used could be a realistic long-term goal for Openstreetmap. Doing the same for the natural, physical geography, on the other hand, is not feasible.
So in the end following the rule of thesis 1 is a very wise decision, and I think it is even the reason for the success of Openstreetmap as a project. Nonetheless the conflict described is real: maintaining a minimum standard of mapping quality across larger regions, or even globally, on all features shown by the map is immensely important for the usefulness of a map.
In one aspect related to this Openstreetmap has already made a very bold decision: concerning relief data. Apart from point elevations of mountain peaks or other prominent points, Openstreetmap specifically excludes elevation data from its database. The major reason for this is probably that since the beginning of the project (in fact long before) elevation data has been acquired almost exclusively using remote sensing techniques, in most cases using completely automated processes. And when OSM started in 2004, high resolution elevation data was already freely available for large parts of the earth's surface in the form of the SRTM data set.
As said, relief data is a special case, but it has something in common with various other physical geography features: it is better represented in raster than in vector form. The same applies, for example, to land cover data. If you study how land use and land cover information is stored in Openstreetmap you will often find a fairly chaotic assembly of intersecting and overlapping polygons of various sizes. These polygons are often simply used to 'color the map' in certain parts according to a certain land use, but they do not usually represent distinct entities like, for example, a separately cultivated tract of land. This leads me to my first recommendation:
Recommendation 1: Some types of data necessary for a useful map cannot be well represented in the current Openstreetmap data structure, so it would be important to specifically exclude them from conventional mapping and instead create the means of producing and maintaining this data in a suitable form, either inside OSM or as a separate project.
Just like with relief data it might make sense for some data types to rely on external sources. Such a decision would need to be reevaluated frequently to see if the quality of the data still satisfies the needs of the project.
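The difference between overlapping polygons and a raster can be illustrated with a minimal sketch. It assumes axis-aligned rectangles as a stand-in for real land use polygons, and the function names, the sample areas and the "later area wins" rule are all made up for this example. In the vector form a point can be claimed by several classes at once, while every raster cell holds exactly one value.

```python
# Land use areas as axis-aligned rectangles: (x_min, y_min, x_max, y_max, class).
# Real polygons are more complex; rectangles keep the sketch short.
areas = [
    (0, 0, 6, 4, "forest"),
    (4, 2, 8, 6, "meadow"),  # overlaps the forest rectangle
]

def classes_at(x, y):
    """Return all classes claimed for a point in the vector representation."""
    return [c for (x0, y0, x1, y1, c) in areas if x0 <= x < x1 and y0 <= y < y1]

def rasterize(width, height, cell=1.0, default="unknown"):
    """Build a grid with exactly one class per cell.

    Assumption for this sketch: where areas overlap, the later one wins.
    """
    grid = []
    for row in range(height):
        line = []
        for col in range(width):
            # Sample the vector data at the center of the cell.
            found = classes_at((col + 0.5) * cell, (row + 0.5) * cell)
            line.append(found[-1] if found else default)
        grid.append(line)
    return grid

print(classes_at(5, 3))  # two classes: the vector data is ambiguous here
grid = rasterize(8, 6)
print(grid[3][5])        # one class: the raster forces a decision
```

The rule used to resolve overlaps is arbitrary here; the point is only that a raster makes such a decision explicit for every location, while overlapping polygons leave it open.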
I pointed out various fundamental differences between human and physical geography above, and based on these I am going to make two more recommendations:
Recommendation 2: While human geography features usually imply a certain level of accuracy, physical geography features do not, and to be useful their mapping should always include a specification of the level of detail they are mapped at.
I will probably need to explain the implied level of accuracy. When the shape of an individual building is mapped, for example, this implies an accuracy of at least a few meters; otherwise there would be no point in mapping the shape of the building at all. Similar arguments apply to nearly all human geography features. This does not apply to natural features due to their fractal and scale-independent nature: a river bank or a coastline shows new detail at every scale, so the mere fact that it is mapped says nothing about the accuracy.
Recommendation 3: Physical geography is usually subject to continuous change; therefore all mapping of such features should include information on the time of data acquisition.
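The scale dependence of natural features can be illustrated with a small sketch. It assumes a Koch curve as an idealized stand-in for a natural line such as a river bank or a coastline (it is not real map data), and the function names are made up for this example. Each additional level of detail multiplies the measured length by 4/3, whereas a straight line, standing in for a road segment, has the same length at every level.

```python
import math

def koch_points(p0, p1, depth):
    """Return the points of a Koch curve from p0 to p1, excluding p1."""
    if depth == 0:
        return [p0]
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = (x1 - x0) / 3.0, (y1 - y0) / 3.0
    a = (x0 + dx, y0 + dy)
    b = (x0 + 2 * dx, y0 + 2 * dy)
    # Peak of the triangle: the middle third rotated by 60 degrees around a.
    c60, s60 = math.cos(math.pi / 3), math.sin(math.pi / 3)
    peak = (a[0] + dx * c60 - dy * s60, a[1] + dx * s60 + dy * c60)
    points = []
    for start, end in ((p0, a), (a, peak), (peak, b), (b, p1)):
        points.extend(koch_points(start, end, depth - 1))
    return points

def polyline_length(points):
    """Sum of the distances between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def koch_length(depth):
    """Length of a Koch curve between (0, 0) and (1, 0) at a given depth."""
    start, end = (0.0, 0.0), (1.0, 0.0)
    return polyline_length(koch_points(start, end, depth) + [end])

# Depth 0 is the straight line; higher depths add finer detail.
for depth in range(5):
    print(depth, round(koch_length(depth), 4))
```

A length or shape recorded for such a feature is therefore only meaningful together with the level of detail at which it was measured.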
Of course such information is also useful for human geography features but it is not as necessary. A building or a road is built, possibly modified at some point or demolished. These are discrete events so it is much easier to assess which of the discrete states the mapping represents. There are of course also discrete events in nature like landslides, forest fires, volcanic eruptions etc. but they are the exception rather than the rule.
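Recommendations 2 and 3 could be sketched as extra tags on a feature. The tag names `mapping:resolution_m` and `acquisition:date` below are hypothetical and chosen only for illustration; they are not established Openstreetmap keys, and the values are made up. The sketch represents a feature as a plain dictionary of tags and checks whether it is mapped finely and recently enough for a given use.

```python
from datetime import date

# Hypothetical tag names, not established Openstreetmap keys.
RESOLUTION_KEY = "mapping:resolution_m"
DATE_KEY = "acquisition:date"

def is_usable(tags, max_resolution_m, max_age_days, today):
    """Check that a feature is mapped finely and recently enough."""
    if RESOLUTION_KEY not in tags or DATE_KEY not in tags:
        return False  # without this metadata the quality cannot be judged
    resolution_m = float(tags[RESOLUTION_KEY])
    acquired = date.fromisoformat(tags[DATE_KEY])
    age_days = (today - acquired).days
    return resolution_m <= max_resolution_m and age_days <= max_age_days

# Illustrative features with made-up values.
river = {
    "waterway": "river",
    RESOLUTION_KEY: "5",
    DATE_KEY: "2012-06-15",
}
forest = {"natural": "wood"}  # no level of detail, no acquisition date

today = date(2012, 8, 1)
print(is_usable(river, max_resolution_m=10, max_age_days=365, today=today))
print(is_usable(forest, max_resolution_m=10, max_age_days=365, today=today))
```

Treating missing metadata as "not usable" is a design choice of this sketch; a data consumer could just as well fall back to a conservative default.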
And finally directly derived from thesis 2 above:
Recommendation 4: When mapping nature with the help of remotely sensed data the use of automatic processing techniques can immensely improve efficiency of mapping compared to a fully manual approach.
I realize, of course, that integrating automated techniques into the data acquisition process in a collaborative project like Openstreetmap would be a serious challenge if it is not regarded as a one-time data import (which would limit the usefulness due to the continuous changes in nature).
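As one illustration of automatic processing, a very simple classifier could use the normalized difference vegetation index (NDVI), computed as (NIR - Red) / (NIR + Red) from the near-infrared and red bands of a satellite image. The threshold of 0.3 and the pixel values below are assumptions made up for this sketch, and real land cover classification is considerably more involved.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for empty pixels
    return (nir - red) / total

def classify(nir_band, red_band, threshold=0.3):
    """Mark each pixel as vegetation (True) or not (False).

    The threshold is an assumed value for illustration only.
    """
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]

# Made-up reflectance values for a 2 x 3 pixel image.
nir_band = [[0.50, 0.45, 0.10], [0.40, 0.12, 0.08]]
red_band = [[0.08, 0.10, 0.09], [0.30, 0.10, 0.07]]

for row in classify(nir_band, red_band):
    print(row)
```

Because such a step runs without manual tracing, it can be repeated whenever new imagery becomes available, which matters for features that change continuously.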
Christoph Hormann, August 2012