Populated places in OpenStreetMap
Some time ago i wrote about Mapping and map rendering of human settlements in OpenStreetMap, where i focused on intermediate map scales and how OpenStreetMap data can be used to display settlements at those scales in a generalized but still individual geometric form. Here i want to extend this discussion to the small map scales, where settlements (or more generally populated places) are represented geometrically merely as points.
Mapping of populated places in OpenStreetMap is basically quite simple - they are tagged place=city|town|village|hamlet|isolated_dwelling. This might seem easy enough but if you think there is not much room for errors and variations you are wrong:
First of all the place tag is applied to either nodes or polygon geometries and sometimes to both, which means there are a lot of duplicates - i.e. the same place being represented by several objects. In most cases this happens when both a node and a polygon exist, but sometimes there are also multiple nodes.
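As a rough illustration of how such duplicates could be collapsed, here is a minimal Python sketch. It is not the processing used for the results further down. The record layout (`name`, `lat`, `lon`, `geom`), the 5 km threshold and the rule that a node wins over a polygon are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0

def approx_distance_km(a, b):
    """Equirectangular approximation, adequate for the short distances compared here."""
    mean_lat = math.radians((a["lat"] + b["lat"]) / 2)
    dx = math.radians(b["lon"] - a["lon"]) * math.cos(mean_lat)
    dy = math.radians(b["lat"] - a["lat"])
    return EARTH_RADIUS_KM * math.hypot(dx, dy)

def deduplicate(places, max_dist_km=5.0):
    """Keep one object per place: same name and closer than max_dist_km counts as a duplicate.

    max_dist_km is a hypothetical threshold, not a documented value.
    """
    kept = []
    # Stable sort that puts nodes first, so a node is preferred over a polygon centroid.
    for p in sorted(places, key=lambda p: p["geom"] != "node"):
        is_duplicate = any(
            k["name"] == p["name"] and approx_distance_km(k, p) <= max_dist_km
            for k in kept
        )
        if not is_duplicate:
            kept.append(p)
    return kept
```

Two places with the same name that are far apart are both kept, since identical names are common for unrelated settlements.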
Then there are the other tags that can provide further information on the place. The name tag is meant to record the name of the place as it is used locally. Since this is the name shown in the standard OSM map style, this tag is frequently abused as a label tag and contains whatever labeling the mapper wants to be shown on the map. This for example includes names in different languages, separated in varying ways around the world. Another important tag for rendering places is the capital tag, which indicates if the place is the capital of an administrative unit. Here too there is an alternative way to provide this information, namely by including the feature in question with role admin_centre in the corresponding boundary relation.
Classification of populated places
Finally the most significant issue is the place classification, i.e. how the decision is made which of the place tags to apply. To explain this i have to provide a bit more context.
The structuring of human settlement into places of different size is a very basic aspect of human social behavior. The basic mechanism that regulates it is that a large settlement influences the areas around it. People living in the vicinity of a city depend on the services and infrastructure of this city to some extent - they work there, they purchase products and services there and so on. All of this depends on the primary means of transport in the area, the size of the settlement and cultural aspects of the region. Globally however it does not vary that much, it is essentially a function of the population of the place.
This influence a populated place has on its surroundings is what i would call functional importance. Ideally this is behind the concept of the different place classes in OSM. place=city is meant to be used for places which due to their size provide everything people in and around them normally need, so they do not generally have to go to another even larger place for anything. place=town on the other hand is a place that provides for most of the needs of the locals but where people depend on the nearest cities for some things. place=village only offers the most basic things and people need to turn to the closest town or city for many of their needs.
Global distribution of populated places tagged city/town/village in OpenStreetMap
This system of classification is of course ultimately quite vague and there are always borderline cases. Since, as i said above, the way functional importance depends on population size does not vary that much globally, it is possible to assign approximate population ranges to the different place classes. Here is what is documented in the OSM wiki:
- place=city at least 100000 inhabitants
- place=town between 10000 and 100000 inhabitants
- place=village between 1000 and 10000 inhabitants
- place=hamlet less than 100-200 inhabitants
- place=isolated_dwelling no more than two households
These are not hard cutoffs and according to the above numbers there is even a gap between village and hamlet but in a lot of cases they are at least as good a hint for classifying a place as the equally vague criteria of functional importance. To be more objective and verifiable the population number can be tagged as well of course.
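Where a population tag is present, such ranges can be used to flag classifications that look implausible. The following Python sketch assumes the ranges listed above, with the upper bound for towns inferred from the city threshold. The `tolerance` factor is a hypothetical parameter reflecting that these are not hard cutoffs.

```python
# Approximate population ranges per place class as (lower, upper); None means open-ended.
# The upper bound for "town" is an assumption derived from the lower bound for "city".
POPULATION_RANGES = {
    "city": (100000, None),
    "town": (10000, 100000),
    "village": (1000, 10000),
    "hamlet": (0, 200),
}

def is_plausible(place_class, population, tolerance=2.0):
    """Return True/False for known classes, None if the class has no population range.

    tolerance widens each range by the given factor in both directions.
    """
    if place_class not in POPULATION_RANGES:
        return None
    low, high = POPULATION_RANGES[place_class]
    if population < low / tolerance:
        return False
    if high is not None and population > high * tolerance:
        return False
    return True
```

A village tagged place=city with a population of 3000 would be flagged by `is_plausible("city", 3000)`, while a village with 12000 inhabitants still passes as a borderline case.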
The real problem in the matter comes from the fact that this classification by functional importance is used by many maps as the basis for cartographic importance. The decision which places to show in a map and how prominently is based on this tagging, which leads mappers to choose the tag based on if and how they want the place to be shown there.
Cartographic importance of places
At coarse map scales the primary function of rendering populated places in a map is to provide points of orientation and context for other information. Since we live in those places and visit them we recognize them in the map and can interpret the other information accordingly - like 'when i travel from place A to place B the road leads via place C' or 'place A is close to place B but they are on different sides of river C'.
To fulfill this purpose a cutoff in terms of place classification like 'only show cities' would not be very productive. The density of cities, towns etc. varies a lot worldwide due to the differences in general population density. In densely populated areas of central Europe, Japan, China and India cities are pretty close to each other, while elsewhere in remote areas people might need to travel quite a distance to even get to their nearest town for basic needs like seeing a doctor, going to school or buying things. So by showing places up to a certain limit in functional importance you end up with some densely covered areas while others are essentially empty.
Different rendering density of populated places with a place type cutoff (place=city+town) in the OSM standard style - both areas are densely mapped.
The most common way to address this is to avoid or weaken the functional importance cutoff and to render place markers up to a certain density, with priority given to the places with higher classification to avoid skipping them in densely covered areas. This is essentially what is done in most OpenStreetMap styles at intermediate zoom levels - you show cities or cities + towns, but in most areas there is not enough room for all of them so many are skipped, and which ones is often essentially a matter of chance. At the same time in sparsely populated areas, especially at high latitudes where the varying map scale of the Mercator projection emphasizes the problem, still nearly no places are shown, and to compensate mappers tag villages there with place=city.
One approach to determine a better measure of cartographic importance is to estimate a functional importance landscape and determine how much each place influences this landscape. If you additionally add a special bonus for administrative centers this is a pretty good importance measure for map rendering.
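The density-limited approach can be sketched in a few lines of Python. This is a simplified illustration and not the label placement any actual OSM style uses: places are assumed to carry projected map coordinates `x` and `y`, and `min_dist` is a hypothetical minimum spacing between markers in the same units.

```python
import math

# Lower rank means higher priority when competing for space.
CLASS_RANK = {"city": 0, "town": 1, "village": 2}

def select_by_density(places, min_dist):
    """Greedily pick places by class priority, skipping any that crowd an already shown one."""
    shown = []
    for p in sorted(places, key=lambda p: CLASS_RANK[p["place"]]):
        if all(math.hypot(p["x"] - s["x"], p["y"] - s["y"]) >= min_dist for s in shown):
            shown.append(p)
    return shown
```

Within one class the outcome depends only on the order of the input, which is the element of chance mentioned above: of two cities closer together than `min_dist`, whichever happens to come first is shown.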
Construction of the functional importance landscape and estimation of cartographic importance 'i'
For each place you create a function that declines with distance from the place center and describes the potential functional importance of the place for people at every point. You can then construct the maximum of these functions over all places. The resulting function essentially describes how easily you can get access to amenities you might need at any given point on earth. If you now add a new place somewhere and its own functional importance is below this overall landscape of the other places, it does not add significantly to the overall level of amenities available to someone in the area, since there are larger places nearby with even better services. If a place protrudes from the importance landscape formed by the other places, its distance from this landscape is a fairly good measure for its cartographic importance.
There are quite a number of parameters in this model that can be tuned. The most important one is the base value of functional importance of each place at its centre. This is ideally based on the population but should not be directly proportional, since above a certain level further increasing population does not really add to the functional importance of a city.
The most significant shortcoming of this technique is that in reality functional importance is a multidimensional quantity. For different aspects of life people depend on larger places in different ways, resulting in different importance landscapes for them. Boundaries and natural obstacles like mountains and rivers complicate things as well. But if you go that far you are also well into the domain where cultural and economic differences around the world start to have an influence, and this quickly gets more complex than what can be realistically modeled.
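To make the construction more concrete, here is a minimal Python sketch under stated assumptions. The linear falloff, the parameter `falloff_per_km`, the `base` attribute and the clamping of the landscape at zero are all choices made for illustration. The actual falloff function and parameters used for the results shown here are not specified in this text.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(a, b):
    """Haversine distance between two places given as dicts with lat/lon in degrees."""
    lat1, lat2 = math.radians(a["lat"]), math.radians(b["lat"])
    dlat = lat2 - lat1
    dlon = math.radians(b["lon"] - a["lon"])
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def cartographic_importance(places, falloff_per_km=0.01):
    """For each place, measure how far its base importance protrudes from the
    landscape formed by all other places (assumed linear falloff, clamped at zero)."""
    importance = {}
    for p in places:
        landscape = 0.0
        for q in places:
            if q is p:
                continue
            contribution = q["base"] - falloff_per_km * great_circle_km(p, q)
            landscape = max(landscape, contribution)
        importance[p["name"]] = p["base"] - landscape
    return importance
```

A negative value means the place lies below the landscape of its neighbours, a positive value is the amount by which it protrudes. This brute-force version compares every pair of places, so a spatial index would be needed for a data set of realistic size.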
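One possible shape for such a base value is a logarithm of the population that stops growing above a saturation level. The Python sketch below is only an example of a non-proportional mapping. The logarithm, the `saturation` value of one million and the fallback populations per place class are assumptions, not values taken from the processing described here.

```python
import math

# Hypothetical fallback populations for places without a population tag,
# taken as the lower bounds of the documented ranges.
CLASS_FALLBACK_POPULATION = {"city": 100000, "town": 10000, "village": 1000}

def base_importance(population=None, place_class=None, saturation=1_000_000):
    """Base functional importance: log10 of population, capped at an assumed saturation level."""
    if population is None:
        population = CLASS_FALLBACK_POPULATION[place_class]
    capped = min(max(population, 1), saturation)
    return math.log10(capped)
```

With these assumptions a city of ten million gets the same base value as one of one million, and a town without a population tag is treated like a place of 10000 inhabitants.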
Here are some results based on processing the cities, towns and villages from OpenStreetMap.
In North America / In Europe
It can be seen that compared to the distribution of the original data (see the first image above) the densely populated areas are thinned out to the major cities, while in more remote areas smaller towns and villages are present as well. You can vary that by changing the falloff function and applying a threshold to the cartographic importance. Here is another illustration for an even more thinned out set of places.
Of course the quality of the results depends a lot on the quality of the information they are based on. In particular this method relies on more or less correct population numbers or - if such are not available - accurate place classifications matching the documented population ranges. And although obvious duplicate place mappings are removed, there is no reliable way to detect when the same place is mapped twice with different names. This is the case for example with Sanaa, Yemen and La Habana, Cuba.
This whole processing is initially independent of the map projection used, it is based on the true distances between places. For maps with a strongly varying scale like those in Mercator projection it would of course also be possible to do the processing in projected coordinates and this way take scale distortion into account.
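To illustrate what processing in projected coordinates would mean, here is a small Python sketch using the standard spherical Mercator formulas. The function names are made up for this example, and the sphere radius is the one commonly used for web maps.

```python
import math

SPHERE_RADIUS_M = 6378137.0  # radius commonly used for spherical (web) Mercator

def to_mercator(lat, lon):
    """Project latitude/longitude in degrees to spherical Mercator x/y in meters."""
    x = SPHERE_RADIUS_M * math.radians(lon)
    y = SPHERE_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y

def projected_distance(a, b):
    """Distance in map units between two (lat, lon) pairs after projection."""
    ax, ay = to_mercator(*a)
    bx, by = to_mercator(*b)
    return math.hypot(bx - ax, by - ay)

def scale_factor(lat):
    """Local enlargement of the Mercator projection relative to the equator."""
    return 1.0 / math.cos(math.radians(lat))
```

At 60 degrees latitude the scale factor is 2, so two places one degree of longitude apart are as far from each other on the map as two such places at the equator, although their true distance is only half as large. Using `projected_distance` in place of the true distance in the importance calculation would thin out places at high latitudes less aggressively.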
Testing use in a map
You can view the first version above in detail on maps.imagico.de.
You can also test out the processed data yourself:
- populated_places_osm.zip (213 kB) zipped shapefile with points including name and importance attributes
Christoph Hormann, May 2015