Names and labeling in OpenStreetMap

January 19, 2023 by chris

More than five years ago i wrote a blog post here on the problem of name tagging and multilingual names in OpenStreetMap. I want to give an update on the situation from my perspective and reflect a bit on what has changed since then and what has not. Also, this is meant to serve a bit as an introduction to a following blog post i plan about changes to the rendering of labels in my map style.

What i explained five years ago is essentially that the primary way through which mappers in OpenStreetMap record names, the name tag, has been central in establishing the idea of OpenStreetMap recording the local knowledge of its contributors because it is, by broad consensus, explicitly meant to record the locally used name of geographic features. Since every mapper anywhere on the planet does this in exactly the same way and sees this local name directly on the map as they have recorded it, this practice well represents the idea of jointly mapping the planet in all its diversity.

On the other hand – what i also mentioned already five years ago – this practice unfortunately comes with various problems:

it is based on the illusion that there is always a single local name.
recording the local name without any information on the language of that name creates serious issues in some cases (in particular in what is commonly known as the Han unification problem).
what mappers want to see as the label of features in the map very often is not the locally used name.

The solution from the tagging side that i proposed back then as well as various other attempts to suggest tagging ideas to address the first two of these problems did not find broader support. Both mappers and data users seemed mostly content with the status quo.

The situation today

In a nutshell: The situation today and the problems described are essentially unchanged. The problems which were already visible back in 2017 have, however, become worse, and it is now quite clearly visible that the current practice is not sustainable. Trying to categorize the use of the name tag today, i get the following:

The vast majority of name tags record a single name. In most cases this is the locally used name. However, there is also a significant percentage of cases where non-local mappers record names they use or are familiar with in the name tag despite them obviously not being the locally used name (for example because they are in a language not widely spoken locally). In other cases local mappers use a signed or official name (i.e. one endorsed by some sort of authority, either political authority or the owner of the feature in question) rather than the name used locally in practical communication.

All of these variants of recording a single name in the name tag that is not the locally used name are largely motivated by the fact that many data users either only interpret the name tag directly as a label tag or give it priority over other tags with names when selecting what to label. This incentivizes mappers to record the name they want to see on the map – which, depending on the individual mapper and the local cultural conventions in map design, can mean different things.

The second most common use case for the name tag after recording a single name of some kind is to record something that is not a name at all. This practice is partly fueled by the same mechanism i already discussed. The other reason why we see this is that the concept of names (or more precisely: proper names) is a highly abstract matter that quite a few mappers have difficulties with. A significant percentage of uses of the name tag for things that are not a proper name are cases where mappers record a real world label that does not show a proper name of the feature. Often this is a brand (as in the case of shops etc.), the name of a person operating a feature (that would usually be tagged with operator), address components or a classification (think of a playground where there is a sign indicating use restrictions: Playground: use only allowed for children of age under 14 – and the mapper interprets Playground as the name of the feature).

Multiple names

The third most common use case for the name tag – and we are down to a single digit percentage here, though for specific feature types and in some regions this is much larger – is the recording of multiple names in the name tag. This, likewise, is largely motivated by data users showing the name tag unmodified as a label and mappers in some cases seeing the display of several names in the map as the most desirable form of labeling.

There are two fundamentally different variants of this recording of multiple names in the name tag. One (and the vastly more widespread one of the two) is the recording of names in different languages. If there is not a single locally used language but several ones, a feature will usually have different names in these different languages and accordingly no single locally used name. The other (much less common) variant is when there is more than one locally used name in the single locally used language and it is not possible to verifiably determine which is the locally more widely used name.

I would like to add that for both variants there is also the rather common case that a single locally most widely used name can be determined, but mappers – for social or political reasons – do not want to specify that and prefer to record several names as if they were equally widely used locally, even if they are not.

There has been some renewed interest more recently in the OSM community in the subject of recording multiple names in the name tag, but most of the discussion dealt with the rather superficial and insignificant question of how to separate the different names in that case. This is kind of like planning to pave over the cracks and discussing what shape of paving stones to use for that purpose.

In practical use the most common separators for multilingual compound names are ' / ' and ' - ' (that is slash or dash/minus with spaces around) and – for compound names where the components are distinguished through different scripts – it is also common to not have a separator at all (using just a space, which equally is used between different words of the individual names). The semicolon (;) has some use, but mostly in non-multilingual situations, in particular in cases where the different names refer to different real world features.
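As a rough illustration of what a data user has to do with such compound names, here is a minimal Python sketch that splits a name value on the two most common separators mentioned above. The function name and the sample values are made up for illustration, and the sketch deliberately ignores the separator-less case (components distinguished only by script) as well as the semicolon, since the latter is mostly used in non-multilingual situations.

```python
import re

# The two most common separators described above: slash or dash/minus,
# each with a space on both sides. A bare hyphen (as in "Saint-Denis")
# is intentionally not treated as a separator.
SEPARATOR = re.compile(r" / | - ")


def split_compound_name(name: str) -> list[str]:
    """Split a compound name value into its components (hypothetical helper)."""
    parts = [part.strip() for part in SEPARATOR.split(name)]
    return [part for part in parts if part]


if __name__ == "__main__":
    # Illustrative values only, not taken from actual OSM data.
    for value in ("Bruxelles - Brussel", "Biel / Bienne", "Saint-Denis"):
        print(value, "->", split_compound_name(value))
```

Note that a split like this can only guess: a single name may itself legitimately contain ' - ' or ' / ', which is part of why the choice of separator does not touch the underlying problem.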

Despite all these inconsistencies and conflicting mapping practices there is one broad consensus about multilingual names: If there are names of multiple languages in compound name tags, each of the components should be also separately recorded in the respective name:<lang> tag. While this rule is not universally followed, there is clearly broad consensus that this is desirable.
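The rule described above can be checked mechanically. The following Python sketch, under the assumption that the compound name uses ' / ' or ' - ' as separator, reports which components of a name tag have no identical value in any name:<lang> tag. The function name, the key pattern used to recognize language tags and the sample tags are my own assumptions for illustration, not an established tool or convention.

```python
import re

# Assumed separators for compound names: slash or dash/minus with spaces around.
SEPARATOR = re.compile(r" / | - ")

# Rough heuristic for name:<lang> keys: a 2-3 letter language code, optionally
# followed by subtags (e.g. name:zh-Hant). This is meant to skip keys such as
# name:left or name:etymology; it is a simplification, not a full validator.
LANG_KEY = re.compile(r"^name:[a-z]{2,3}(-[A-Za-z0-9]+)*$")


def missing_language_tags(tags: dict[str, str]) -> list[str]:
    """Return components of a compound name that lack a matching name:<lang> value."""
    components = [p.strip() for p in SEPARATOR.split(tags.get("name", "")) if p.strip()]
    if len(components) < 2:
        # Not a compound name in the sense used here, so nothing to check.
        return []
    localized = {value for key, value in tags.items() if LANG_KEY.match(key)}
    return [c for c in components if c not in localized]


if __name__ == "__main__":
    # Illustrative tag set only: the Dutch component has no name:nl tag.
    example = {"name": "Bruxelles - Brussel", "name:fr": "Bruxelles"}
    print(missing_language_tags(example))
```

A check like this only works in one direction: it can tell that a component lacks a language-specific tag, but not in which language that component is.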

The dilemma in map rendering

The core of the problem on the social level (and the reason why nothing of substance has happened in that field in the last five years) is that the illusion of the name tag recording the single locally used name is rather convenient for data users, because it allows them to label their map in a very simplistic fashion. At the same time, the ability to directly control the labels in maps is something mappers have come to value, and many mappers do not have the confidence

in data users to be able and willing to interpret more cleanly structured information in a way that results in good maps.
in their fellow mappers to diligently record more cleanly structured information on names (and the inconsistent use of the name tag is directly confirming that suspicion).

What i (and others) have tried in the past is presenting ideas on how the problem can be addressed from the mapping side. But neither data users nor mappers turned out to be very keen on changing the status quo.

Also – both on the mapper and on the data user side – the most influential circles in the OSM community are from Western Europe and North America and primarily interested in languages using Latin script – and hence have only little interest in solving the problems that the lack of information on the language(s) of the name tag causes.

The dilemma for map designers, in particular in OSM-Carto, is that it has meanwhile become completely clear that the simplistic rendering of the name tag as the label for many features in OpenStreetMap is a large contributor to the bad situation outlined above. In other words: We as map designers are part of the problem and it would therefore be prudent to stop doing that. But since we want – for the reasons mentioned above and in the blog post from 2017 – to continue rendering local names in the local language everywhere on the planet, and because there is no established way of recording that information in OSM other than the name tag, we have no way to really do that other than displaying the name tag. And we do not want (for very good reasons i explained recently again) to actively steer mappers to change their mapping practice in a way we consider desirable.

One approach to try to overcome this stalemate between mappers and data users could be to actually present a working proof of concept showing both the benefits of actually solving the problem (instead of just paving over the cracks a bit – see above) and what a smooth transition from the status quo to such a solution could look like. And above all – to show that following such a route does not require mappers to relinquish control over mapping practice to a small elite of technical gatekeepers but would allow mappers to continue making self-determined decisions, and to present a solution that respects and builds on the few points described where mappers actually have consensus about how names should be mapped. This is what the next blog post is going to be about.
