
July 26, 2023
by chris
0 comments

Announcing the Musaicum EU-plus 10m resolution image of Europe

I am pleased to introduce here a new satellite image product i have been working on for a few months. Development of this has been co-financed by Geofabrik, who are going to offer tile services for web maps based on this image in combination with the Green Marble.

Background

Most readers of this blog will be familiar with the Green Marble – my global satellite image product offering the highest quality rendering of the whole Earth surface available at a resolution of 250m. The Green Marble is produced with a pixel statistics approach, meaning that an analysis of all available observations is done independently for every pixel of the image to estimate the surface color at that point. This kind of technique is very popular because of its simplicity in principle and because processing can be implemented in a very efficient fashion.
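To illustrate the principle, here is a minimal sketch of a per-pixel statistic in Python – the general idea only, not the actual Green Marble algorithm, which is considerably more elaborate:

```python
import numpy as np

def pixel_statistics(stack, cloud_mask):
    """Estimate the surface color independently for every pixel from a
    stack of observations.

    stack:      (n_images, height, width, bands) array of reflectances
    cloud_mask: (n_images, height, width) bool array, True where cloudy
    """
    data = stack.astype(float)
    data[cloud_mask] = np.nan           # discard cloudy observations
    # every pixel is processed independently of its neighbors -
    # here with a simple robust statistic, the median
    return np.nanmedian(data, axis=0)
```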

But this method has two main disadvantages:

  1. It requires a significant amount of data to actually get to a point where the final product equals or exceeds an individual good quality image in visual quality. How much depends on the algorithm used and its convergence characteristics, and obviously this will differ a lot between different parts of the planet. For the Green Marble version 3, for example, about 1PB of raw data was processed – which means more than 100kB of data per pixel.
  2. It does not scale well with increasing spatial resolution. I discussed this matter in more depth back in 2018. This has multiple underlying reasons; the most straightforward to understand is that the higher the spatial resolution you are looking at, the more volatile the content of an image becomes. This means the higher you go in terms of spatial resolution, the less you have – on average – a long term stable state of the Earth surface for your pixel statistics to converge to.

Long story short: Pixel statistics work very well at a resolution of around 250m if you have a large data basis to work with. They work poorly at much higher resolutions, even if you have a lot of data (which you typically don’t – but that is a different issue). This has not prevented various companies over the last 5-10 years from investing substantial resources in naively trying pixel statistics approaches on Landsat and Sentinel-2 data – with the expected mediocre results.

The alternative to pixel statistics for aggregating satellite imagery into a homogeneous larger area visualization is using classical mosaicing techniques, where individual images are essentially put together in the form of a patchwork or mosaic. If you want a concise definition: you have a classical mosaicing technique when the color at any point in the image is in most cases (a) primarily derived from a single source image and (b) the surrounding pixels are primarily derived from the same image. This is evidently not the case for a pixel statistics process, where the processing of a pixel is not correlated to that of its neighbor.
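A toy illustration of this definition – real mosaicing processes select images and place seams far more carefully than this:

```python
import numpy as np

def patchwork_mosaic(images, valid_masks, scores):
    """Assemble a patchwork mosaic: paint the images in order of a
    per-image quality score, best last. Every output pixel comes from
    a single source image and neighboring pixels usually come from the
    same image - satisfying (a) and (b) above.

    images:      list of (height, width, bands) arrays
    valid_masks: list of (height, width) bool arrays, True = usable
    scores:      one quality score per image (higher = better)
    """
    out = np.zeros_like(images[0], dtype=float)
    for i in np.argsort(scores):        # worst first, best on top
        out[valid_masks[i]] = images[i][valid_masks[i]]
    return out
```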

Classical mosaicing techniques are the dominant method for aggregating very high resolution satellite and aerial imagery, and for high quality images based on Landsat and Sentinel-2 data. The problem here is that achieving good quality with this approach requires fairly complex processing techniques, and certain key steps are notoriously difficult to automate because the quality of the results depends a lot on competent human judgement.

Hence most satellite image based visualizations using classical mosaicing techniques are either of relatively poor quality (high cloud incidence, poor consistency in colors across the seams between images) or based on fairly old data, because updates are costly.

I myself have been producing images using classical mosaicing techniques for nearly 20 years now (an early discussion of this can be found on this blog in 2013) and have both improved and streamlined the methods i use over the years. But for me too, manual work has so far always been a key component in the production of these images, and as a result updates are in many cases very costly to do. Therefore i had been looking for some time at strategies to eliminate the remaining manual processing steps in the production of my higher resolution mosaics without sacrificing too much in terms of quality. With the help of Geofabrik i was now able to practically implement and evaluate some of these ideas, and here i am going to present and discuss the results.

The Musaicum EU-plus

The Musaicum EU-plus – click for larger version

The image

At low resolution the image looks very similar to the Green Marble – which is not astonishing, since it is assembled with the same aim: to depict the local vegetation maximum and snow minimum. If you look closely you can see the appearance is not quite as uniform as the Green Marble – with some inhomogeneities that are clearly non-natural. Part of this is due to the low volume of data used (put simply: not everywhere was a source image available in the time period used that exactly represents the vegetation maximum). Another part is that there is still room for improvement in the data processing – after all, this is a first attempt.

Swiss Alps


Western Greece

What you will quickly notice if you look through the sample images is that there are no clouds (or almost none – with very keen eyes you might spot a few, and yes, if you systematically screen the whole image you will find some more). This is a point where the results substantially exceeded my expectations. I was hoping to achieve a cloud incidence substantially better than what is otherwise available on the market, but i was expecting it to be significantly worse than in my manually produced local mosaics. The ultimate result is pretty much on par with the manually produced local mosaics, with less than about one in 100k pixels severely affected by clouds. Most of these are small, isolated convective clouds.

Vlieland, Netherlands

The focus of the project was on land area visualization, so water surfaces were not a particular concern. Since waterbodies tend to be quite variable in appearance, and often not in a strict seasonal pattern, the results in that domain are not always ideal – in particular, rivers often change in coloring along their course quite erratically. Sun glint is also a problem at lower latitudes, in particular on smaller lakes.

Gdansk, Poland


Istanbul, Turkey

A few words on the data used for producing the image. In analogy to the numbers i presented above for the Green Marble: the volume of original Sentinel-2 data processed for this project was about 20TB, which means less than 250 bytes were needed per pixel. This is extremely sparse considering that the volume of Sentinel-2 data collected for Europe within a single year alone is much higher. A low volume of data to process helps keep the processing costs low, and it also allows using more expensive processing techniques. And in contrast to pixel statistics methods, where adding more data always helps, there is no advantage per se in using more data with classical mosaicing techniques – it is more a quality over quantity thing.

What has been more challenging is that i wanted to keep the time frame from which to source data relatively short: preferably just three years (2020-2022), where necessary another year (2019), and only in rare cases data from 2018. In areas where the Earth surface appearance is very volatile – either across the seasonal cycle or between years – this makes it difficult to produce homogeneous results over larger areas.

Paris, France


Cordoba, Spain

What you have to work with

One of the things that made work on this substantially harder than you would naively expect is the poor quality of some of the data supplied.

Sentinel-2 data can be obtained in two different forms:

  • L1C data – the original top-of-atmosphere (TOA) reflectance values as recorded by the satellite
  • L2A data – an estimate of the surface reflectance based on the TOA measurements

Now most experienced users of satellite imagery will understand that the L2A data is – as i characterized it – just an estimate of the surface reflectance. And while this should reduce the variance due to the variable influence of the Earth atmosphere, it will also introduce various forms of noise as well as systematic and random errors in the estimate. What most data users will not realize, however, is that the Sentinel-2 L2A data distributed also attempts to compensate for illumination differences (shading), and that this compensation – to put it bluntly – is incredibly bad. So bad that it is practically not usable for visualization purposes. Here is an example – larger versions are linked:

S2A L1C from 2021-07-17


S2A L2A from 2021-07-17

For comparison, here is the new Europe mosaic (which in the default rendering is atmosphere compensated but not shading compensated) and my own shading compensated version. For most of the sample area the mosaic is based on the same recording – except for the top right, where the July image contains residual snow, so the mosaic uses a later season image.

The new Europe mosaic – Tena Valley, Pyrenees


Shading compensated version

The overall color difference is not the main point here (the tone mapping i applied to the L1C/L2A data is somewhat arbitrary). The main point is that the illumination compensation in the L2A data massively over-compensates, leading to shaded slopes often being brighter than the sun facing slopes. It is also geometrically fairly inaccurate, leading to a strong emphasis of ridges and the introduction of substantial amounts of high frequency noise.

There seem to be quite a few people using Sentinel-2 L2A data in visualization applications. That is not a good idea. You should only use shading compensated imagery in visualizations if you know exactly what you are doing, and in that case you should use a proper method and not the hatchet job offered by ESA here.
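To illustrate what an explicit, geometry-based compensation looks like, here is a sketch of a textbook Minnaert-style correction – this is not the method used for the mosaic presented here, just a common starting point for a proper approach:

```python
import numpy as np

def minnaert_correction(refl, slope, aspect, sun_zenith, sun_azimuth, k=0.5):
    """Topographic (shading) correction based on the local illumination
    angle derived from an elevation model. All angles in radians."""
    # cosine of the local solar incidence angle on the tilted surface
    cos_i = (np.cos(slope) * np.cos(sun_zenith)
             + np.sin(slope) * np.sin(sun_zenith)
             * np.cos(sun_azimuth - aspect))
    cos_i = np.clip(cos_i, 0.05, 1.0)   # avoid blow-up in deep shadow
    return refl * (np.cos(sun_zenith) / cos_i) ** k
```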

I have discussed the matter of shading compensation, in particular for use in 3d rendering, previously. A shading compensated version is available for the Musaicum EU-plus as well – but i have not had the time for a proper evaluation yet. It is available on request.

What remains

As i have mentioned earlier – this is a first attempt at a more highly automated classical mosaicing process. As is often the case with this kind of work, many things remain unclear during process development and only become apparent once you run the process at scale – so there is room for further refinement. This might sound a bit like undue perfectionism considering the results are pretty good, but this is not just about the quality of the results – it is also about the robustness and efficiency of the process. As a general rule of thumb for a project like this i would say: putting as much work into process development after the first use at scale as before it is reasonable, if your aim is to make full use of the potential of the method used.

Svartisen, Norway

Where to go from here

One thing that i am sure quite a few of my readers are going to ask is: does this also work beyond Europe? In principle the answer is yes, but there is more to it than just running the process on another volume of data. Quite a significant part of the development work that went into this project went into tuning and adjusting methods and parameters for the specific settings, in terms of climate and surface characteristics, found in the processed area. I know from experience with the Green Marble that the variety of different settings on Earth is larger than people usually expect. This would need to be taken into account, and that was not part of this project.

There are a number of other additions and extensions of the project that i would like to work on – for example, producing a vegetation map similar to the ones available for various local mosaics is on the list. And of course over time there will be the question of updating the mosaic with newer data. I don’t know if and when i will have the capacity to do any of this. If any readers are interested in supporting further work on any of this, please feel welcome to get in touch.

You can find the product description and more sample images on the Musaicum EU-plus product page.

Many thanks to Geofabrik for co-financing this project. If you are interested in tile services based on the Musaicum EU-plus or the Green Marble, the people at Geofabrik will be happy to help you. If you are interested in licensing the image for other purposes, you are welcome to contact me.

Tödi, Switzerland


Zakynthos, Greece

July 26, 2023
by chris
5 Comments

Satellite image sources for OpenStreetMap mapping

There has been quite a bit of fuss in the OSM community recently (like here – warning: link goes to a patronizing and broken web interface) because one of the image layers with semi-global coverage that had so far been widely used by mappers as a source for remote (armchair) mapping has been turned off.

Calls for the authorities (in other words: the OSMF) to fix this were quick, and the OSMF board seems to be trying to persuade the image provider to restore the status quo ante. Sadly, however, the event does not seem to have initiated any larger scale reflection within the OSM community on its dependency on proprietary data providers, which has increased significantly over the past years.

In wealthy parts of the world with active mapper communities, local mappers have over the years invested significant work into collecting suitable local image sources (i.e. aerial images) and obtaining permission from their providers to use those for mapping in OpenStreetMap. This applies not only to Europe and North America but also to other parts of the world, like Japan and parts of South America. This is an impressive achievement and highly useful for the practical work of mappers. And because all these image sources are independently produced and provided by different local image providers, there is no problematic large scale dependency on a single image source.

But this only applies to a rather limited part of the Earth land surface. For the rest, OpenStreetMap currently largely depends on a single satellite image provider (Maxar) and image layers based on Maxar imagery provided by a small number of US corporations (Microsoft/Bing, Esri, Mapbox and – until recently – Maxar itself). This problem is aggravated by the fact that it is in particular in those parts of the world without local aerial image sources that OSM significantly lacks mappers with local knowledge and depends over-proportionally on remote mapping. In addition, all of the listed image layers, to varying extent, focus on those parts of the world where other local image sources are available as well, and elsewhere often have patchier and lower quality coverage.

I have pointed out in the past that an important avenue for the OSM community to mitigate this dependency is to focus more on using open data satellite imagery. Even if this cannot fully replace commercial images, open data satellite imagery is currently severely underused in OSM, largely because of the lack of convenient practical availability of high quality images from such sources to mappers.

Of course there are other options the OSM community could try to decrease the current dependency on a single imagery provider:

  • invest in recruiting mappers with local knowledge in larger parts of the globe (which would of course require the English speaking influential parts of the OSM community to open up more to true cultural diversity).
  • invest in the capabilities of mappers to map in high quality in the absence of high spatial resolution imagery. It is quite remarkable how dependent even mappers with local knowledge, mapping with on-the-ground surveying, often are on image availability. Doing so would mean both educating mappers in techniques that do not rely on high spatial resolution imagery and providing equipment that allows precision mapping independent of imagery.
  • diversify the supply of commercial satellite imagery. In the resolution class of Maxar (0.5m GSD or better) there is only a single other provider at the moment (Airbus/CNES) but in the slightly lower resolution range (1m GSD or better) there are quite a few more. I am not aware of any initiative from the OSM community to organize access for mappers to imagery from any of these sources on a larger scale.
  • lobbying for opening aerial imagery sources in parts of the world where they exist but are not yet available for mapping in OSM.
  • invest in production of open data aerial imagery, in particular recorded by UAV.
  • better availability of alternatives to optical imagery for mapping in OSM. There are various parts of the world where no very high resolution optical imagery is available for OSM but other open data sources are – like elevation data in polar regions.

Making open data imagery more accessible for mappers

Over the last years i have tried, with my OSM images for mapping, to demonstrate how competent selection and high quality processing of open data satellite imagery can be useful for mapping in OpenStreetMap. I have now added some more images of the Antarctic, substantially reducing the gaps in coverage of the ice free parts of the Antarctic.

I have chosen the Antarctic for this in particular because the proprietary higher resolution image layers tend to not have coverage there (or are very patchy). And also because, due to the high contrasts between ice and ice free areas, the tone mapping used by many of the global image layers works poorly in these regions. Unfortunately, mappers in OSM seem to have a tendency to almost universally pick higher spatial resolution images over lower spatial resolution ones, even if the former are in all other respects substantially worse (like more than ten years old, with seasonal snow cover, or poorly processed). In other words: larger scale use of open data satellite imagery in OSM is not only hampered by difficult access to such imagery in high quality, it is also made difficult by the lack of knowledge on the part of many armchair mappers needed to competently assess and select the best image source for a specific mapping task.

Change in terms of use

Finally, i have also changed my terms of use for the image layers for mapping i provide. This is owed to the increasing use of self adjusting algorithms (a.k.a. artificial intelligence) – in general and for satellite image interpretation in particular. The terms i have chosen mean that you can use my images with such algorithms only if the algorithms are fully open source – including the training data and training algorithms. I consider this a prudent choice considering the widespread trend towards neo-feudalism in the world of digital services in general and AI methods in particular. OpenStreetMap can only maintain its paradigm of a self determined cooperation of people with local knowledge, sharing this knowledge with one another and the rest of the world (a map by the people for the people), if mappers retain full control over the methods of mapping. And that includes the algorithms used for image analysis.

Crevasse rendering in OSM based maps

July 5, 2023
by chris
1 Comment

Drawing the line (3) – beyond simple line patterns

In this series of blog posts (see part 1 and part 2) i have so far discussed the limitations of current digital map design frameworks regarding the use of line signatures, compared to what was state of the art in the pre-digital age. I provided some examples of the design issues that result from the technical constraints we are faced with, and showed various methods that can be used to work around these constraints to some extent.

All of this was so far about either constant or periodically repeating line signatures. Here i want to show that this is not the end of the options for using line signatures in maps. Like the previously discussed designs, this one is also available in the AC-Style for everyone to use and study.

The feature i am going to look at here is crevasses. Crevasses are so far not a very widely mapped feature in OpenStreetMap; they are kind of a niche interest only relatively few mappers work on. This is partly because mappers sometimes think that, since individual crevasses are very volatile, it is pointless to map them. They often don’t realize that, while individual crevasses move and change fast, the locations where and the patterns in which they occur are typically very stable – often much more stable than the extent of the glacier itself. Mappers who map crevasses in OpenStreetMap use both polygons and linear ways, and both can be reasonable choices depending on circumstances. For providing constructive feedback it is therefore prudent that a rendering solution ensures a decent depiction of both variants.

Crevasses in the Antarctic

Crevasses are an interesting and challenging element in map design because they occur in a very broad range of sizes – the width of a crevasse can be from less than a meter to dozens of meters and the length from a few meters to many kilometers. And – as already hinted at – they form complex and in most cases fairly stable patterns that are often highly characteristic and significant for navigation on the glacier.

Because of this, rendering of crevasses is an important element in many traditional maps of glaciated regions and many techniques have been developed to depict crevasses at different map scales. Here are a few examples:

Crevasses in Swiss 1:25k topographic map

Crevasses in French 1:25k topographic map

Crevasses in Soviet 1:50k topographic map

For the design concept i am going to present here i took cues also from early modern maps of the Antarctic, in particular:

Australian 1:100k Aker Peaks from 1966 – source

New Zealand 1:250k provisional edition Cape Hallett from 1967 – source

The basic high zoom level design for single crevasses represented with polygons and linear ways is shown here (together with the other linear features common in this context – see the previous post):

Crevasses and other glacier details at z20

Crevasses and other glacier details at z19

Crevasses and other glacier details at z18

Crevasses and other glacier details at z17

Crevasses and other glacier details at z16

Crevasses and other glacier details at z15

Crevasses and other glacier details at z14

Crevasses and other glacier details at z13

It uses a relatively simple line pattern as the main component – but not on the linear way representing the crevasse itself; instead, it is drawn on a constructed geometry in the form of an open crack along the line of the crevasse, tapered towards the ends. The width in which this geometry is drawn depends on the zoom level, the length of the crevasse and potentially the tagged width. The level of detail of the drawing (with line pattern on the outline or a simple constant width line, with or without a center-line – or even just a simple center-line) depends on the drawing width.
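The construction of such a tapered crack geometry could look roughly like this – a geometric sketch in Python with shapely; the actual style constructs the geometry in SQL, and the taper shape used here is made up:

```python
from shapely.geometry import LineString, Polygon

def tapered_crack(line: LineString, max_width: float, taper: float = 0.2) -> Polygon:
    """Build an 'open crack' polygon along a crevasse centerline,
    tapered towards both ends."""
    n = 32
    left, right = [], []
    for i in range(n + 1):
        f = i / n
        # width ramps up over the first and down over the last `taper`
        # fraction of the line, full width in between
        w = max_width * min(1.0, f / taper, (1.0 - f) / taper)
        p = line.interpolate(f, normalized=True)
        q = line.interpolate(min(f + 0.01, 1.0), normalized=True)
        dx, dy = q.x - p.x, q.y - p.y
        d = (dx * dx + dy * dy) ** 0.5 or 1.0
        nx, ny = -dy / d, dx / d        # unit normal to the line
        left.append((p.x + nx * w / 2, p.y + ny * w / 2))
        right.append((p.x - nx * w / 2, p.y - ny * w / 2))
    return Polygon(left + right[::-1])
```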

Crevasse drawing width depending on length at z20

Crevasse drawing width depending on length at z19

Crevasse drawing width depending on length at z18

Crevasse drawing width depending on length at z17

Crevasse drawing width depending on length at z16

Crevasse drawing width depending on length at z15

Crevasse drawing width depending on length at z14

Crevasse ground unit rendering based on tagged width at z18

The real key to a harmonious design, however, is how crossing crevasses and combinations of polygon and linear way mapping are drawn. This is shown here:

Rendering of intersecting crevasses at z18 (link shows double resolution version)

So far, most of the crevasse mapping in OpenStreetMap makes relatively limited use of these options. Here are a few examples:

crevasses mapped with polygons – Séracs du Géant at z15


crevasses mapped with polygons – Séracs du Géant at z16


crevasses mapped with linear ways – Ghiacciaio del Dôme at z16

Conclusions

What i have shown here with the crevasses is an intermediate design approach between the use of higher level features discussed in the previous post and the fully manual drawing of line signatures as i have shown for tree rows in the past. It uses manually constructed geometries but visualizes them with the help of line patterns. The practical usefulness of this based on OpenStreetMap data is so far rather limited, due to the very patchy mapping of crevasses in OpenStreetMap.

Line patterns for OpenStreetMap based map styles

July 3, 2023
by chris
1 Comment

Drawing the line (2) – line patterns for OpenStreetMap based map styles

In this second part of my short series of blog posts on using line signatures in digital maps (see the first part) i want to introduce some work – partly already done some time ago at the Karlsruhe Hack Weekend – to add support for some additional line features through line patterns to the AC-Style, as well as differentiated rendering of barriers.

Some of these changes concern tags that are widely and consistently used in OpenStreetMap, so their rendering is of immediate practical use. Others are more proactive designs of things not yet widely mapped in OpenStreetMap, which i show here to demonstrate what could be part of a feature rich map and as test cases for certain techniques.

Traditionally, line patterns are used in OSM-Carto for cliffs (natural=cliff) and embankments (man_made=embankment) as well as, more recently, for ridges (natural=ridge) and aretes (natural=arete). Most linear barriers (barrier=chain, barrier=ditch, barrier=fence, barrier=guard_rail, barrier=handrail, barrier=retaining_wall, barrier=wall) are uniformly rendered with a simple gray line; only barrier=hedge and barrier=city_wall/historic=citywalls have their own, more specific line signatures.

Natural lines and barrier line signatures in OSM-Carto at z18

In the AC-Style i had already added support for implicit embankments (embankment=yes/cutting=yes) on roads/waterways (see here) and – as part of my tree rendering work – differentiated rendering of hedges. I had also added support for rendering mapped entrances in barriers (barrier=entrance) – but i have not discussed this feature here so far, so i will quickly cover it at the end of this post.

Additional and improved natural linear features

The starting point for my work on the patterns at the Hack Weekend was that i had previously added support for rendering natural=earth_bank – using the line pattern for cliffs in a brown color. But i was never satisfied with that, because (a) introducing yet another color is generally a poor choice for adding a new feature, given how confusingly many colors are already in use (and brown lines are otherwise used for track roads), and (b) the differentiation by color is misleading, as it implies a differentiation mainly by surface material – while the tag natural=earth_bank is used much more broadly in terms of surface topography than natural=cliff (which is exclusively for vertical or near vertical structures), and a good design should reflect that.

What i did instead was to use an asymmetric version of the ridge pattern, which is meant to imply that natural=earth_bank and natural=cliff form a bit of a pair, in a fashion similar to natural=ridge and natural=arete. I did make use of color by differentiating earth_bank=grassy_steep_slope with a green instead of a gray color.

I also added support for rendering natural=gully with a directed symmetric pattern.

Natural features line signatures in the AC-Style at z19

Natural features line signatures in the AC-Style at z18

Natural features line signatures in the AC-Style at z17

Natural features line signatures in the AC-Style at z16

Natural features line signatures in the AC-Style at z15

Natural features line signatures in the AC-Style at z14

Natural features line signatures in the AC-Style at z13

So much for new linear natural features. What you can also see in the above sample renderings is that i added special rendering of unconnected line starts and ends for natural=ridge and of line starts for natural=gully. There is no built-in support for that, so this is manual work using point symbolizers. Line starts are relatively easy – you just have to design a half-circular closing for the line pattern on the left side and render it with the proper orientation at the starting point (see the sketch below). Line ends are more difficult because the pattern can – due to the way it is rendered by Mapnik – end at any phase of the pattern image and therefore will not precisely match a static line end cap image. For a fine structured pattern like the one used for ridges that is not much of a problem, but for other patterns different solutions would need to be employed.
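Computing the placement for such a start cap boils down to the first vertex and the initial bearing of the line – roughly like this (Python sketch; in the style itself this is done on the SQL/CartoCSS level):

```python
import math
from shapely.geometry import LineString

def start_cap_placement(line: LineString):
    """Anchor point and rotation angle for a line-start cap symbol."""
    (x0, y0), (x1, y1) = line.coords[0], line.coords[1]
    # rotate the half-circular cap image to match the line direction
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0))
    return (x0, y0), angle
```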

As i mentioned in the first part of this series, line patterns in Mapnik work by piecewise linear mapping of the pattern image onto the linestring. This rules out anything like rounded line joins. The problem is that with many pattern images, corners in the linestring at the wrong location can lead to odd discontinuities. You can widely observe this with the cliff pattern: when a dent of the cliff pattern is located directly at a corner of the cliff, it frequently produces artefacts.

What you can do to mitigate this problem is (a) offset the curve towards the center of the pattern rather than centering the pattern image on the centerline (as is done in OSM-Carto) in the case of asymmetric patterns like cliffs, and (b) minimally round the corners using ST_OffsetCurve(). This avoids individual sharp corners. It does not allow for true round line joins, but it can help avoid the most severe artefacts.
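In shapely terms (as a stand-in for the PostGIS ST_OffsetCurve() based production code; the numbers here are arbitrary), the two measures look like this:

```python
from shapely.geometry import LineString

line = LineString([(0, 0), (40, 0), (40, 40), (80, 40)])

# (b) minimally round sharp corners: offset the line out and back
#     again with round joins
eps = 0.5
rounded = (line.offset_curve(eps, join_style="round")
               .offset_curve(-eps, join_style="round"))

# (a) for an asymmetric pattern like cliffs, offset the drawing curve
#     towards the center of the pattern instead of centering the
#     pattern image on the mapped line
half_pattern_width = 3.0
drawing_line = rounded.offset_curve(half_pattern_width, join_style="round")
```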

Plain line pattern (left) and with line join tuning and artefact reduction techniques (right) – double resolution version

Another approach that can help with more harmonious corner rendering is to split the line signature into several line patterns and render them independently. The narrower the pattern image is, the less corners in the line disrupt the pattern image. This is the approach i used for natural=gully, where the left and right sides of the line signature are rendered independently. This means that the left and right sides are, in general, not in sync along the course of the line. How much of a problem that is depends on the line signature used.

Another problem with line patterns is junctions between several lines. Again – Mapnik does not have any built-in support for that. The patterns for natural=ridge and natural=arete were specifically designed to work gracefully in such situations, but this does not equally apply to natural=gully, where junctions are a common occurrence. Therefore, cutting the lines with the connecting geometries, in a similar way as i have done with sidewalks and implicit embankments, is required.

Junctions with line pattern – rendered as is for ridge and arete, with manual management based on separate rendering of left and right side for gully – double resolution version

Differentiated rendering of barriers

The other group of linear features i have looked at are constructed barriers tagged with barrier=*. Here i implemented the following:

  • The differentiated rendering of barrier=fence, barrier=guard_rail, barrier=wall, barrier=retaining_wall and barrier=ditch at the higher zoom levels
  • The differentiated rendering of barrier=wall and barrier=retaining_wall with height<=0.5 – because in practical meaning they are substantially different from higher walls.
  • The ground unit rendering of barrier=wall, barrier=retaining_wall, barrier=ditch and barrier=city_wall/historic=citywalls according to tagged width=*

Here is how this looks:

Differentiated rendering of barriers with ground unit rendering based on width tag - z19

Differentiated rendering of barriers with ground unit rendering based on width tag - z18

Differentiated rendering of barriers with ground unit rendering based on width tag - z17

Differentiated rendering of barriers with ground unit rendering based on width tag - z16

Barrier styling variants for low height walls - z20

Barrier styling variants for low height walls - z19

Barrier styling variants for low height walls - z18

Barrier styling variants for low height walls - z17

Barrier styling variants for low height walls - z16

The way variable width rendering is implemented differs between the barrier types. barrier=wall and barrier=retaining_wall use a combination of different solid and dashed lines. barrier=city_wall/historic=citywalls in addition uses a line pattern for the distinct rendering of the outside face of the wall. And barrier=ditch uses a parametrized line pattern. This works somewhat similarly to the tree symbols technique – a sequence of line pattern SVGs is generated via script for different line widths, based on a parametrized pattern definition in SQL code. A simplified sketch of this idea is shown below.
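The principle of the parametrized pattern generation, much simplified – the pattern shape and file names here are placeholders, not what generate_line_patterns.py actually produces:

```python
def ditch_pattern_svg(width_px: int) -> str:
    """Generate one line pattern SVG for a given drawing width."""
    h = width_px
    w = 4 * width_px                    # pattern repeat length
    d = f"M0,0 L{w / 2},{h} L{w},0"     # simple V-profile placeholder
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{w}" height="{h}">'
            f'<path d="{d}" fill="none" stroke="#808080"/></svg>')

# one pattern file per supported drawing width
for width in (4, 6, 8, 12, 16):
    with open(f"ditch-{width}.svg", "w") as f:
        f.write(ditch_pattern_svg(width))
```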

By the way - both barrier=ditch and natural=gully work decently together with a waterway:

Ditch and gully with waterways - z19

Ditch and gully with waterways - z18

Ditch and gully with waterways - z17

Ditch and gully with waterways - z16

Further details

I made some adjustments to the rendering of dykes – which i had already introduced as part of rendering implicit embankments. They are now rendered with a variable width of the line pattern, similar to the method used for barrier=ditch. They are also rendered slightly asymmetrically, with a wider slope on the right side and a narrower slope on the left side – dykes typically have an asymmetric profile, with the wider and gentler slope on the seaward side. Unfortunately, no convention regarding the directionality of dyke mapping is established in OpenStreetMap so far, so this is more a demonstration of what could be done with consistent mapping. For this to work, the line patterns for dykes are rendered separately for both sides, similar to how it is done with natural=gully.

Rendering of dykes with variable width and in comparison to implicit embankments at z18

Of course for wider dykes changes in direction will not look that good. The same applies to combinations with roads/paths, with or without implicit embankments.

Dyke rendering with context at z18

I promised above that i would quickly discuss the rendering of barrier=entrance – which is not a new feature but has been around in the AC-Style for some time. barrier=entrance is used to map an entrance within an otherwise continuous barrier. In a nutshell: it is similar to barrier=gate, just without the gate. To render this you need to interrupt the rendering of the barrier at the node tagged barrier=entrance, and that interruption needs to be adjusted in width to the drawing width of the road/path crossing the barrier there (see the sketch below). In addition i mark the ends of the barrier around the entrance with small dots – making it clearer that this is an explicitly mapped entrance and not a gap in barrier mapping. Here is how this looks for various types of barriers:
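Geometrically, cutting the gap amounts to something like the following (Python/shapely sketch of the idea, not the style's actual SQL):

```python
from shapely.geometry import LineString, Point

def cut_entrance(barrier: LineString, entrance: Point, road_width: float):
    """Interrupt a barrier at a barrier=entrance node with a gap
    matched to the drawing width of the crossing road."""
    gap = entrance.buffer(road_width / 2)
    opened = barrier.difference(gap)            # barrier with gap cut
    # the small dots marking the cut ends sit where the barrier
    # crosses the boundary of the gap
    dots = barrier.intersection(gap.exterior)
    return opened, dots
```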

barrier=entrance in combination with various barrier and road types at z19

Finally: I added support for dedicated rendering of natural=cliff with surface=ice. The background for this is that in polar regions ice cliffs are a widespread feature, in particular at coasts – the majority of the Antarctic coast is formed by ice cliffs. Here is an example to give you an idea.

Typical ice cliff coast in the Antarctic

This is not widely mapped explicitly with natural=cliff so far, but it would clearly be a significant aspect of more detailed mapping of polar regions.

Here is how the new rendering looks in typical contexts – both within a glaciated area and at the edge towards the three different colors of waterbodies in the style.

Ice cliffs (natural=cliff + surface=ice) in their expected color contexts at z18

Practical Examples

And here are a few examples with real world data of some of the features discussed.

natural=gully at z16


natural=earth_bank and barriers at z18


Barriers at z19


Barriers at z19


Citywalls and other barriers at z19

Implementation notes

As i already did with the point symbols and the polygon patterns, i moved to colorizing the line pattern images via script, to allow color adjustments to be applied consistently without manually editing several SVGs. The script doing that (generate_line_patterns.py) also implements the generation of line width sequences for the parametrized line patterns as used for barrier=ditch.
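The colorizing part of such a script can be as simple as rewriting the color attributes of the SVGs from a central color table – a crude sketch of the principle; file names and colors are made up, and the actual script is organized differently:

```python
import pathlib
import re

COLORS = {"cliff": "#444444", "earth_bank": "#808080"}

for name, color in COLORS.items():
    path = pathlib.Path(f"{name}.svg")
    svg = path.read_text()
    # replace all stroke/fill colors with the configured one
    svg = re.sub(r'(stroke|fill)="#[0-9a-fA-F]{3,6}"', rf'\1="{color}"', svg)
    path.write_text(svg)
```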

Furthermore, i also added a feature to the scripts processing the symbols and patterns to generate contact sheets, which can be found in symbols/README.md. These are rendered not with Mapnik but with ImageMagick, so they can be used both for quick visual verification in symbol design and for evaluating issues in Mapnik rendering by providing an independent reference.

Summary and Conclusions

What i showed in this blog post is how line patterns and other means available in Mapnik and CartoCSS can be used to generate complex line signatures beyond the naive use of line patterns, while avoiding a fully manual implementation like the one i did for the trees. I showed various techniques that can be helpful, in particular:

  • manual drawing of line cap images to supplement a line pattern
  • adjusting the centerline location and joins of a line pattern based line to mitigate common artefacts
  • splitting a complex line signature into components for better quality rendering of corners
  • ground unit rendering in combination with complex line signatures
  • techniques for rendering junctions with line patterns
  • examples of various ideas for line signature design for different purposes

Still – i hope it also became visible that quality and design possibilities remain massively limited, even when using these techniques, compared to the state of the art in pre-digital map design.

It is only possible to go substantially beyond what i showed here with common map rendering frameworks if you draw things manually at the micro-element level, without relying on built-in higher level features of the renderer like line patterns. I took this approach for trees and tree rows in the past, but i deliberately did not want to go that route here – instead i wanted to explore what can be done using the built-in higher level means.

In the third part i am going to take this a bit further on a specific class of features.

DAV map 'Brenta' from 1908

July 1, 2023
by chris
1 Comment

Drawing the line – on the use of line signatures in maps

This blog post is the first part of a short series of blog posts discussing the use of lines in maps, in particular of more complex line signatures.

Let us first consider lines as a micro-element in map design, which includes not only a simple hairline or a line of constant width but also anything that in classical map drawing is produced with a directed stroke with a drawing tool like a pen, brush or nib – or with a similar stroke with a cutting tool when the map is produced by directly engraving a printing plate.

Lines in this sense are traditionally the most important and most widespread design element used in maps. This applies in particular to the golden age of cartography in the 19th and early 20th century, when map production techniques typically limited the use of color. Line drawing was extensively used not only for depicting things like rivers, roads and boundaries, but also – in the form of hachures and later contour lines – to depict relief, and through water lines to depict water areas, for example. Here are some examples from the late 19th and early 20th century.

Topographic Atlas of Switzerland 1:50k 1888

Lake Titicaca by Rafael E. Baluarte, Geographical Society of Lima, 1893

Pre-1945 German TK25 sheet 3214 Vestrup

But this extensive use of lines had a substantial drawback: it made map production very work intensive. Every stroke in the line work of a map was the manual work of a highly qualified map designer.

Daniel P. Huffman’s discussion of water lines, for example, cites an estimate that drawing of water lines alone accounted for 18 percent of the production costs in some cases.

This, together with the changes in map printing technology allowing the use of a wider range of colors and color intensities through half-toning, significantly decreased the use of lines as design elements in favor of more extensive use of color – in the second half of the 20th century, but especially with the rise of digital technologies in map production.

The move to digital map production techniques severely limited the use of lines as map design elements, because initially the digital technologies in typical use:

  • only allowed for simple constant width lines following a sequence of straight segments (a linestring) with the only design parameters typically being line cap and line join shape and potentially some form of dashing following a fixed pattern along the course of the line.
  • required the drawing of these simple lines to be directly tied to and based on a linestring or polygon outline geometry.

In other words: The use of lines in map design has over the last 50 years essentially been reduced to lines as a macro-element of constant width, optionally with a regular dashing pattern of some sort. This development was the combined result of cost cutting in map production and the loss of capabilities in the move from more flexible analog map production technology to more simplistic and more limited digital systems.


Pre-Digital (left) and current digitally produced (right) German TK25 of Mont Royal (see larger interactive map)


Pre-Digital (left) and current digitally produced (right) German TK25 of Lilienstein (see larger interactive map)

History of line drawing in OSM-Carto

Among digital map rendering frameworks, the one used by OSM-Carto – Mapnik – has from the early days on been a bit more flexible than other frameworks by providing the possibility to render line patterns. Line patterns are images drawn along a linestring geometry as a line signature. Initially these could only be raster images – which severely limited their use due to the inevitable lack of precision in rendering. Still, this substantially increased design options beyond the basic line of constant width paradigm. In other words: it substantially widened the ability to work with lines as a macro-element, that is, working with linear geometries within cartographic data and drawing something (which can be, but does not necessarily have to be, lines as a micro-element as discussed above) along the course of a line.

Line pattern rendering principle in Mapnik

OSM-Carto’s main use of line patterns initially was the rendering of cliffs and embankments.

Later versions of Mapnik allowed using SVGs for that purpose, greatly improving the quality of line pattern rendering. For cliffs and embankments in OSM-Carto this was adopted in 2017.

Cliffs and embankments in OSM-Carto

Despite their flexibility and usefulness, line patterns in Mapnik are a fairly limited way of defining more complex line signatures. The pattern image is mechanically placed and cut in a piecewise linear fashion along the course of the linestring, with no possibility to define line joins or line caps, or to otherwise adjust the line pattern locally to the geometry. In the rendering of tree rows i showed here some time ago, for example, i had to manually render the line signature from individual tree symbols to get the desired design. When introducing the rendering of ridges/aretes in my style (based on a design idea that was originally developed for OSM-Carto by someone else) i used line patterns, but cut out peaks from the line for a better readable rendering. The pattern idea was adopted by OSM-Carto in 2019 – but without the cutting out of peaks.

Ridge and arete rendering in OSM-Carto


Ridge and arete rendering in the Alternative-colors map style

More recently Mapnik got another feature improving the rendering of complex line signatures – the use of 2d patterns (like the ones used for polygon fills) as a drawing signature for lines. In contrast to line patterns, where the pattern image is rendered along the line, the 2d pattern rendering uses a tiled 2d pattern as fill, not rotated along the direction of the line. This simplifies the use of 2d patterns for line rendering, which previously had required compositing operations. OSM-Carto now uses this feature for unpaved roads.

Unpaved roads rendering with 2d structure pattern in OSM-Carto

Apart from the shortcomings w.r.t. rendering more complex line design concepts (where Mapnik, however, is still no worse than most other map rendering tools) there is also the severe and annoying limitation that the line caps of a line as a whole also define the line caps of the dashing pattern used, and both cannot be adjusted independently of one another. But that is just a side note here.

Conclusion

I think this illustrates a bit on what a very basic (not to say primitive) level line drawing in digital maps often still is these days, compared to what was the state of the art in pre-digital map production. This very basic level is largely owed to the technical limitations of map rendering frameworks – but also to the limited interest of many map designers in actually exploring the technical capabilities that do exist. In the next blog post i am going to show a bit of what can be done with the means that line patterns offer in OpenStreetMap based map styles.

Green Marble image with bathymetry shading

May 28, 2023
by chris
1 Comment

Interactive maps update

I have – after many years – made an update to the generalized map rendering demonstration on maps.imagico.de. Since i first showed this demo i have made several improvements to the underlying generalization processes, some of which i have discussed here on this blog in the past. But running all of these processes on a global level in a synchronized fashion is always a lot of work – for good results the different components shown in the map (rivers, lakes, coastlines, glaciers, builtup areas and relief) each need to be harmonized in processing with the others.

In addition to the various improvements made over the years, i put some more work into improving the relief generalization methods for small scale applications. Here is a comparison of the raw relief shading with two different generalization parameter sets at z6.


Two different relief generalization parameter sets for comparison

The waterbody depiction is – as discussed before – based on a detailed analysis of the global river network; here are two visualizations of that to give you an idea. The selection and rating of the river segments to show at a certain scale is based on this analysis.

Water network analysis for Europe and western Asia – click for larger area

Detail of water network analysis in France – click for larger area

Data processing is available up to zoom level 10 for the waterbodies – for the glaciers and builtup areas also beyond that. For regional subsets i have done the water network analysis and processing up to z13. The demo map is now extended to zoom level 8 – previously it only went to zoom level 7. Here are a few examples with links to the interactive map.

Southern Peru with topography style at z5

Caucasus mountains with topography style at z8

East Africa with landcover style at z5

Southern Patagonia with landcover style at z8

So far i have only updated the Mercator map in the landcover and topography layers. Other layers and maps will be updated as time allows.

Satellite image with bathymetry shading

In addition to the updates of the generalized map rendering demonstration, i have worked on a variant of the satellite image layers with bathymetry depiction instead of ocean color rendering. This kind of rendering is fairly popular in the interactive maps you can find in various popular map services.

The origin of that trend is that satellite image mosaics tend not to provide a good quality water depiction. My Green Marble is essentially the only product currently available on the market that provides a global ocean depiction suitable for high quality visualization. To avoid having a lot of empty space in their satellite image layers, map producers opted for depicting the bathymetry – for which data was and is routinely available. How exactly the bathymetry is rendered varies (and this is actually an interesting topic to study). But in principle this idea, originally born out of the quandary of having no good image data, turned out to be something that can be made to work quite well.

The main difficulty with that is where to draw the line between the bathymetry depiction and the optical imagery shown for land areas. If you use a coastline mask for that, you end up cutting away meaningful imagery – both where the coastline mask is inaccurate and in tidal and shallow water areas like reefs, where the image data is meaningful. To deal with this, most popular map services use a manually drawn mask – which can sometimes be a bit curious in the arbitrariness of the choices made.

I have the advantage that with the Green Marble i have high quality data for the oceans, which allows me to transition from optical imagery to bathymetry rendering based on globally consistent, objective criteria. I have produced two variants of this – one with and one without sea ice depiction. Both can now be studied in the demo map.
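Conceptually the blend is a per-pixel interpolation controlled by such a criterion – sketched here with a generic water measure and made-up thresholds; the actual criterion used for the Green Marble based rendering is not shown:

```python
import numpy as np

def blend_optical_bathymetry(optical, bathymetry, water_measure,
                             lo=0.4, hi=0.8):
    """Transition from optical imagery to bathymetry rendering based
    on a per-pixel, globally consistent water criterion.

    optical/bathymetry: (height, width, 3) color arrays
    water_measure:      (height, width) array, higher = more water-like
    """
    t = np.clip((water_measure - lo) / (hi - lo), 0.0, 1.0)
    return (1.0 - t[..., None]) * optical + t[..., None] * bathymetry
```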

Here are also a few examples of selected areas with links to the interactive map.

Sicily in Green Marble with bathymetry depiction

Bahama Banks in Green Marble with bathymetry depiction

Norway coast in Green Marble with bathymetry depiction

Hawaii in Green Marble with bathymetry depiction

All of these are available for licensing and for producing custom visualizations and custom data processing using the techniques demonstrated. Contact me if you are interested.

Competing priorities - symbols and labels in rule based map rendering

May 17, 2023
by chris
0 comments

Competing priorities – symbols and labels in rule based map rendering

I have a significant backlog now of changes i made to the alternative-colors style that i have not discussed here yet. This blog post is about the largest and most significant of these changes, related to the way point symbols and labels are rendered.

Point symbol and label rendering in OSM-Carto has been in an unsatisfying state for a long time. This is largely because almost no fundamental work has been done on this part of the style for many years, while the complexity, in the form of the variety of different label and symbol types, has grown continuously over time. Or in other words: there has been continuous pressure to add new features in the form of labels and symbols to the style, combined with an almost complete absence of interest in actually reworking the system of label and symbol rendering in a way that scales and can handle this complexity.

The core problem discussed in this blog post is the depiction of blocking elements in an automatically rendered map style. Blocking elements are graphical elements that – as a matter of principle – are not supposed to overlap each other. This is the case for all labels in any map, as well as typically most pictorial symbols used.

Different methods exist to implement the requirement for elements to not overlap. The most significant one is optimizing the placement of symbols and labels accordingly. This whole field consisting of many different techniques i will not cover in this post.

Most digital rule based map rendering systems, however, are fairly primitive when it comes to the handling of blocking elements. In those cases the only way of enforcing the non-overlap is by not rendering elements that would overlap other elements which have already been rendered. There is typically no support for cutting out/masking or otherwise modifying symbols to avoid overlaps (something i demonstrated with tree symbols in a previous post), and options to tie the rendering of graphical elements to one another are typically very limited. One of the oldest open issues on the OSM-Carto issue tracker deals with exactly that limitation. I am also going to discuss this here, although i will not present a solution, because that is not possible without support from the rendering software.

The OSM-Carto point symbol and label system

The approach to point symbol and label placement used by OSM-Carto in light of these limitations is to separate symbols and labels into two completely independent layers. These two layers meanwhile use the same SQL code (which makes maintenance a bit easier with this kind of long query).

This essentially means the style places all point symbols meant to be shown at a specific zoom level, as far as this is possible without overlaps, and then places all labels that still fit in between the symbols. That means point symbols universally have priority over labels. And if there is only space for the symbol and not the corresponding label, then only the symbol is rendered. The sketch below illustrates this greedy scheme.
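As a toy model of this behavior (not Mapnik's actual implementation):

```python
def place_blocking_elements(symbols, labels, collides):
    """Greedy non-overlap placement: all symbols first, then the
    labels that still fit in between.

    symbols/labels: lists of element bounding boxes
    collides(a, b): True if the two boxes overlap
    """
    placed = []
    for element in symbols + labels:    # symbols get priority
        if not any(collides(element, p) for p in placed):
            placed.append(element)
    return placed
```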

Conflicting symbols and labels in OSM-Carto

But because the symbols and labels are rendered completely independently, this also means that labels will be placed even for features where no symbol has been rendered. It would be possible to connect the two and render the combination of symbol and label only if there is space for both (via the shield symbolizer) – but only with both having the same priority. It is not possible with the tools used by OSM-Carto to render the labels with lower priority than the symbols and still tie the display of the labels to the corresponding symbol also being displayed. This leads to the awkwardness of offset labels (which are meant to show up below or above the corresponding symbol) being displayed without the corresponding symbol – causing confusion because it looks like either the label belongs to a different feature or the label is at the wrong location (because of the offset).

There are several other big issues with the point symbol and label rendering system in OSM-Carto. One is that the actual style rules for rendering the symbols and labels are an unstructured >3000 lines of CartoCSS code that is essentially unmaintainable. Rules have just been added over time in no particular order, and there is plenty of dead code in there, leading to confusion. This mess is a major source of styling errors like symbols or labels having the wrong color or appearing at the wrong zoom levels.

The other big problem is that the drawing order within the two layers (which defines which symbols have priority over which other symbols and likewise for labels) is arbitrary at the moment. Developers have usually added new feature types without making a conscious choice about the priorities. At the same time not all point features are in the two main POI layers described above. Because it is considered desirable for the labels of major POIs like shops and restaurants to have priority over minor POI symbols like benches or wastebaskets there is a separate pair of low priority POI layers.

As a result of this it is not uncommon for point symbols and corresponding labels to turn up and vanish again as you zoom into the map in a highly non-intuitive way.

Under the described constraint of not being able to tie the label rendering for a feature to the successful placement of the corresponding symbol without rendering it with the same priority, there are mainly two sensible strategies to pursue – neither of which is ideal:

  1. to keep rendering the symbols and labels separately. This leads to the known problem of having offset labels without symbols in the map. Also, since the labels will universally have lower priority than the symbols – independent of the starting zoom levels – there will still be cases where labels vanish as you zoom in, because new symbols with priorities higher than the labels start turning up.
  2. to tie the label to the symbol and render them with the same priority with the disadvantage that in particular lengthy labels for high priority symbols will block lower priority symbols leading to, overall, a lower number of POIs being displayed. This has the advantage of universally avoiding the ‘offset label without corresponding symbol’ phenomenon. It also avoids the vanishing symbols problem, at least as long as symbols and labels start at the same zoom level.

As explained both approaches have their disadvantages. What we would really like instead is rendering the symbols as in the first approach but then render the labels only for those features for which a symbol has been successfully rendered. This, however, is not possible with the current tools.

Prioritizing by starting zoom level

Independent of which of the two strategies in symbol and label rendering we ultimately want to choose – the key to minimize the effect of symbols and labels vanishing as you zoom in is to prioritize them by starting zoom level. This in principle is possible to do in the OSM-Carto setup – but it would be a substantial amount of tedious work to adjust the priorities accordingly. And every time the starting zoom level of some symbol type is changed you have to re-arrange the whole thing. So this is definitely not sustainable.

This need to consistently prioritize rendering by starting zoom level was the main reason for me to look into re-designing the whole symbol and label rendering in the AC-Style. The basic idea is to specify the design parameters (like symbol shape and color, label design) and the starting zoom levels of all the different feature types and variants rendered and then have a script automatically generate the MSS and SQL code necessary to render these with the desired prioritization.
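
To illustrate the idea, here is a minimal sketch in Python – with a hypothetical feature list and invented field names, much simpler than the actual AC-Style generator – of how prioritization by starting zoom level can be derived automatically from such a specification:

# minimal sketch: derive the drawing order (= priority) of point
# features from their starting zoom levels; the feature list and
# field names are invented for illustration
features = [
    {"type": "firepit",        "from_zoom": 17},
    {"type": "wilderness_hut", "from_zoom": 15},
    {"type": "picnic_site",    "from_zoom": 16},
]

# features that appear at lower zoom levels are placed first and
# therefore block those that appear later
features.sort(key=lambda f: f["from_zoom"])

for priority, f in enumerate(features):
    print(f"priority {priority}: {f['type']} (from z{f['from_zoom']})")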

If you look at the auto-generated MSS code you will notice the absence of zoom level selectors (with the exception of those cases where design changes with zoom level). That is because for prioritization the starting zoom level needs to be available on the SQL level anyway so it is prudent to also filter by zoom level in the SQL query already. In other words: In contrast to classic OSM-Carto the SQL query of the symbols and labels layer only returns those features that are actually rendered and the MSS code only selects how to render them.
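
A hypothetical fragment of SQL generated this way could look like the following – assuming, as in OSM-Carto style code, a z() helper function that converts the Mapnik !scale_denominator! token into a zoom level; the priority values are invented for illustration:

-- hypothetical fragment of generated SQL: features are filtered by
-- their starting zoom level and ordered by priority
SELECT way, 'wilderness_hut' AS feature, 0 AS prio
  FROM planet_osm_point
  WHERE tourism = 'wilderness_hut' AND z(!scale_denominator!) >= 15
UNION ALL
SELECT way, 'picnic_site' AS feature, 1 AS prio
  FROM planet_osm_point
  WHERE tourism = 'picnic_site' AND z(!scale_denominator!) >= 16
ORDER BY prio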

Label formatting

In addition to the difficulties of rendering labels and symbols together as discussed above, there is another annoying thing about label rendering in a CartoCSS based system. Carto does not provide the means to format different parts of a label with different styling, despite Mapnik having supported this for quite some time. This means that if you want to format different parts of a label differently you have to use separate labels – resulting in the same problem as with symbols and labels: you have no control over which of them actually get rendered in the presence of other competing blocking elements.

Carto at one time had a hack that allowed integrating Mapnik XML code into a CartoCSS label string, this way supporting differentiated label formatting. But that feature was removed at some point. I re-added it and use it in the AC-Style – which, however, requires using a custom Carto version.

Notes on the implementation

Since i am not a software developer and have neither the ability nor the ambition to develop a generic framework for symbol and label rendering, the implementation of what i described above in the AC-Style is fairly awkward and non-ideal. This applies both to the python code generating the MSS and SQL and to the form in which you specify the symbol and label design that is interpreted by the script.

I emphasize this because elsewhere i have on several occasions pointed out the importance of tools in map rendering being developed for the needs of the map designer rather than for the convenience of the software developer. And this mock-up point symbol and label rendering definitely does not meet this requirement. It is merely a demonstrator showing the concept in a relatively simple form. While this will be self-evident to most readers, i feel it is prudent to point out that software developers should not take this as a blueprint for their work.

The important thing to note is that the script implements both of the strategies i described above – you can choose which of them to generate the code for. To avoid this blog post getting too long i will not discuss the details of how this works here.

Modularization of the style

What i want to mention though is that, to be able to use script generated SQL code for the symbol and label layers, i modularized the project.mml file. There is now a layers subdirectory containing the individual layers (or blocks of several layers) and there is another script that assembles these into a single project.mml file based on a layers.yaml configuration file.
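
To give an idea what this can look like, here is a hypothetical sketch of such a layers.yaml file – the exact keys and structure of the real file will differ:

# hypothetical sketch of a layers.yaml configuration -
# key names are invented for illustration
layers:
  - file: landcover.mml
  - file: roads.mml
    tags: [roads]
  - file: symbols-labels.mml
    generated: true        # produced by the symbol/label script
    tags: [pois]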

Apart from technically facilitating the use of auto-generated layer definitions there are mainly two reasons for this:

First – with the increasing size of the SQL queries in the AC-Style (in particular the roads layer, of course) project.mml became fairly hard to maintain, and i was also thinking about how to facilitate easier debugging of the roads layer – which would probably involve auto-generating it as well.

Second – i am thinking about modularizing the style to allow users to choose which of the more sophisticated (and often slower) features they want to use and which not. The layers.yaml file allows specifying tags for the different layers and these should in the future allow generating custom variants of the style with specific features being selectively enabled or disabled. So far this is just the basic setup that allows this in principle; i have not actually done any work yet to make it practically useful.

Practical results

Here are a few examples of POI rendering with the new system. First the version with symbols and labels rendered separately (similar to OSM-Carto, but with consistent priorities):

AC-Style with separately rendered symbols and labels


and in the combined version:

AC-Style with combined symbol and label placement


Note that in this second variant, if the combined shield symbolizer (symbol+label) cannot be placed successfully, the symbol-only variant is tried. And if that fails, a centered label is tried as well (second from the right). This looks a bit like the third from the right in the separately rendered version, where the offset label and the symbol of two different features are shown together in a misleading fashion. Note, however, that here symbol and label are both centered on the respective POI locations, hence this is much less misleading.

You can also see the differentiated formatting of the label with the elevation values in italics. The CartoCSS code for that (requiring – as indicated – a patched version of Carto) is

shield-name: '[name]<Format face-name="Noto Sans Italic" size="9">[int_elevation]</Format>';

Another example is derived from real world data that i once mentioned as a good example of how the inconsistent prioritization leads to confusing results in OSM-Carto. The data there has meanwhile changed, so the map on osm.org does not show this any more – and i made some adjustments so it serves better as a demonstration.

First, how it looks in OSM-Carto. This is a sequence of samples as you zoom in from z15 to z19. The peak on the right gives you a sense of the change in scale.

OSM-Carto z15

OSM-Carto z16

OSM-Carto z17

OSM-Carto z18

OSM-Carto z19

At z16 you can see that the hut symbol vanishes because it is blocked by a picnic site symbol – which starts being shown at z16 with higher priority than the wilderness hut, even though the hut was already visible at z15. The hut label remains because it is rendered independently of the symbol. At z17 both the picnic site symbol and the hut label get blocked by the newly appearing fire pit symbol, which continues to block the hut label (but not the symbol) at z18 – while the tower symbol appears. At z19 hut, picnic site and fire pit are all visible – plus some other minor symbols – while the tower symbol vanishes, being blocked by the newly appearing map symbol. And the tower label newly appears at z19, just as the corresponding symbol vanishes – adding further to the confusion.

I admit this is a bit engineered for demonstration – though not that much: the original mapping in 2022 already contained many of the flaws demonstrated here. So these are not extremely rare corner cases; these are things happening in the map all the time.

Let's see how the new setup in the AC-Style deals with this. First the variant with separate symbols and labels:

AC-Style separate z15

AC-Style separate z16

AC-Style separate z17

AC-Style separate z18

AC-Style separate z19

And second – for comparison – the variant with symbols and labels rendered together.

AC-Style combined z15

AC-Style combined z16

AC-Style combined z17

AC-Style combined z18

AC-Style combined z19

As you can see, there are no vanishing symbols in either variant any more as you zoom in. And labels vanish in the first version only. But tying the labels to the symbols of course leads to fewer symbols being shown, because they are blocked by labels. This is only a small factor in this specific test example; in practice the effect is often quite significant.

Summary

What i showed here is – as already indicated in the beginning – not a solution to the core problem of combined point symbol and label placement in maps. That would require support from the underlying rendering framework. It does, however, demonstrate how consistently prioritizing labels and symbols by starting zoom level can lead to a much improved map viewing experience with fewer inconsistencies – and how such consistent prioritization can be ensured automatically in a Carto+Mapnik based map style.

January 26, 2023
by chris
0 comments

On the OpenStreetMap Foundation 2023 strategic planning initiative

There seems to be a new initiative, started by members of the board of the OpenStreetMap Foundation, to develop more elaborate strategic planning for the organization. This has so far not been widely communicated in public, but it is also not being deliberately developed behind closed doors, meaning that you can get some insight into what is happening by observing public communication – in particular the public board meetings, the minutes of both the public and the non-public meetings and the changes on the public OSMF wiki. Based on the lack of public discussion (not only on this but also on previous subjects of deliberation in the OSMF), not a lot of people in the OSM community seem to be following these, but i like to emphasize that what i am discussing here is based on public information available to all interested OSM community members.

Specifically, the documents on the new initiative on strategic planning can be found here:

Much of the text seems to be the work of Allan Mustard. I see this in a positive way as being developed by someone with a non-technical background, but also in a critical way as coming from someone very firmly rooted in US culture and cultural values. Also, Allan is deriving much of his experience from a career in the US federal government, which is one of the largest strictly hierarchical organizations on the planet (not the largest though – that is almost certainly the Communist Party of China). This shapes not only his analytic view but in particular also what kind of solutions and approaches to solving problems he considers.

Particularly remarkable is the first of the listed documents, which starts with the most critical analysis of OSMF procedures and actions i have seen from within the OSMF for a very long time. It does not directly criticize the OSMF but is phrased along the lines of what happens if you neglect to…. It is clear though that this characterizes the current state of the OSMF.

Based on my own observations i would say this is a pretty accurate analysis of how the adoption of a centralized, hierarchical work culture in the OSMF, without stringent strategic planning and management, has led to highly problematic muddling-through. This is unnecessarily combined with a disparaging remark about tiny family owned firms. That is probably how you could classify the vast majority of companies in the wider economic OSM ecosystem, many of which are a massive source of volunteer contributions to the OSM community in some form – which makes this, coming from the OSMF, kind of a biting-the-hand-that-feeds-you statement. And in my experience it is not well grounded in reality either: many of the family owned businesses i know have much more solid strategic planning than larger corporations – which might produce colorful glossy plans for their investors that are, however, often not worth the paper they are printed on. Muddling-through is something i see much more frequently in large corporations and institutions than in small businesses. But that might be a bit of a US-Europe business culture difference. Or it might simply be that i know too few badly managed family businesses (which evidently exist). I am getting side-tracked here though, back on topic.

The spot-on analysis of the current issues with the OSMF muddling-through is a good starting point and could then have led to a discussion of the two main options to address that problem:

  • Changing the work culture of the OSMF to be less centralized and hierarchical, introducing a clear subsidiarity principle, becoming more open towards the very different work culture of the OSM community, all the things i have been suggesting for years.
  • Moving to a full adoption of central hierarchical management ideas.

Both are, in principle, valid approaches to the problem. And i don’t want to say that choosing the first is the only defensible decision. What in my opinion makes the first one so much more viable and less risky is the nature of OpenStreetMap as a highly non-hierarchical openly cooperative and do-ocratic project. Choosing the second approach means rejecting the core elements of cooperation within OpenStreetMap as the basis of cooperation in the OSMF. And that in itself creates substantial problems.

One likely effect of moving to full hierarchical management would in particular be the continued and accelerated replacement of hobbyist volunteers in the OSMF with borrowed labor from corporate/institutional stakeholders. Hobbyist volunteers tend to have very little inclination to volunteer their time for work under hierarchical management, especially if that management is in pursuit of economic goals.

I want to point out in particular that, because OpenStreetMap as a whole is so deeply non-hierarchical, we have quite a lot of projects in the OSM community that have, over the years, collected valuable experience with what works and what does not in non-hierarchical human cooperation at scale. So the OSMF would be in an excellent position and would have good access to competent advice in that regard. Of course there are also plenty of examples outside of OpenStreetMap for less centralized approaches to human cooperation. Even in the military domain – most strategies of guerrilla warfare focus on avoiding centralized hierarchies and build on small, fully autonomous units.

Economic direction and business model

When i look at the writings regarding the strategic planning, the most striking overall observation probably is the absence of anything of substance regarding the basis of the strategic plan: values and mission. There is a fairly elaborate tree structure of ideas and tasks to pursue, but no foundation below it that motivates why these things are significant and not others. I see two main reasons for that:

The first is that this is the really hard part. As i have pointed out in the past, the OSMF has over many years substantially neglected and undervalued intellectual work relative to technical work, and this is the kind of situation where that becomes a serious problem. Past attempts of the OSMF at drafting and adopting value statements have also not had very positive results.

The second reason: i think the OSMF board at the moment tries to preserve strategic ambiguity regarding the core conflict of the organization. The previous strategic plan outline (which the OSMF board had adopted as official policy) positioned itself fairly clearly as aiming to move OpenStreetMap away from being primarily a social project of cross cultural cooperation and towards the goal of collecting useful geodata (demonstrated in particular by putting individual local knowledge contributors on the same level as corporate providers of satellite imagery and AI mapping tools). This idea still shines through in this year’s texts – but it is more moderated now, preserving at least formally the possibility to move in either direction in terms of goals.

The main reason for that is probably the Overture initiative.

To explain that i need to provide a bit of context. Right now the OSMF needs a cash inflow of about half a million per year to pay its current expenses (mostly personnel) and the text makes clear that the authors think this needs to grow significantly and permanently in the future (the arguments for that necessity are a bit sketchy – but i will not go into details on that here).

Before Overture, the OSMF had at least in principle the prospect of securing this kind of money (and possibly significantly more) by presenting itself as a provider of useful geodata to OSM data users and as insurance that this data continues to be produced and maintained. I discussed this perspective already in previous posts. This, however, seems to be exactly the market segment the Overture consortium is now targeting with its initiative.

An obvious reaction to that would be for the OSMF to reposition itself as what it has aimed to be for many years: the guarantor behind OpenStreetMap as primarily a social project of cross cultural cooperation in collecting local knowledge. Based on the confidence that the way OpenStreetMap collects local knowledge – through egalitarian cooperation of individual local mappers world wide – will continue to be an essential component for many applications in need of geodata, important enough to ensure financing of the OSMF, this would be the natural and logical reaction in my opinion (and it is what i hinted at with my hypothetical OSMF statement regarding Overture).

I have a bit of the impression that the problem here is that many in the OSMF these days do not have the confidence that what OSM is traditionally based on – grassroots mapping by people with local knowledge – is going to remain economically important enough in the future for the OSMF’s financial needs, including the perceived need to grow. And as a result they try to secure a piece of the cake Overture is eyeballing while trying to avoid alienating the global mapper community. I probably don’t need to mention that this is a fairly risky endeavor. In German we have a term for that: Der Versuch, auf zwei Hochzeiten gleichzeitig zu tanzen – the attempt to dance at two weddings simultaneously.

Of course the other route is not without risks either, especially since it touches the market segment HOT is currently occupying – in particular since HOT is increasingly trying to present itself as a supporter of local mapping communities in developing countries. So the OSMF would economically be kind of perilously wedged between two much larger players (HOT and the Overture consortium). Being extremely passive in clarifying its relationship with either (with HOT in particular w.r.t. trademarks, with the Overture partners w.r.t. ODbL compliance) is of course not of benefit there.

What is missing

As so often with plans and documents like this it is in particular interesting to look at what is missing and not just at what is there. Of course these are drafts and meant to be extended. Even at the early draft stage you can, however, already see priorities.

One astonishing thing – in light of how extensively the need for strategic planning as an instrument of hierarchical management is explained and motivated – is the complete absence of any thoughts on the need for oversight. Since the motivation and justification of the whole thing is based on a traditional western management perspective, which itself (as explicitly mentioned!) has its origins in military command structures, i could also put it in military terms: There is no disciplinary framework.

In corporate management the primary disciplinary mechanism through which people further up in the hierarchy are able to tell people further down in the hierarchy what to do is money. And since the strategic plan of the OSMF explicitly states that its focus is on volunteer work, this omission is particularly striking. If that is not addressed this will most likely lead to continued muddling-through based on the currently dominant people whose work we know and enjoy paradigm – meaning the primary instrument of exercising power in the management hierarchy would be shared personal interests and personal inter-dependencies between the management and the workforce. And since that is not very reliable as a mechanism of control, this could also lead to a further shift from volunteer work to paid work, simply because the latter is so much easier to handle in a management hierarchy.

Looking at the details of the plan with its clusters, we can see that the focus is – as kind of expected – on technical work and management. The whole domain of intellectual work is not covered. I will leave out some of the more concrete fields where i do work in the OSM context – map design and tagging documentation – because you could well argue that these are outside the scope of the OSMF. Though understanding the semantics of the data – which the OSMF evidently sees as one of its most important assets – seems like one of the things worth investing some resources in. And in light of how many resources the OSMF is investing in map rendering, the complete absence of map design from the strategic plan is in a way remarkable as well. But more important: What do you think of an organization that aims to support OpenStreetMap – a project that without doubt is highly unique and unusual in its social interactions – yet whose strategic plan nowhere includes the aim to gain a better understanding of how these social interactions work?

In corporate terminology: The cluster Strategic Research is notably absent.

But to be very clear: Just formally adding something like that will not have any benefit. You need some expertise in the field to actually make a meaningful plan, even if the main goal of the plan is to produce that competency. This is a chicken-and-egg problem.

This kind of closes the circle to my starting remarks. It is good to see an initiative in the OSMF that is based – for the first time in many years, as far as i can see – on a critical look at the OSMF with at least a bit of an outside perspective. But that initiative is only going to yield substantial positive results if the OSMF can actually convince knowledgeable people from a broad range of fields – from outside the OSMF and from outside the people whose work we know and enjoy of the OSMF – to contribute their expertise. And that will either cost substantial money or will require the OSMF to provide an environment that knowledgeable and experienced people find attractive to contribute to even without economic incentives. This rules out a management hierarchy as well as the system we have all too frequently seen in the past years, where decisions are made as a negotiation of interests rather than a battle of arguments.

Risks of the approach

And this gets me to what i see as the main concrete operational risks of pursuing the sketched procedure to develop a strategic plan. Because of the established work culture in the OSMF and because there is decidedly not a solid foundation in terms of values, mission and core goals that motivates the tasks planned, there is a strong possibility that this will not become a well structured plan of how to accomplish the goals of the organization but a collection of projects that are in the interest of those who have been most influential in developing the plan.

Another problem – and this possibility is already hinted at in the text – is that this will not be a plan pursued as a whole, in the belief that diligently pursuing all of these tasks is the key to successfully fulfilling the mission of the OSMF, but that it will instead become a catalogue of services the OSMF offers to its financiers, who cherry-pick from it whatever they consider suitable for serving their goals.

Conclusions

But i want to make clear that none of my critical comments here are a reason to abandon the idea of developing a strategic plan and instead continue with the muddling-through. Even with all the risks and deficits described, developing and publishing such planning would be a substantial improvement over the status quo. It would of course be good if there was a serious discussion of the alternative of de-centralizing much of the work of the OSMF as sketched in the beginning. But i realize that this is not very realistic as an initiative from within the OSMF at the moment. It would also be great if broader independent expertise was involved in developing this kind of plan from the ground up, with the broad range of competencies needed and without interests dominating over expertise and arguments. But i realize it is difficult, without a substantial budget, to recruit qualified people for that who are not primarily in pursuit of their own economic interests and who are still willing to work within a hierarchical management framework. And even if nothing like that happens, it would still have the advantage of making explicit the things the OSMF is pursuing, instead of the current muddling-through with no clear strategy visible from the outside but extensive pursuit of special interests behind the scenes.


January 19, 2023
by chris
3 Comments

Rethinking name labeling in OpenStreetMap map styles

As i have explained in the previous blog post, i wanted – in light of the situation of name tagging in OpenStreetMap having been stuck for many years now – to try presenting a proof of concept of how a solution to this could look: one that both demonstrates the practical benefits and shows a transition path from the status quo towards more elegant mapping concepts, while maintaining the paradigm of map design to support, but not actively steer.

The difficulty with doing that is that – as i have discussed many times now – the tools available for rule based map design have largely been stuck for even longer than name tagging in OSM. Commercial map producers have for the last ten years focused almost exclusively on creating maps for the language needs and preferences of their target customers – which is a completely different problem from creating a map that shows the locally used names everywhere on the planet.

Beyond that, the whole domain of label and symbol rendering in OSM-Carto has also been in a precariously unmaintained state for many years now. So when i decided to work on this i first needed to do quite a lot of yak shaving as a prerequisite for implementing the name rendering in a way that is not too awkward (and in particular: to avoid the need to make tons of changes in the essentially non-maintainable and fairly chaotic amenity-points.mss file). Many of the changes made (in particular the modularization of the layer definitions and the re-design of the POI symbol and label rendering with auto-generated SQL and MSS code) deserve a discussion of their own. In summary, all of these changes are highly experimental, even more so than most previous changes in the Alternative Colors style. Much of the design is code-wise not very elegant. As said: this is meant to be a proof of concept, not a ready-to-use product.

As i have explained in the previous blog post, the situation with mapping of local names in OSM is severely stuck. So the usual approach of OSM-Carto to very directly depict what mappers map is not going to work because the only thing to do then would be to universally stop rendering labels based on the name tag – which is evidently not a very constructive approach. There is the possibility though to keep rendering the main use case of single names in the name tag but to selectively stop rendering the more problematic variants. This is the approach i have taken here.

One development that helped a lot with this is a tagging idea that was suggested a few years back as a solution to the problem of rendering a map for a specific language audience (the labeling strategy of commercial maps: labeling for a specific target map viewer rather than displaying the local name everywhere) and that has meanwhile gained some traction – the default_language tag. Originally meant to record the most likely language of the name tag in the area (hence maintaining the illusion of there being a single local name for everything), it was recently modified by mappers in a way that is fairly close to what i suggested back in 2017. I will get to how i use this information in the following.

Interpreting multilingual compound names

Let's start by explaining the new name rendering logic in the hypothetical case of an English-German multilingual region:

basic name rendering

First of all, plain single names in the name tag continue to be rendered as is. The primary difference (shown in the second line) is that compound names in the name tag alone (without any other name tags containing the individual names) are not shown any more. Compound names are here defined as names that:

  • contain one of ‘ / ', ' - ' or ';' (slash/dash with spaces around or semicolon)
  • contain words (separated by spaces) composed of characters from different scripts

And as you can see below you can have literal semicolons in names by escaping them with double semicolons:

name with escaped semicolon

Remember that we are in a hypothetical English-German multilingual region – but since we have no way to know that so far, it is clear that name:en and name:de do not affect the name rendering; a single name in the name tag has priority.

If, however, components of the compound name tag are separately tagged in name:en or name:de (or any other name:<lang> for that matter) – things change:

multilingual name rendering variants

As you can see, i only render those parts of a compound name that can also be found in the individual name tags.

What you can also see is that the separator used in rendering is a line break. This is for horizontal point labels; for line labels (like on roads, not shown so far) i use ' - '.
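
To make this logic concrete, here is a simplified Python sketch – the function names are invented and the actual implementation is in SQL functions discussed further below; it covers the explicit separators and the semicolon escaping, but not the script-based splitting or the language-based ordering discussed later:

import re

def split_compound(name):
    # protect escaped semicolons (;;) before splitting on the three
    # supported separators: ' / ', ' - ' and ';'
    protected = name.replace(";;", "\x00")
    parts = re.split(r" / | - |;", protected)
    return [p.replace("\x00", ";").strip() for p in parts if p.strip()]

def label_for(name, individual_names, separator="\n"):
    # individual_names: the values of the name:<lang> tags (or, for
    # single language compounds, of official_name, loc_name etc.)
    parts = split_compound(name)
    if len(parts) == 1:
        return parts[0]  # plain single name: rendered as is
    # only those components of a compound name that can also be
    # found in the individual name tags are rendered
    matched = [p for p in parts if p in individual_names]
    return separator.join(matched) if matched else None

print(label_for("Brussel - Bruxelles", {"Brussel", "Bruxelles"}))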

Single language compound names

But what about compound names with components from the same language? Here the logic is very similar, just that the individual name tags interpreted are official_name, loc_name, alt_name and name:right/name:left. name:right/name:left are even shown when there is no name tag.

Compound names in a single language

This rendering logic is not directly supported by established mapping practice; it is an extrapolation of the consensus we have on multilingual names. Because there is – based on established mapping practice – no reliable way to distinguish between compound labels of a single language and multilingual compound labels, the logic has to be the same in both cases. This could change in the future through interpretation of default_language – which i am going to discuss in the following.

Using default_language

So far what i have shown is simply how map rendering can enforce the established convention that the components of a compound name tag should also be separately recorded in individual tags. This neither solves the Han unification problem nor does it provide a path towards a substantially more sustainable practice for name mapping. That is where interpreting the default_language tag comes in.

default_language was originally suggested as a tag to record the most likely language of the name tag within an administrative unit. It is, however, predominantly and increasingly used for tagging the primary or most common language or languages of an administrative unit. I use it in both meanings. There does not appear to be a clear consensus yet on the delimiter in that tag in case of multiple languages; i allow the same delimiters as for the name tag.

As the tag is applied to administrative units and not to individual features, you need to preprocess the administrative boundary data to use that efficiently in map rendering. I have a proof-of-concept process for that as well but have not published it so far because it is very preliminary – you can, however, download the results.

In addition to the default_language tags of the administrative divisions as mapped, de-duplicated in overlaps with priority on the higher admin_level, i also modify the language information in Chinese speaking areas to provide information about simplified vs. traditional Chinese (more on that later):

UPDATE boundaries SET default_language = 'zh-Hans,zh' WHERE default_language = 'zh';
UPDATE boundaries SET default_language = 'zh-Hant,zh' WHERE country = 'tw';
UPDATE boundaries SET default_language = 'zh-Hant,zh' WHERE country = 'hk' AND admin_level = 3;
UPDATE boundaries SET default_language = 'zh-Hant,zh' WHERE country = 'mo' AND admin_level = 3;
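
For the rendering side, determining the applicable default_language for a feature can then be a simple spatial lookup. Here is a minimal sketch – the function name is invented and the geometry column name geom is an assumption; the boundaries table and its other columns are as in the statements above:

-- minimal sketch: the default_language of the smallest enclosing
-- administrative unit (highest admin_level) wins
CREATE OR REPLACE FUNCTION lookup_default_language(feature geometry)
RETURNS text AS $$
  SELECT default_language
    FROM boundaries
    WHERE ST_Contains(geom, feature)
      AND default_language IS NOT NULL
    ORDER BY admin_level DESC
    LIMIT 1;
$$ LANGUAGE sql STABLE;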

Here is how this default_language attribute is interpreted in case of our hypothetical English-German multilingual region:

name rendering with default_language

As you can see the default_language value is, in this case, used for two purposes:

  • to decide on the order of names in rendering.
  • to render name:<lang> for the languages in default_language in cases where there is no name tag.
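
Combining the two uses just listed with the matching sketched earlier, the ordering step could look something like the following – again a simplified Python sketch with invented names, not the actual SQL implementation:

def order_names(matched, default_languages):
    # matched: language code -> name component found in the name
    # tag, e.g. {"de": "München", "en": "Munich"}
    # default_languages: parsed default_language value, e.g. ["en"]
    # languages listed in default_language come first, in their
    # tagged order; the rest follow alphabetically by language code
    first = [l for l in default_languages if l in matched]
    rest = sorted(l for l in matched if l not in default_languages)
    return [matched[l] for l in first + rest]

print(order_names({"de": "München", "en": "Munich"}, ["en"]))
# -> ['Munich', 'München']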

How to write things is as important as what to write

In addition – and this might actually be the more significant part of the change because it affects a large number of people not only in multilingual regions – i use the default_language information to select different fonts in different parts of the world to address the Han unification problem.

A big disclaimer here: I am no expert in the typography of any of the scripts where i vary the font based on language. Whether the style of script actually represents local practice in these regions and whether the change as implemented here leads to better usability of the map is not really clear to me. That is not ultimately what this is about though. What i try to show is the principal feasibility of adjusting the fonts based on language.

Here is how it looks for Chinese, Japanese and Korean (CJK). The sample towns chosen (Chuanchang and Mifune) might not be the best to demonstrate this – but they show the principle. Because the difference might be difficult to see properly at this size, you can click on the sample to get a double resolution version.

Regionally adjusted label styling for Chinese and Japanese

Especially the following things can be observed here:

  • The default rendering is Simplified Chinese. This differs from OSM-Carto, which defaults to Japanese. However, based on number of people affected, Simplified Chinese is clearly the more logical choice. Note the default is only relevant for parts of the world where default_language does not change the script style (i.e. countries outside the CJK regions).
  • For single names in the name tag, like here, default_language overrides the individual name tag matching when determining the language. Otherwise Chuanchang would not be rendered in the Japanese variant with default_language=ja, because there is no name:ja.
  • The tagging of name:zh, name:zh-Hans and name:zh-Hant is redundant in this case because all variants are identical.

Some real world tagging examples from outside the CJK domain:

Non-CJK multilingual names and font style variants

Note in particular: the reason why it is Brussel – Bruxelles without default_language is that the rendering order is alphabetical w.r.t. language code based on the matching with the individual names – and Brussel here matches name:af, which comes before name:fr. default_language helps resolve such ambiguities.

Some comments on the implementation:

Because i implemented this change in name rendering in combination with various other changes of significant impact on the style overall, it is probably a bit difficult to understand by looking at the code changes. The main part of the functionality is implemented in a number of SQL functions you can find in sql/names.sql. In the queries of the style layers this is used like:

  (SELECT
      way,
      name_label[1] AS name,  -- the assembled label text
      name_label[3] AS font,  -- the font class evaluated in the MSS code
      ...
    FROM
      (SELECT
          way,
          carto_label_name(way, name, tags, E'\n') AS name_label,
          ...
        FROM planet_osm_polygon
        WHERE ...) AS _
    WHERE name_label[1] IS NOT NULL  -- only features with a label to render
  ) AS layer_name

And on the CartoCSS level you then use something like:

#layer {
  [zoom >= 14] {
    text-name: "[name]";
    text-face-name: @book-fonts;
    [font = 'jp'] { text-face-name: @book-fonts-jp; }
    [font = 'tc'] { text-face-name: @book-fonts-tc; }
    [font = 'kr'] { text-face-name: @book-fonts-kr; }
    [font = 'ur'] { text-face-name: @book-fonts-ur; }
    [font = 'bg'] { text-face-name: @book-fonts-bg; }
    ...
  }
}

Because of Mapnik’s particularities in interpreting fonts (newer versions of Mapnik seem to try to be over-clever: they guess the language of a string and then select the OpenType locl variant accordingly), the only way to actually control the script style is to use font files that contain only a single set of glyphs and do not offer a choice of several via the locl mechanism. For Noto Sans CJK these single language fonts are readily available from Google; for Bulgarian Cyrillic i had to modify the fonts used (Acari Sans) and remove the Russian Cyrillic variants. See get-fonts.sh for details.

Some potential questions:

That was just a very quick look over this change in name rendering, skipping a lot of details. There are a number of issues i anticipate readers will see, which i will try to briefly discuss here:

The name interpretation logic is complicated, doesn’t that clash with the goal to be intuitively understandable for mappers?

Yes, that is a valid point. Keep in mind, however, that a very simplistic interpretation of name tagging is what got us into the current situation in the first place. The world of geographic names is complicated and this can only be properly dealt with if mappers and data users alike embrace the complexity.

Doesn’t this actively steer mappers into changing their mapping practice?

IMO much less than the current practice of plainly rendering the name tag no matter what. The overall idea of this change is to pick up on the few things where mappers seem to have a clear consensus and to make the necessary extrapolations/extensions of that to create a viable and consistent rendering logic that supports mappers in a consistent mapping practice based on their own consensus decisions. And the rendering logic is deliberately designed to allow adapting to future changes in mapping practice. If default_language, for example, is more widely adopted in the future, it could be allowed to override the name tag (for example if a language in default_language is tagged individually as name:<lang> but not as part of name, or by preferring name:<default_language> over a single name in the name tag). Some decisions that might be considered steering (like allowing default_language on the feature level) are currently in the code mainly to facilitate easier testing.

Wouldn’t this require all data users to use a similarly complex logic?

No, the good thing about this approach is that data users satisfied with the status quo can continue to interpret the name tag as a label tag. The quality of the name tag might further degrade over time, but this is likely to happen anyway. And if mappers embrace the possibilities this approach offers for more cleanly structured name tagging (in whatever specific form they decide on), it will likely become much easier to semantically interpret the name data than it is right now.

Isn’t the logic of splitting compound name tags awfully complicated? Wouldn’t it be much easier to just standardize on semicolon as separator?

I don’t think there is a huge difference between supporting one separator and supporting three. The detection of different scripts to separate compound names without a separator (as is common in particular in Africa) is a different matter. But i am pretty sure that once there is a viable way to get a multilingual name display without an undesired delimiter, the local communities would not be opposed to changing that. For the moment this complexity is there to support all the common multilingual name tagging variants equally.

Isn’t the preprocessed default_language data highly prone to vandalism because a single tag change can massively affect large parts of the map?

If you process the data in a simplistic way (like a process newly run on a daily basis) then yes. But for data not subject to rapid changes (languages of regions do not change from one day to the other) this is a solved problem. You have to apply some change dampening, i.e. roll out a change in tagging into the processed data only once it has proven stable over a longer period (like a week or a month). This is not rocket science, just a bit of manual work.
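
A minimal sketch of such dampening – assuming a hypothetical per-boundary history of the daily extracted values; this is not part of the published process:

def dampened_value(history, min_stable_days=7):
    # history: chronological list of (date, value) pairs, one entry
    # per daily extract for a single boundary
    stable = None                 # last value that survived dampening
    run_start = run_value = None
    for day, value in history:
        if value != run_value:    # value changed: a new run starts
            run_start, run_value = day, value
        if (day - run_start).days + 1 >= min_stable_days:
            stable = run_value    # unchanged long enough: roll it out
    return stable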

Shouldn’t other tags (like brand, operator, addresses) be subject to the same font selection logic as names?

Yes. As said, this is a proof-of-concept demonstration, not a ready-to-use product.

Nice, but where is the point? If this is not included in OSM-Carto it is extremely unlikely that it will have a substantial positive influence on mapping practice.

True. But you have to start somewhere. As said, nothing of substance has changed for the past five years. This is an attempt to provide a perspective how things can change. It is up to others to decide if they support the idea.

Real world examples:

How does this look with real data? Here you go:

In most cases some compound labels are missing because the individual names are not in other name tags or because they are written differently in those. And you can see that in some cases the order of compound labels is different because of the logic described above.

January 19, 2023
by chris
2 Comments

Names and labeling in OpenStreetMap

More than five years ago i wrote a blog post here on the problem of name tagging and multilingual names in OpenStreetMap. I want to give an update on the situation from my perspective and reflect a bit on what has changed since then and what has not. Also, this is meant to serve a bit as an introduction to a following blog post i plan about changes to the rendering of labels in my map style.

What i explained five years ago is essentially that the primary way through which mappers in OpenStreetMap record names, the name tag, has been central in establishing the idea of OpenStreetMap recording the local knowledge of its contributors because it is, by broad consensus, explicitly meant to record the locally used name of geographic features. Since every mapper anywhere on the planet does this in exactly the same way and sees this local name directly on the map as they have recorded it, this practice well represents the idea of jointly mapping the planet in all its diversity.

On the other hand – what i also mentioned already five years ago – this practice unfortunately comes with various problems:

  • it is based on the illusion that there is always a single local name.
  • recording the local name without any information on the language of that name creates serious issues in some cases (in particular in what is commonly known as the Han unification problem).
  • what mappers want to see as the label of features in the map very often is not the locally used name.

The solution from the tagging side that i proposed back then as well as various other attempts to suggest tagging ideas to address the first two of these problems did not find broader support. Both mappers and data users seemed mostly content with the status quo.

The situation today

In a nutshell: The situation today and the problems described are essentially unchanged. The problems that were already visible back in 2017 have, however, worsened, and it is now quite clear that the current practice is not sustainable. Trying to categorize the use of the name tag today, i arrive at the following:

The vast majority of name tags record a single name. In most cases this is the locally used name. However, there is also a significant percentage of cases where non-local mappers record names they use or are familiar with in the name tag despite them obviously not being the locally used name (like because they are in a language not widely spoken locally). In other cases local mappers use a signed or official name (i.e. one endorsed by some sort of authority, either political authority or the owner of the feature in question) rather than the name used locally in practical communication.

All of these variants of recording a single name in the name tag that is not the locally used name are largely motivated by the fact that many data users either only interpret the name tag directly as a label tag or give it priority over other tags with names when selecting what to label. This incentivizes mappers to record the name they want to see on the map – which, depending on the individual mapper and the local cultural conventions in map design, can mean different things.

The second most common use case for the name tag, after recording a single name of some kind, is to record something that is not a name at all. This practice is partly fueled by the same mechanism i already discussed. The other reason we see this is that the concept of names (or more precisely: proper names) is a highly abstract matter that quite a few mappers have difficulties with. A significant percentage of uses of the name tag for things that are not a proper name are cases where mappers record a real world label that does not show a proper name of the feature. Often this is a brand (like in the case of shops etc.), the name of a person operating a feature (which would usually be tagged with operator), address components or a classification (think of a playground with a sign indicating use restrictions – Playground: use only allowed for children of age under 14 – where the mapper interprets the Playground as the name of the feature).

Multiple names

The third most common use case for the name tag – and we are down to a single digit percentage here, though for specific feature types and in some regions this is much larger – is the recording of multiple names in the name tag. This, likewise, is largely motivated by data users showing the name tag as is as a label and mappers in some cases seeing the display of several names in the map as the most desirable form of labeling.

There are two fundamentally different variants of this recording of multiple names in the name tag. One (and the vastly more widespread one of the two) is the recording of names in different languages. If there is not a single locally used language but several ones, a feature will usually have different names in these different languages and accordingly no single locally used name. The other (much less common) variant is when there is more than one locally used name in the single locally used language and it is not possible to verifiably determine which is the locally more widely used name.

I would like to add that for both variants there is also the rather common case that a single locally most widely used name can be determined, but mappers – for social or political reasons – do not want to specify that and prefer to record several names as if they were equally widely used locally, even if they are not.

There has been some renewed interest more recently in the OSM community on the subject of recording multiple names in the name tag but most of the discussion dealt with the rather superficial and insignificant question how to separate the different names in that case. This is kind of like planning to pave over the cracks and discussing what shape of paving stones to use for that purpose.

In practical use the most common separators for multilingual compound names are ' / ' and ' - ' (that is slash or dash/minus with spaces around) and – for compound names where the components are distinguished through different scripts – it is also common to not have a separator at all (using just a space, which equally is used between different words of the individual names). The semicolon (;) has some use, but mostly in non-multilingual situations, in particular in cases where the different names refer to different real world features.

Despite all these inconsistencies and conflicting mapping practices there is one broad consensus about multilingual names: If there are names of multiple languages in compound name tags, each of the components should be also separately recorded in the respective name:<lang> tag. While this rule is not universally followed, there is clearly broad consensus that this is desirable.

The dilemma in map rendering

The core of the problem on the social level (and the reason why nothing of substance has happened in this field for the last five years) is that the illusion of the name tag recording the single locally used name is rather convenient for data users, because it allows them to label their map in a very simplistic fashion. At the same time, the ability to directly control the labels in maps is something mappers have come to value, and many mappers do not have the confidence

  • in data users to be able and willing to interpret more cleanly structured information in a way that results in good maps.
  • in their fellow mappers to diligently record more cleanly structured information on names (and the inconsistent use of the name tag is directly confirming that suspicion).

What i (and others) have tried in the past is presenting ideas for how the problem can be addressed from the mapping side. But neither data users nor mappers turned out to be very keen on changing the status quo.

Also – both on the mapper and on the data user side – the most influential circles in the OSM community are from Western Europe and North America and primarily interested in languages using Latin script – hence have only little interest in solving the problems the lack of information on the language(s) of the name tag causes.

The dilemma for map designers, in particular in OSM-Carto, is that it is by now completely clear that the simplistic rendering of name tags as labels for many features in OpenStreetMap is a large contributor to the bad situation outlined above. In other words: we as map designers are part of the problem, and it would therefore in a way be prudent to stop doing that. But since we want – for the reasons mentioned above and in the blog post from 2017 – to continue rendering local names in the local language everywhere on the planet, and because there is no established way of recording that information in OSM other than the name tag, we have no way to really do that other than displaying the name tag. And we do not want (for very good reasons i explained recently again) to actively steer mappers to change their mapping practice in a way we consider desirable.

One approach to try to overcome this stalemate between mappers and data users could be to present a working proof of concept showing both the benefits of actually solving the problem (instead of just paving over the cracks a bit – see above) and how a smooth transition from the status quo to such a solution could look. And above all: to show that following such a route does not require mappers to relinquish control over mapping practice to a small elite of technical gatekeepers, but would allow mappers to continue making self determined decisions – and to present a solution that respects and builds on the few points described where mappers actually do have consensus about how names should be mapped. This is what the next blog post is going to be about.