[JT] Jochen Topf's Blog
Thu 2022-11-03 18:17

Generalization of OSM data

OpenStreetMap data is often incredibly detailed and the generated maps look amazing. But there is a problem for maps at lower zoom levels (smaller scales): There is often too much detail. Maps become slow to render and cluttered, and the important information is hard to see. To solve this, the map data needs to be generalized.

Map (or cartographic) generalization isn’t something new. Originally map makers did this manually. Instead of the detailed outline of a city, they just put a dot on the map; the many curves in a mountain road become a few larger curves; the details of fields and orchards and forests with their types of trees become just one homogeneous green area.
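To make the "many curves become a few larger curves" case concrete, here is a minimal sketch of Douglas-Peucker line simplification, one well-known algorithm for this kind of task. It is only an illustration of the idea, not the method used in any particular OSM tool; the function names, the sample road and the tolerance value are all made up for this example, and the coordinates are assumed to be planar (already projected).

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop points closer than `tolerance` to the
    straight line between the end points, recursing on the rest."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Everything in between is "close enough": keep only the ends.
        return [first, last]
    # Keep the farthest point and simplify both halves separately.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical road with some small wiggles and one real bend.
road = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
simplified = simplify(road, 0.5)
print(len(road), "->", len(simplified), "points")
```

A larger tolerance removes more detail, which is roughly what you want as the zoom level gets smaller. Real roads need more than this (keeping junctions connected, for instance), which is part of why generalization is hard.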

Today map generalization is, of course, often done automatically. And OSM maps have been using some form of generalization for years now. But not as much as we really need. The reason is that automatic generalization is difficult and often a very slow process. We need to crunch quite a lot of data and have to re-do this with every change. The algorithms involved are not widely known and there aren’t that many implementations. And for many types of features on the map we don’t even have an algorithm that works sufficiently well and sufficiently fast on real-world data.

The whole issue of generalization has become even more important in recent years with the switch from raster maps to vector-based maps. Rendering thousands of forest polygons into a raster tile is slow, but it is possible. But adding all those forest polygons to a vector tile at a small zoom level would make the tile far too big to be usable.

Since September I have been working on a project to add generalization support to osm2pgsql. The project is funded by the German Federal Ministry of Education and Research (via the Prototype Fund). The financial support over six months allows me to spend a considerable amount of time on this project, which wouldn’t otherwise be possible!

There are two parts to this project: The first part is thinking about generalization in general, looking at the different challenges when trying to generalize different types of data, finding existing algorithms and existing implementations, and making them work in the context of OSM data. The second part is to fit everything into some kind of framework that can be implemented in or with osm2pgsql, that makes generalization easier and faster and that, most crucially, supports updates. OSM data changes all the time and we want to generate generalized data as updates happen instead of only once per month or so. (This will probably not be possible for all types of generalization, but we’ll see how far we can get.)

Osm2pgsql is the perfect vehicle for this work, because it already interfaces with the PostgreSQL/PostGIS database, which has many functions that are very useful for generalization. In fact, most generalization these days is done by writing some SQL scripts that are run after a data import. So there is some prior work out there and most people creating maps have a trick or two up their sleeve for doing generalization. But there is no Open Source software (that I am aware of) that does that incrementally and for real-world data.

Creating good and fast generalization is a huge undertaking. This project can’t solve all problems and create a perfect map. But that’s not the goal. I want to get some groundwork done in osm2pgsql, so that it can track changes and trigger further processing steps where needed. Some processing functions can be added to the osm2pgsql core; for other things I intend to use the power of PostGIS but hopefully make this functionality easier to access from osm2pgsql. When the project is done, it should be easier to add your own generalization functions for specific feature types.
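As a rough illustration of what "track changes and trigger further processing" could mean, here is a small sketch that maps changed locations to map tiles and re-runs a processing step only for those tiles. This is not the actual osm2pgsql design, just one way to picture the idea; all function names are hypothetical, and the only thing borrowed from the real world is the usual Web Mercator tile numbering scheme.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Tile (x, y) containing a WGS84 position, using the usual
    Web Mercator tile numbering scheme."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that positions on the outer edge still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def dirty_tiles(changed_points, zoom):
    """Set of tiles touched by a list of changed (lon, lat) positions."""
    return {lonlat_to_tile(lon, lat, zoom) for lon, lat in changed_points}


def regeneralize(tiles, process_tile):
    """Run a (hypothetical) generalization step once per dirty tile."""
    for x, y in sorted(tiles):
        process_tile(x, y)


# Two edits close to each other and one somewhere else entirely.
changes = [(8.40, 49.00), (8.41, 49.01), (-73.98, 40.75)]
tiles = dirty_tiles(changes, zoom=10)

processed = []
regeneralize(tiles, lambda x, y: processed.append((x, y)))
print(len(changes), "changes ->", len(processed), "tiles to re-process")
```

The point is that nearby edits collapse into the same tile, so the expensive step runs once per affected area instead of once per change or over the whole planet.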

Ultimately the goal is to show that automatic generalization is possible, create more interest in the topic, spur some discussions and move the whole subject forward a little bit.

In further blog posts I will write some more about my work on this project. Stay tuned.

Tags: generalization · openstreetmap