[JT] Jochen Topf's Blog
Wed 2017-08-30 13:25

Osmium releases

Recently I released new versions of the Osmium library (libosmium version 2.13) and the Osmium command line tool (version 1.7) with some major improvements I want to talk about.

Deprecated old-style multipolygons

Old-style multipolygons (with the tags on the outer ways instead of on the relations) are basically gone. (I wrote about this in my last blog post.) So, by default, the new Osmium version doesn’t handle old-style multipolygons any more. The data is interpreted according to the current tagging scheme: all tags on the relation stand for the whole multipolygon, and tags on member ways stand for those ways only.
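To make the difference between the two tagging schemes concrete, here is a minimal Python sketch. It is not libosmium code: the function name `area_tags`, the `old_style` switch, and the rule "use the tags shared by all outer ways when the relation carries nothing but a `type` tag" are simplifying assumptions made for illustration only.

```python
def area_tags(relation_tags, outer_way_tags, old_style=False):
    """Return the tags describing a multipolygon area (illustrative only).

    relation_tags:  dict of tags on the multipolygon relation
    outer_way_tags: list of dicts, one per outer member way
    old_style:      hypothetical switch emulating the legacy interpretation
    """
    # Current scheme: the relation tags describe the whole multipolygon,
    # tags on member ways are never consulted.
    if not old_style:
        return dict(relation_tags)

    # Legacy scheme (simplified assumption): only fall back to the outer
    # ways if the relation has no tags of its own besides "type".
    has_own_tags = any(key != "type" for key in relation_tags)
    if has_own_tags or not outer_way_tags:
        return dict(relation_tags)

    # Use the tags that all outer ways have in common.
    first, *rest = outer_way_tags
    return {k: v for k, v in first.items()
            if all(way.get(k) == v for way in rest)}
```

With the default `old_style=False`, a relation tagged only `type=multipolygon` yields an area without any feature tags, no matter what is on its outer ways.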

If you need old-style multipolygon support, you can still use the legacy code. But the only case I can think of where this makes sense is working with historical OSM data.

If you have programs using the multipolygon code in libosmium, you should be able to just recompile and get the new behaviour.

Relation handling

Triggered by the changes in the multipolygon handling and some long-standing issues with the existing relation handling, I made a lot of changes under the hood to make handling of OSM relations more flexible.

Instead of the relations::Collector class there is now the new RelationManager class, which does more or less the same but with a slightly different interface here and there to make it easier to use. This class can be the basis for handling any kind of relation type (such as route relations or turn restrictions).

If the RelationManager class is not flexible enough for what you need doing, you can use the new ItemStash, RelationsDatabase, and MembersDatabase classes which help you keep OSM objects in memory and keep track of which member belongs to which relation.
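The bookkeeping problem these classes address can be sketched in a few lines of Python. This is a conceptual illustration only, not the libosmium API: the class `MemberIndex` and its methods are hypothetical names, and the real ItemStash, RelationsDatabase, and MembersDatabase classes have their own interfaces.

```python
from collections import defaultdict


class MemberIndex:
    """Track which stored member objects belong to which relations.

    Hypothetical illustration; all names here are invented for the sketch.
    """

    def __init__(self):
        self._missing = {}                 # relation id -> members not seen yet
        self._wanted = defaultdict(list)   # member key -> ids of relations using it
        self._stash = {}                   # member key -> stored object

    def add_relation(self, relation_id, members):
        """Register a relation and the (type, id) keys of its members."""
        unique = set(members)
        self._missing[relation_id] = unique
        for key in unique:
            self._wanted[key].append(relation_id)

    def add_member(self, key, obj):
        """Keep obj if some relation needs it.

        Returns the ids of the relations that became complete.
        """
        if key not in self._wanted:
            return []                      # nobody is interested, don't store it
        self._stash[key] = obj
        completed = []
        for relation_id in self._wanted[key]:
            missing = self._missing[relation_id]
            missing.discard(key)
            if not missing:
                completed.append(relation_id)
        return completed

    def get(self, key):
        """Return the stored object for a member key, or None."""
        return self._stash.get(key)
```

The sketch reads all relations first and then feeds in candidate members. Objects that no relation references are never stored, which is what keeps memory use in check.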

Flexible index type for node location store

Assembling way geometries or multipolygons efficiently from their constituent parts needs some way of storing the node locations in memory and making them available to the ways and relations which only contain the IDs of their member nodes, but don’t have a location by themselves.

Osmium has long offered index classes to do this for you, but there was always the problem that you had to use a different index class depending on whether you were working with small extracts or the whole planet. This is confusing for inexperienced users, who often choose the wrong one and then see bad performance or run out of memory. The new Osmium version has a new flex_mem index, which is autosizing: regardless of the size of the input file, the code will automatically choose the index strategy it thinks is best for your data. There is a little bit of overhead in some cases, so if you know what you are doing (and have benchmarked your use case), you can still use the old indexes. But I recommend the new one as the default now for all cases.
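The idea of an autosizing index can be illustrated with a small Python sketch. It makes assumptions this post does not confirm: that the index starts with a sparse structure and switches to a dense array once it holds many entries, and that a simple entry count is the trigger. The class name `AutoSizingIndex` and the `dense_threshold` parameter are hypothetical; the actual flex_mem implementation may decide differently.

```python
class AutoSizingIndex:
    """Node ID -> location store that switches strategy on its own.

    Illustration only; assumes non-negative node IDs.
    """

    def __init__(self, dense_threshold=1000):
        self._sparse = {}        # used while there are few entries
        self._dense = None       # list indexed by node ID once we switch
        self._threshold = dense_threshold

    @property
    def is_dense(self):
        return self._dense is not None

    def set(self, node_id, location):
        if self._dense is not None:
            self._set_dense(node_id, location)
            return
        self._sparse[node_id] = location
        # Assumed trigger: too many entries for the sparse strategy.
        if len(self._sparse) > self._threshold:
            self._switch_to_dense()

    def get(self, node_id):
        """Return the stored location or None if the ID is unknown."""
        if self._dense is None:
            return self._sparse.get(node_id)
        if node_id < len(self._dense):
            return self._dense[node_id]
        return None

    def _set_dense(self, node_id, location):
        # Grow the array so that node_id is a valid position.
        if node_id >= len(self._dense):
            self._dense.extend([None] * (node_id + 1 - len(self._dense)))
        self._dense[node_id] = location

    def _switch_to_dense(self):
        # Move everything collected so far into the dense array.
        self._dense = []
        for node_id, location in self._sparse.items():
            self._set_dense(node_id, location)
        self._sparse = None
```

The caller only ever uses `set()` and `get()` and never has to know which strategy is active, which is the point of an autosizing index.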

Note that, in any case, the flex_mem index will store all data in memory, which can need up to 40 GByte with current OSM data. I am thinking about having an index type that will go to disk if necessary, but that needs some more thought and careful benchmarking.
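A back-of-the-envelope calculation shows where a number of that size can come from. The sketch assumes a dense array indexed by node ID, 8 bytes per location (two 32-bit integer coordinates), and node IDs reaching roughly 5 billion. These figures are assumptions for the arithmetic, not measurements.

```python
BYTES_PER_LOCATION = 8  # assumption: two 32-bit integers (longitude, latitude)


def dense_index_gbytes(highest_node_id):
    """Approximate size in GByte (10^9 bytes) of an array with one slot per node ID."""
    slots = highest_node_id + 1          # IDs 0..highest_node_id
    return slots * BYTES_PER_LOCATION / 1e9


# Assumed highest node ID of roughly 5 billion.
planet_estimate = dense_index_gbytes(4_999_999_999)
```

Under these assumptions `planet_estimate` comes out at 40.0, and the requirement grows linearly as new node IDs are assigned.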

Export function in command line tool

There are some minor changes to the command line tool, too. But the interesting addition is an export command which allows exporting OSM data into GeoJSON format.

The OSM data model with its nodes, ways, and relations is very different from the data model usually used for geodata with features having point, linestring, or polygon geometries (or their cousins, the multipoint, multilinestring, or multipolygon geometries).

The export command transforms OSM data into a more common GIS data model. Nodes will be translated into points, and ways into linestrings or polygons (if they are closed ways). Multipolygon and boundary relations will be translated into multipolygons. This transformation is not lossless; in particular, information in relations that are neither multipolygon nor boundary relations is lost.
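The mapping for nodes and ways can be sketched in Python as follows. This is not the code of the export command: the function names are hypothetical, the rule "a closed way becomes a polygon" is taken literally from the description above, and details such as ring winding order or relation handling are left out.

```python
def node_to_feature(lon, lat, tags):
    """Turn a tagged node into a GeoJSON Point feature."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": dict(tags),   # all tags are kept as properties
    }


def way_to_feature(coords, tags):
    """Turn a way into a LineString, or a Polygon if it is closed.

    coords is a list of [lon, lat] pairs in way order.
    """
    # A GeoJSON polygon ring needs at least four positions and
    # identical first and last positions.
    closed = len(coords) >= 4 and coords[0] == coords[-1]
    if closed:
        geometry = {"type": "Polygon", "coordinates": [coords]}
    else:
        geometry = {"type": "LineString", "coordinates": coords}
    return {"type": "Feature", "geometry": geometry, "properties": dict(tags)}


def feature_collection(features):
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}
```

Passing the result of `feature_collection()` to `json.dumps()` from the standard library gives a GeoJSON document that most GIS software can read.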

All tags are preserved in this process. Note that most GIS formats (such as Shapefiles) do not support arbitrary tags. Transformation into other GIS formats will need extra steps mapping tags to a limited list of attributes. This is outside the scope of this command.
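Such an extra step could look like the following Python sketch, which reduces arbitrary tags to a fixed list of attribute columns. The function name and the choice of columns are hypothetical. The export command itself does nothing like this.

```python
def tags_to_attributes(tags, columns):
    """Map a free-form tag dict onto a fixed list of attribute columns.

    Tags not listed in columns are dropped; missing ones become None.
    """
    return {column: tags.get(column) for column in columns}


# Example column list chosen for illustration only.
ROAD_COLUMNS = ["highway", "name", "surface"]
```

Every feature ends up with the same set of attributes, which is what formats with a fixed schema need, at the price of losing all tags outside the list.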

I am thinking about offering more export formats, but GeoJSON is the only common one that allows any number of tags on a feature. Let me know if you have any ideas here.

Tags: geojson · openstreetmap · osmium