The last few days saw releases of my Osmium library (2.15.0), the Python bindings for Osmium (PyOsmium, 2.15.0), the Osmium command line tool (1.10.0), and the OSMCoastline (2.2.1) program. In this blog post I want to highlight some of the changes.
As usual the Osmium library underlying all this saw some minor fixes and a little bit of added functionality. But there is one important change: It uses a lot less memory now when reading files. The new code is better at filling buffers to capacity, so there is less empty space in the buffers. For XML or OPL files the changes are not that big, but for PBF files, it can amount to 50% space savings! In typical programs this doesn’t matter all that much, because they need a lot of memory for other stuff, too. But there are some use cases where overall memory use is almost half of what it was before.
PyOsmium has, of course, all the goodies from libosmium. And it saw its own big change: Instead of building on Boost.Python, it now uses pybind11 internally to connect the C++ Osmium library with the Python world. This makes the code much cleaner and simpler and will make PyOsmium easier to extend in the future. Thanks, Sarah, for your work on PyOsmium!
The Osmium command line tool got some interesting new options. The sort command has a new option to do the sorting in three passes instead of one, taking a bit more time but reducing its memory use. The tags-filter command has a new option --remove-tags/-t. When this option is used, objects that do not match the filter themselves and are only included because a matching object references them have all their tags removed. So if you are filtering for w/highway, you get the way objects with a highway tag plus all nodes needed for those ways, but all tags from those nodes are removed.
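To make the effect of --remove-tags more concrete, here is a small Python sketch of the idea. It is not how Osmium implements it and it does not use PyOsmium; the function name and the plain-dict data model are made up for illustration, and the filter is simplified to "ways that have a given tag key".

```python
# Illustrative sketch only: mimics the *effect* of filtering for ways with a
# given key while stripping tags from nodes that are only included as
# referenced objects. The data model and names are hypothetical.

def filter_ways_remove_tags(nodes, ways, key):
    """nodes: {node_id: tags_dict}
    ways:  {way_id: (list_of_node_ids, tags_dict)}
    Returns (kept_nodes, kept_ways)."""
    # Ways match the filter if they carry the key; they keep all their tags.
    kept_ways = {wid: way for wid, way in ways.items() if key in way[1]}

    # Nodes are only included because a matching way references them.
    referenced = {nid for node_ids, _ in kept_ways.values() for nid in node_ids}

    # Referenced nodes are kept, but with an empty tag set.
    kept_nodes = {nid: {} for nid in nodes if nid in referenced}
    return kept_nodes, kept_ways


nodes = {
    1: {"highway": "traffic_signals"},
    2: {},
    3: {"amenity": "cafe"},
}
ways = {
    10: ([1, 2], {"highway": "residential"}),
    11: ([3], {"building": "yes"}),
}
kept_nodes, kept_ways = filter_ways_remove_tags(nodes, ways, "highway")
```

In this toy data, way 10 survives with its tags, nodes 1 and 2 come along without any tags (including the traffic signal tag on node 1), and way 11 and node 3 are dropped.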
Osmium has supported creating geographical extracts of OSM data for a while, used for instance when you are interested only in the data for one country and not the whole planet. It is not always clear what is supposed to happen with data straddling the boundary of the region you are cutting out, especially when there are relations referencing objects inside and outside. There are several “strategies” implemented in Osmium that have different results. In the new Osmium version the “smart” strategy got a new option “complete-partial-relations=X”. When this is used, all relations that have X percent or more of their members already in the extract are included completely in the extract. This can be useful, for instance, to make sure that the boundary relation of a country is always in the extract, even if a slightly wrong cutting polygon was used that left some of it outside. The way this had to be done before (with the “types=boundary” option) gave you the complete boundary relation but also included all boundary relations of neighboring countries. Of course this is not a perfect measure. We will see how well it works in practice.
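The rule behind “complete-partial-relations=X” can be sketched in a few lines of Python. This is only an illustration of the threshold check described above, not Osmium's code. The function name is made up, and how Osmium counts members in detail (duplicates, member types, empty relations) is an assumption here.

```python
# Illustrative sketch only: decides whether a relation would be completed,
# given which of its members are already in the extract. Names and the
# exact counting rules are assumptions, not taken from Osmium.

def should_complete(member_ids, ids_in_extract, threshold_percent):
    """Return True if at least threshold_percent of the members are
    already in the extract."""
    if not member_ids:
        return False  # assumption: a relation without members is never completed
    present = sum(1 for member in member_ids if member in ids_in_extract)
    # Integer comparison avoids floating point rounding around the threshold.
    return present * 100 >= threshold_percent * len(member_ids)


boundary_members = list(range(1, 11))      # a relation with 10 members
extract_ids = set(range(1, 10))            # 9 of them are inside the extract

nearly_complete = should_complete(boundary_members, extract_ids, 90)
neighbour = should_complete(boundary_members, {1, 2}, 90)
```

With a threshold of 90, the relation that already has 9 of its 10 members in the extract is completed, while a neighbouring relation that only shares 2 members is not. That is the difference from “types=boundary”, which pulls in both.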
The last major addition in Osmium is the new “pg” export format. It allows exporting OSM data in the format used by the PostgreSQL COPY command. Reading this data into a PostgreSQL/PostGIS database is then easy: just a CREATE TABLE and a COPY command and you are done. This is not intended as a replacement for osm2pgsql or other “real” database import tools. It isn’t flexible enough (no tag transformations, no support for different coordinate systems, …) and, crucially, it doesn’t support updates. But for simple projects it can be quite useful, especially when you have already filtered the data with osmium tags-filter or a similar tool.
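If you have never looked at the text format that PostgreSQL COPY reads, the following Python sketch shows its basic shape: one row per line, fields separated by tabs, backslash escapes for special characters, and \N for NULL. This is a generic illustration of the COPY text format, not a description of the exact columns Osmium writes; the helper names and the example fields are made up.

```python
# Illustrative sketch only: formats values as one row of PostgreSQL COPY
# text format (default delimiter and NULL marker). The example fields are
# hypothetical and do not reflect Osmium's actual output columns.
import json

_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}


def copy_field(value):
    """Escape a single field; None becomes the NULL marker \\N."""
    if value is None:
        return "\\N"
    return "".join(_ESCAPES.get(ch, ch) for ch in str(value))


def copy_row(fields):
    """Join escaped fields with tabs and terminate the row with a newline."""
    return "\t".join(copy_field(field) for field in fields) + "\n"


# A hypothetical row: some geometry placeholder text and the tags as JSON.
row = copy_row(["POINT(8.4 49.0)", json.dumps({"highway": "residential"})])
row_with_null = copy_row(["a\tb", None])
```

A file of such rows can be loaded with a single COPY statement into a table whose columns match the fields, which is what makes this kind of export convenient for simple projects.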
Last but not least, OSMCoastline got some bug fixes for obscure corner cases. Most of them never happened in real data, but one actually crashed OSMCoastline recently after somebody added some broken coastline data.
As usual, packages for some Linux distributions and Homebrew are already available or will be available shortly. I’d love to get some feedback from you on how these new features (or old ones for that matter) work for you and what else you want to see in the Osmium suite of software.
Tags: openstreetmap · osmium