[JT] Jochen Topf's Blog
Fri 2014-09-19 00:17

Taginfo Integrates More Data Sources

Taginfo has the “mission” of aggregating all available information about OSM tag usage and presenting it to users. The main sources of information are the OSM database and the documentation in the OSM wiki. But there are many more sources of information to be tapped. Everybody who creates a map or writes presets for an editor, everybody who writes a data export program or a routing application, has to decide which tags to use and in what way. All this is useful information, but it is hard to come by. Everybody uses different programming languages and different config file formats. And all this “stuff” is spread out over many software repositories and web sites.

For a long time my plan was to somehow collect this information and bring it into taginfo. But it is a daunting task and I only got as far as parsing JOSM style config files and bringing the result into taginfo. I realized over time that this approach doesn’t scale. I can’t parse everybody’s config files (or look at the source code of all these projects) to see which tags are used and how. And I can’t keep up with the changes in all those projects.

So I changed the approach: Instead of taginfo having to understand all the different formats, I created one simple format in which all projects can report to taginfo which tags they are using. Each project generates a .json file, either automatically or manually (or both), that taginfo pulls in daily and integrates into its database. This way the work is distributed over more shoulders. I still have to integrate this data and make it show up in taginfo, but others do the work of getting the data from their project into the common format.

The format of these project files is quite simple: There is a header with information about the project itself, such as a name, a description, URLs of the project, its documentation, and an icon. And then there is a list of keys and tags, each with an optional description, icon, and a URL with further documentation. This cannot capture all the information we might want, but it is a start. We’ll see over time what else we need and how best to extend this format. We’ll probably want some kind of grouping or keyword feature, for instance, so that you can see, say, all tags used by the different routing engines.
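To make the header-plus-tag-list shape concrete, here is a minimal sketch in Python that builds such a file. The field names (data_format, project_url, doc_url, icon_url and so on) are assumptions chosen for illustration, and "Example Router" is a hypothetical project; the documentation on the wiki is the authority on the real names and on which fields are required.

```python
import json


def build_project_file(name, description, project_url, tags,
                       doc_url=None, icon_url=None):
    """Assemble a project description: a header plus a list of keys/tags.

    All field names here are illustrative assumptions, not a specification.
    """
    project = {
        "name": name,
        "description": description,
        "project_url": project_url,
    }
    # Optional header fields are only written when they are supplied.
    if doc_url is not None:
        project["doc_url"] = doc_url
    if icon_url is not None:
        project["icon_url"] = icon_url
    return {"data_format": 1, "project": project, "tags": tags}


example = build_project_file(
    name="Example Router",  # hypothetical project
    description="Hypothetical routing engine, used only for illustration.",
    project_url="https://example.org/router",
    tags=[
        # An entry with only a key describes usage of the key as a whole.
        {"key": "maxspeed", "description": "Used to estimate travel times."},
        # An entry with key and value describes one specific tag.
        {"key": "route", "value": "bicycle",
         "description": "Used to prefer signed cycle routes."},
    ],
)

project_json = json.dumps(example, indent=2)
print(project_json)
```

A project that already keeps its tag list in some config file could generate this output from that file in its build step, which is the "automatically" case; writing the JSON by hand works just as well for small projects.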

But for now we start with the simple approach. Today I launched the new version of the taginfo software. To try it out, head over to the projects page, which lists all the projects that already supply (some of) their tag usage. Or go to any key or tag page and choose the “Projects” tab (examples: maxspeed, route=bicycle).
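The idea behind a “Projects” tab can be sketched as a reverse lookup from a key or tag to the projects that report using it. The following Python snippet is not how taginfo itself does it; it is only an illustration, and the project names, field names, and the helper functions index_projects and projects_using are all made up for the example.

```python
from collections import defaultdict


def index_projects(project_files):
    """Map (key, value) to the names of the projects reporting that entry.

    A value of None stands for an entry that names only a key.
    """
    index = defaultdict(list)
    for project_file in project_files:
        name = project_file["project"]["name"]
        for tag in project_file.get("tags", []):
            entry = (tag["key"], tag.get("value"))
            if name not in index[entry]:  # ignore duplicate entries
                index[entry].append(name)
    return index


def projects_using(index, key, value=None):
    """Return the sorted project names for a key, or for a key=value tag."""
    return sorted(index.get((key, value), []))


# Two hypothetical project files, reduced to the fields used here.
files = [
    {"project": {"name": "Example Router"},
     "tags": [{"key": "maxspeed"}, {"key": "route", "value": "bicycle"}]},
    {"project": {"name": "Example Map"},
     "tags": [{"key": "route", "value": "bicycle"}]},
]

index = index_projects(files)
print(projects_using(index, "maxspeed"))          # ['Example Router']
print(projects_using(index, "route", "bicycle"))  # ['Example Map', 'Example Router']
```

In this sketch a key-only lookup returns only projects that listed the bare key; whether a key page should also show projects that listed specific values of that key is a design choice left open here.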

In the last few weeks I asked a few people responsible for a range of quite different projects to supply information about their tag usage. Many thanks to the beta testers who helped me get this off the ground: John Firebaugh of iD, the JOSM maintainers, Sarah Hoffman (who supplied the Nominatim and Waymarked Trails data), Dennis Luxen (OSRM), Frederik Ramm (OSM Inspector), and Simon Poole (Vespucci). I myself supplied the data for OSMCoastline and, most importantly, for the OpenStreetMap Cheat Mug.

Of course this is only the beginning. There are a lot more projects out there, lots of programs and lots of maps that use OSM tags. If you are the maintainer of one of those projects, consider creating a project file and submitting it to taginfo. The documentation of the format and instructions for submitting are on the wiki.

Tags: openstreetmap · taginfo