In Choosing a Language I proposed looking into how labels in different languages could be rendered into the tiles on the fly, at the moment they are requested.
Our tile rendering process is too slow to render a whole tile on demand. Tiles sometimes take several hundred seconds to render. It would be interesting to optimize the process so that a whole tile can be rendered on demand, but that’s a bit out of scope for the multilingual maps project and too difficult to achieve for the moment. It might just be possible, though, to pre-render base map tiles and then overlay the labels whenever a tile is accessed.
Why might this problem be solvable when on-demand rendering of the whole tile is not? The difference is in the amount of data that has to be taken into account. Most OSM objects do not have a name. It takes a while, for instance, to get all building polygons out of the database and render them, but buildings usually don’t have names, so I don’t have to take them into account for the labelling. There are only about 35 million names in the OSM database (counting the name, int_name, and alt_name tags and the many name:language tags). That’s not much compared to the more than 1.6 billion objects in the database. On the other hand, rendering labels is relatively expensive because they are not allowed to overlap.
I see two different ways this could be handled. We could hold all the information needed for the labels in a separate database and, whenever a tile is accessed, get this information out and render it on top of the tile. Or we could keep the label information together with the pre-rendered base tile: the pre-rendering process decides which labels might be needed for which tile and stores this information as vector data together with the tile.
The database for the first approach would have to fit into memory, otherwise it would never be fast enough. We need to deliver thousands of tiles per second, so this process has to be really fast. How would this database be structured? The simplest way would be to use the same structure as the main rendering database, but leave out everything that doesn’t have a name. This way it is easy to use the existing map styles. They just have to be split in two: one for the pre-render and one for the label overlay. The database would be updated together with the main rendering database whenever the OSM data changes. This could still lead to problems, because a pre-rendered tile might be older than the label data and the two might no longer fit together.
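To make the first approach a bit more concrete, here is a minimal sketch of such a names-only store held in memory. It is only an illustration under my own assumptions: the class LabelIndex, its methods, and the bucketing by tile coordinate are hypothetical and not part of any existing rendering stack, and it only handles point objects. The tile arithmetic is the usual Web Mercator tile numbering.

```python
import math
from collections import defaultdict

# Tag keys that count as "names" (the post mentions name, int_name,
# alt_name and the name:<language> tags).
NAME_KEYS = ("name", "int_name", "alt_name")


def is_name_key(key):
    return key in NAME_KEYS or key.startswith("name:")


def lonlat_to_tile(lon, lat, zoom):
    """Standard Web Mercator tile numbering for a lon/lat position."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


class LabelIndex:
    """Hypothetical in-memory store that keeps only named objects."""

    def __init__(self, zoom):
        self.zoom = zoom
        self.buckets = defaultdict(list)  # (x, y) -> [(lon, lat, names)]

    def add(self, lon, lat, tags):
        names = {k: v for k, v in tags.items() if is_name_key(k)}
        if not names:
            return False  # unnamed objects are left out entirely
        self.buckets[lonlat_to_tile(lon, lat, self.zoom)].append((lon, lat, names))
        return True

    def labels_for_tile(self, x, y, lang):
        """Return (lon, lat, text) for one tile, falling back to plain name."""
        result = []
        for lon, lat, names in self.buckets.get((x, y), []):
            text = names.get("name:" + lang) or names.get("name")
            if text:
                result.append((lon, lat, text))
        return result


index = LabelIndex(zoom=10)
index.add(13.4, 52.52, {"name": "Berlin", "name:ru": "Берлин"})
index.add(13.4, 52.52, {"building": "yes"})  # ignored, it has no name
tile = lonlat_to_tile(13.4, 52.52, 10)
print(index.labels_for_tile(tile[0], tile[1], "ru"))
```

A real store would also need lines and polygons, one index per zoom level (or a proper spatial index), and a way to apply the same updates as the main rendering database.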
In the second approach the job of getting the data out of the database and deciding where a label might go can already be done in the pre-render step. But instead of doing the actual rendering, the data is stored in some format together with the base map image. Maybe the Mapnik Metawriter could be used for this, but this approach might need some large changes in Mapnik. An advantage is that we always have data that fits together, because it is read from the rendering database at the same time. And because the tile has to be read from disk anyway, it is not much more expensive to read a bit more data. No extra database is needed.
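As a rough illustration of the second approach, the label candidates could be stored in the same blob as the base image, so one disk read returns both. The container layout below (two big-endian lengths, then the image bytes, then JSON) and the function names pack_tile and unpack_tile are purely my own assumptions for this sketch. It is not the Mapnik Metawriter format and not an existing tile format.

```python
import json
import struct

HEADER = ">II"  # image length, label payload length (hypothetical layout)
HEADER_SIZE = struct.calcsize(HEADER)


def pack_tile(image_bytes, labels):
    """Bundle the pre-rendered image and its label candidates into one blob."""
    payload = json.dumps(labels, ensure_ascii=False).encode("utf-8")
    return struct.pack(HEADER, len(image_bytes), len(payload)) + image_bytes + payload


def unpack_tile(blob):
    """Split a blob written by pack_tile back into image bytes and labels."""
    image_len, payload_len = struct.unpack_from(HEADER, blob, 0)
    image_end = HEADER_SIZE + image_len
    image_bytes = blob[HEADER_SIZE:image_end]
    labels = json.loads(blob[image_end:image_end + payload_len].decode("utf-8"))
    return image_bytes, labels


# Label candidates keep all name variants, so the language can be
# chosen when the tile is requested.
labels = [
    {"x": 120, "y": 88, "priority": 10,
     "names": {"name": "Wien", "name:en": "Vienna"}},
]
blob = pack_tile(b"\x89PNG...placeholder image bytes...", labels)
image, restored = unpack_tile(blob)
print(restored[0]["names"]["name:en"])
```

Because image and labels are written in the same step, they cannot drift apart the way a separately updated label database could.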
There is one problem with both of these approaches. Normally when rendering tiles, POI icons, labels, and shields should never overlap. If icons and shields are rendered into a bitmap and we add the labels later, we cannot make sure that there will be no overlap. One possible solution would be to somehow keep the information about where the icons are and make sure we do not render over them. But this could lead to an important label not being rendered because all the space is taken up by unimportant icons. Another option would be to also render icons and shields on demand, together with the labels.
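The difference between these two options can be shown with a small greedy placement sketch. It assumes axis-aligned bounding boxes and a numeric priority per item, both of which are my simplifications and not how Mapnik places labels. The function place is hypothetical.

```python
def overlaps(a, b):
    """Boxes are (min_x, min_y, max_x, max_y) in pixels."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def place(candidates, occupied=()):
    """Greedy placement: highest priority first, skip anything that collides.

    candidates: list of (priority, name, box)
    occupied:   boxes already burnt into the base image, e.g. icons
    """
    taken = list(occupied)
    placed = []
    for priority, name, box in sorted(candidates, key=lambda c: -c[0]):
        if any(overlaps(box, other) for other in taken):
            continue
        taken.append(box)
        placed.append(name)
    return placed


icon_box = (0, 0, 10, 10)
city = (10, "City", (5, 5, 50, 15))

# Option 1: the icon is already in the bitmap, so the label has to give way.
print(place([city], occupied=[icon_box]))

# Option 2: icons are placed on demand too and compete by priority.
print(place([city, (1, "icon", icon_box)]))
```

In the first call the important label is dropped because the icon was placed earlier, which is exactly the problem described above. In the second call the label wins because both items are placed in the same pass.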
Another complicating factor is the use of metatiles. Usually on tile servers the 256×256 pixel tiles are not rendered alone but in groups of (typically) 8×8 tiles. Such a group is called a metatile. There are several reasons for doing this: it is faster to render one large image than many small ones, and it is more efficient in storage, too. But in this context the most important reason is label placement. If the area rendered in one go is larger, it is easier to place labels in good positions and the maps will look better. So we somehow have to do this on-demand rendering on the metatile and not on the single tile. This means we need some kind of short-term cache for the metatiles with the rendered labels, and we have to make sure that all requests coming in shortly after one another for nearby tiles (and therefore for the same metatile) end up finding this metatile.
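A minimal sketch of such a short-term cache might look like this. It assumes 8×8 metatiles aligned to multiples of 8, a fixed time-to-live, and a cache key that includes the language, since the overlay differs per language. The class MetatileCache and the render callback are hypothetical, and the sketch is neither thread-safe nor does it evict expired entries.

```python
import time

METATILE = 8  # tiles per metatile edge


def metatile_key(z, x, y, lang):
    """All tiles of one 8x8 block map to the block's top-left tile."""
    return (z, x - x % METATILE, y - y % METATILE, lang)


class MetatileCache:
    """Keeps labelled metatiles for a short time so nearby requests reuse them."""

    def __init__(self, render, ttl=30.0, clock=time.monotonic):
        self.render = render    # callback: render(z, meta_x, meta_y, lang)
        self.ttl = ttl          # seconds a labelled metatile stays valid
        self.clock = clock
        self.entries = {}       # key -> (expiry time, rendered metatile)

    def get(self, z, x, y, lang):
        key = metatile_key(z, x, y, lang)
        now = self.clock()
        entry = self.entries.get(key)
        if entry is None or entry[0] <= now:
            entry = (now + self.ttl, self.render(*key))
            self.entries[key] = entry
        return entry[1]


calls = []


def fake_render(z, meta_x, meta_y, lang):
    calls.append((z, meta_x, meta_y, lang))
    return "metatile %d/%d/%d with %s labels" % (z, meta_x, meta_y, lang)


cache = MetatileCache(fake_render)
cache.get(10, 550, 335, "de")
cache.get(10, 551, 334, "de")  # neighbouring tile, same metatile
print(len(calls))
```

On a setup with several rendering machines the requests for one metatile would also have to be routed to the same machine, otherwise each machine would fill its own cache.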
Comments can be directed to the Multilingual maps wiki page.