Tools

The good folks at HeiGIT have released ohsome-planet, a handy tool to turn OpenStreetMap history data from PBF into GeoParquet files, ready to use in common GIS applications.

Working with raw OSM data presents several challenges due to its complex structure. Typically, users require data that is readily compatible with Geographic Information System (GIS) applications. Our new tool streamlines this process, providing a structured and GIS-ready dataset for improved usability.

The tool also enriches OSM element data by integrating information from OSM changesets and administrative boundaries. This additional contextual data allows for more efficient and straightforward spatial analysis, further improving the utility of OSM datasets.

The tool is written in Java and you have to build it yourself; a small price to pay for more easily accessible free and open data.

DuckDB is a database management system for data analytics that has picked up steam in recent months within the geospatial data community.

Chris Holmes documents his experiences with DuckDB and explores its potential for work with geospatial data:

I’m not the type who’s constantly jumping to new technologies and generally didn’t think that anything about a database could really impress me. But DuckDB somehow has become one of the pieces of technology – I gush about it to anyone who could possibly benefit.

There’s a lot to love about DuckDB: its performance, its early support for geospatial queries and data formats (although not fully mature), and smart extensions to SQL.

What sets it apart is DuckDB’s support for cloud-native operations via the httpfs extension and Parquet:

DuckDB also has the ability to work in a completely Cloud-Native Geospatial manner – you can treat remote files just like they’re on disk, and DuckDB will use range requests to optimize querying them.
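To make the idea of range requests concrete, here is a minimal sketch using only the Python standard library. It does not show DuckDB's internals; it only illustrates the underlying HTTP mechanism, where a client asks for a slice of a file instead of the whole thing. The tiny local server, the `PAYLOAD` bytes and the `/data.bin` path are stand-ins invented for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for a large remote file (hypothetical content).
PAYLOAD = bytes(range(256)) * 4


class RangeHandler(BaseHTTPRequestHandler):
    """Toy server that honours simple 'Range: bytes=start-end' headers."""

    def do_GET(self):
        header = self.headers.get("Range")
        if header and header.startswith("bytes="):
            first, last = header[len("bytes="):].split("-")
            start, end = int(first), int(last)
            body = PAYLOAD[start:end + 1]  # byte ranges are inclusive
            self.send_response(206)  # 206 Partial Content
            self.send_header(
                "Content-Range", f"bytes {start}-{end}/{len(PAYLOAD)}"
            )
        else:
            body = PAYLOAD
            self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_range(host, port, path, start, end):
    """Request only bytes start..end (inclusive) of a remote file."""
    connection = http.client.HTTPConnection(host, port)
    try:
        connection.request(
            "GET", path, headers={"Range": f"bytes={start}-{end}"}
        )
        response = connection.getresponse()
        return response.status, response.read()
    finally:
        connection.close()


def demo():
    # Port 0 lets the operating system pick a free port.
    server = HTTPServer(("127.0.0.1", 0), RangeHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_range("127.0.0.1", server.server_port, "/data.bin", 100, 109)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` transfers ten bytes rather than the full payload. A columnar format like Parquet suits this access pattern because a reader can first fetch the file's metadata and then request only the byte ranges it needs.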

In geospatial, we’re often dealing with large data sets that are hard to store and explore with traditional tools, unless you import the data into a database. But it’s not just going to make data access and processing easier on your computer. There’s also a great potential to move more large-data processing into the browser:

It’s not just that DuckDB is a great command-line and Python tool, but there’s also a brewing revolution with WASM, to run more and more powerful applications in the browser. DuckDB is easily run as WASM, so you can imagine a new class of analytic geospatial applications that are entirely in the browser.

Gregor MacLennan gives a nice overview of the current state of Mapeo and what we can expect from future releases.

We developed Mapeo over 8 years through a co-design process with local partners, and have learned a huge amount about the challenges and opportunities of peer-to-peer technology along the way. This post shares some technical details about these challenges and how the solutions are guiding our work on “Mapeo Next”.

I’ve always admired the work of Digital Democracy, especially on Mapeo. They’ve built a great product for collaborative and participatory mapping, something we didn’t quite pull off when I worked at ExCiteS and Cadasta.

Where can you get within 40 minutes from every subway station in New York? Chris Whong’s fun, interactive map shows you, using GTFS data from New York’s Metropolitan Transportation Authority and Turf.js to calculate the isochrones.
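The core question behind such a map, which stops are reachable within a time budget, can be sketched as a shortest-path search. The following is a toy illustration in Python, not Chris Whong's implementation: the station names and travel times are made up, and a real version would derive them from GTFS schedules and then draw polygons around the reachable stops.

```python
import heapq


def reachable_within(graph, origin, budget_minutes):
    """Dijkstra's algorithm, cut off once the time budget is exceeded.

    graph maps each station to {neighbour: travel_minutes}.
    Returns {station: fastest_minutes} for every station within budget.
    """
    best = {origin: 0}
    queue = [(0, origin)]
    while queue:
        elapsed, station = heapq.heappop(queue)
        if elapsed > best.get(station, float("inf")):
            continue  # stale queue entry, a faster route was already found
        for neighbour, minutes in graph.get(station, {}).items():
            total = elapsed + minutes
            if total <= budget_minutes and total < best.get(neighbour, float("inf")):
                best[neighbour] = total
                heapq.heappush(queue, (total, neighbour))
    return best


# Hypothetical network: travel times in minutes between stations A to E.
TOY_NETWORK = {
    "A": {"B": 10, "C": 25},
    "B": {"A": 10, "C": 12, "D": 20},
    "C": {"A": 25, "B": 12, "D": 15},
    "D": {"B": 20, "C": 15, "E": 15},
    "E": {"D": 15},
}
```

With this toy network, `reachable_within(TOY_NETWORK, "A", 40)` reaches B, C and D but not E, which would take 45 minutes.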

Jess Beutler and Alan McConchie:

This year, OpenStreetMap US stepped forward to become a steward of Field Papers for the community going forward. The transition makes sense; not only is the tool used extensively by the mapping community globally and in the US, it is also used a great deal by educators through OpenStreetMap US’s TeachOSM program and other education initiatives.

While it will now be under the umbrella of OpenStreetMap US, Field Papers will be maintained as a global tool available for mappers around the world. In the next year, OpenStreetMap US will be working to develop a plan for maintenance and development that pulls in the knowledge and skills of the volunteer community, as well as expanding the financial resources available to the project.

Until now, Placemark, a bootstrapped one-person project, was only available to paying customers. Now Tom has released a free tier, Placemark Play, which provides the same user interface and similar features as the paid tier. The main difference from the paid tier is that Placemark Play won’t save your data; once your browser session ends, your data is gone.

Without data storage, Placemark Play can be compared to Geojson.io, which handles data persistence similarly, although Geojson.io offers to restore data from the last session. Placemark, however, provides a slicker user interface, more advanced geometry operations (buffers, simplification, convex hulls), imports and exports from and to various geospatial data formats, and tools to design and export map styles for Mapbox GL and Leaflet.

Placemark Play is a great option for quick data visualisation and advanced editing when you need a few more capabilities than Geojson.io offers.

Mapstack

Mapstack, launched this week, aims to become a central catalogue for open data, a place where you discover and access datasets to fulfil your geo-data needs:

mapstack will do for open map data what GitHub did for open source, by bringing all of the world’s open maps together in one place and making them easy to discover, easy to access and easy to use.

Most of Mapstack’s functionality is currently centred around creating datasets and providing appropriate descriptions for the data. Setting up a new map involves several steps, including creating a new workspace or team, adding members, and providing a description.

Then you proceed to create the actual dataset. Upload your data, currently limited to GeoJSON and files smaller than 50MB. Then select the fields to keep and provide human-readable names. A downside is that you can’t skip this step. You have to go through each chosen field and individually confirm the label. To finalise, provide more information about the nature of the data, its geographic area, and the feature type, which creates an editable name for your new dataset.

That’s a lot of steps before you can view your dataset for the first time. Much of the information could be provided after the project is set up. With the goal of discoverability in mind, however, and considering how badly many datasets are missing metadata, you could say it’s smart design to force users to provide context.

Once the map is created, the features are limited: you can browse the data, view the attributes of features, and apply filters. There’s an attribute table, but it is only available for filtered results, not for the unfiltered data. Mapstack focuses on hosting data and making the data discoverable rather than on interacting, editing or visualising data.

As such, Mapstack is not a competitor of Felt or Placemark, two products released last year that aim to modernise how we do GIS on the Web. Mapstack complements both, and GIS tooling in general, by providing the data for the tools.

Will it take off? I’m not sure. The marketing copy draws comparisons to GitHub, but there are differences. GitHub became successful because it built on a protocol that developers already used and provided a product for collaboration around the protocol. GitHub added value to the developer’s daily work, so a lot of code ended up on the platform.

Mapstack doesn’t tie in with existing tools. Currently, there is no tooling to create or manage data, collaborate or visualise the data. It’s a place where the result of data processing might be hosted. Open data providers have invested in the infrastructure to host data—it’ll be hard to convince them to migrate to Mapstack instead.

Seek-Optimized ZIP (SOZip)

Seek-Optimized ZIP, a new profile for ZIP files, allows random access and selective decompression. With standard ZIP files, you have to download and decompress the ZIP file before accessing its content. SOZip files remain fully compatible with standard ZIP tools, but you can now selectively access files within a ZIP, so you won’t have to download the full archive if you want to access just one file.
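The general idea behind seekable compression can be sketched with Python's `zlib` module. This is a simplified illustration of the principle, not the SOZip file format itself: the data is compressed in fixed-size chunks, the compressor is flushed at every chunk boundary so that decompression can restart there, and the byte offsets of the boundaries are kept in an index. The function names and the 32 KB chunk size are assumptions made for this example, and the index here lives in a plain Python list rather than inside an archive.

```python
import zlib

CHUNK_SIZE = 32 * 1024  # uncompressed bytes per chunk (assumed for this sketch)


def compress_with_index(data, chunk_size=CHUNK_SIZE):
    """Compress data as one raw deflate stream with restart points.

    Returns (stream, offsets), where offsets[i] is the position in the
    compressed stream at which chunk i begins.
    """
    compressor = zlib.compressobj(level=6, wbits=-15)  # -15 selects raw deflate
    stream = bytearray()
    offsets = []
    for start in range(0, len(data), chunk_size):
        offsets.append(len(stream))
        stream += compressor.compress(data[start:start + chunk_size])
        # A full flush byte-aligns the output and resets the dictionary,
        # so the next chunk does not refer back to earlier data.
        stream += compressor.flush(zlib.Z_FULL_FLUSH)
    stream += compressor.flush(zlib.Z_FINISH)
    return bytes(stream), offsets


def read_chunk(stream, offsets, index):
    """Decompress only chunk `index`, touching only that chunk's bytes."""
    start = offsets[index]
    end = offsets[index + 1] if index + 1 < len(offsets) else len(stream)
    decompressor = zlib.decompressobj(wbits=-15)
    return decompressor.decompress(stream[start:end])
```

The result is still one valid deflate stream, so a reader that knows nothing about the index can decompress it from start to finish, while an index-aware reader can jump straight to the chunk it needs. Combined with HTTP range requests, only the relevant compressed bytes have to be fetched.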

Currently, there are two implementations of SOZip: it’s available in the development branch of GDAL and as a Python module. MapServer (on the development branch) and QGIS, both applications depending on GDAL, support SOZip too.

Seek-Optimized ZIP adds to a growing suite of cloud-native data formats and APIs, such as COGs, Zarr or GeoParquet, allowing developers and applications to access and process large datasets selectively without the need to download them in full.

Related Links