Exploring our beta release
Last week Overture Maps announced the beta release of our schema and data. After months of hard work and steady improvements, we are nearing production-level stability. In a series of posts over the next few weeks -- starting with this one -- we’ll unpack the highlights and improvements you'll see in this release and beyond.
Making geospatial fast
The first thing you'll notice is how much faster you can pull Overture Maps data out of the cloud. We made this possible by working closely with open-source collaborators on the repartitioning and spatial optimization of our GeoParquet files. Meanwhile, our friends at DuckDB, one of our favorite SQL tools, added a feature that improves the performance of queries with bounding boxes.
Query speeds have improved significantly from our February release to the April beta release. This example compares DuckDB queries for buildings in Philadelphia. See here for more information about our performance testing.
- 2024-02-15-alpha.0 release
- 2024-04-16-beta.0 release
SELECT
*
FROM
read_parquet('s3://overturemaps-us-west-2/release/2024-02-15-alpha.0/theme=buildings/type=building/*', filename=true, hive_partitioning=1)
WHERE
bbox.minx > -75.60154
AND bbox.maxx < -74.955763
AND bbox.miny > 39.867004
AND bbox.maxy < 40.137992;
820,915 buildings
~120s
SELECT
*
FROM
read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=building/*', filename=true, hive_partitioning=1)
WHERE
bbox.xmin > -75.60154
AND bbox.xmax < -74.955763
AND bbox.ymin > 39.867004
AND bbox.ymax < 40.137992
852,487 buildings
~25s
We're continuing to make things faster and easier for users. Along with the folks at Development Seed, an Overture Maps Foundation member, we're building special tools for Overture Maps data on top of lonboard, their Python library for visualizing large geospatial datasets in Jupyter. And recently our friends at Wherobots took a comprehensive look at how our use of GeoParquet makes querying and analyzing our data with Apache Sedona very efficient.
As you can see, we're moving forward with the community to iterate on data, software, and specifications with the shared goal of making geospatial fast.
Easier-to-use schema
Another highlight of the beta release is the transition to an easier-to-use schema for our administrative boundary data. We first explored this idea with the Overture Maps community in February, and after two short months of work, the new divisions schema and data are ready to go. Here's a query to divisions
that grabs geometries for all the countries in the world:
SELECT *
FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=divisions/type=division_area/*', filename=true, hive_partitioning=1)
WHERE subtype = 'country';
You can see that the divisions
query above is much simpler than a comparable query to admins
:
WITH admins AS (
SELECT *
FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=admins/type=*/*', filename=true, hive_partitioning=1)
)
SELECT *
FROM admins country_locality
JOIN admins country_area
ON country_area.locality_id = country_locality.id
WHERE country_locality.locality_type = 'country';
We plan to deprecate admins by the July release. In the meantime, both admins
and divisions
will be available to users.
Bridges, islands, waterfalls, and more!
We added more rich detail to our base
layer in this release, including an infrastructure
type with familiar features from Facebook’s Daylight map distribution. We also added new subtypes and classes for the land
, land_use
, and water
feature types. You'll find a comprehensive listing of the subtypes and classes for each feature type in our schema reference docs. Ready to make your own map? We have a tutorial to help you get started.
Stay tuned for more highlights
We'll be back soon with more posts that explore our path from the beta release to production. In the meantime, we invite you to get started with our data and share with us your comments and feedback.