Transportation Guide
Overview
The Overture transportation theme is the collection of LineString and Point features that describe the infrastructure and conventions of how people and objects travel around the world. The dataset contains two feature types: connector and segment. The three subtypes within segment (road, rail, and water) contain familiar categories of transportation data: highways, footways, cycleways, railways, ferry routes, and public transportation.
Most of the data in the transportation theme is sourced from OpenStreetMap. In the 2024-09-18.0 release, we began adding data from TomTom to improve coverage in key areas.
You might use the Overture transportation data for:
- mapping: rendering a map of connected roads and paths.
- routing: calculating optimal routes from place to place.
- navigation: generating granular instructions on the maneuvers needed to follow a route.
- analytics: transportation-related analysis including traffic safety analysis and disaster planning.
- geocoding: getting the coordinates of street intersections (geocodes) or the street intersection near specific coordinates (reverse geocodes).
This guide is an overview of the transportation data. To dig into the details of the schema, see the schema concepts for transportation and the reference documentation for the segment and connector feature types.
Dataset description
All Overture data, including transportation data, is distributed as GeoParquet, a column-based data structure. Below you'll find tables with column-by-column descriptions of the properties in the transportation feature types.
Schema for GeoParquet files in the transportation theme
The first table describes the segment feature type and the second describes the connector feature type.
column_name | column_type | Description |
---|---|---|
id | VARCHAR | A feature ID. This may be an ID associated with the Global Entity Reference System (GERS) if and only if the feature represents an entity that is part of GERS. |
geometry | WKB | The line representation of the segment's location. The segment's geometry MUST be a LineString as defined by the GeoJSON schema. |
bbox | STRUCT | Area defined by two longitudes and two latitudes: latitude is a decimal number between -90.0 and 90.0; longitude is a decimal number between -180.0 and 180.0. |
version | INTEGER | Version number of the feature, incremented in each Overture release where the geometry or attributes of this feature changed. |
sources | STRUCT[] | The array of source information for the properties of a given feature, with each entry being a source object which lists the property in JSON Pointer notation and the dataset that specific value came from. All features must have a root level source which is the default source if a specific property's source is not specified. |
subtype | VARCHAR | The broad category of transportation segment. |
class | VARCHAR | Captures the kind of road and its position in the road network hierarchy. |
subclass | VARCHAR | Specifies the usage of a length of road. |
subclass_rules | STRUCT[] | Defines the portion of a road that the subclass applies to. |
names | STRUCT[] | Properties defining the names of a feature. |
connectors | STRUCT[] | Array of connector IDs identifying the connectors this segment is physically connected to, each paired with a linear reference to its location along the segment. Each connector is a possible routing decision point: a place along the segment where it is possible to transition to other segments that share the same connector. |
routes | STRUCT[] | Routes this segment belongs to. |
access_restrictions | STRUCT[] | Rules governing access to this road segment or lane. |
level_rules | STRUCT[] | Defines the Z-order, i.e. stacking order, of the road segment. |
prohibited_transitions | STRUCT[] | Defines where traveling from the segment to another is disallowed for navigation. This covers situations such as prohibited turns or a transition from a road to a bike lane that is disallowed for cars. |
road_surface | STRUCT[] | Defines the surface material on a road such as paved, asphalt, or unpaved. |
road_flags | STRUCT[] | Additional properties relevant to roads such as is_bridge or is_under_construction. |
speed_limits | STRUCT[] | Defines the speed limit of the road segment. |
width_rules | STRUCT[] | Defines the width of the road segment for rendering. |
destinations | STRUCT[] | Describes the transitions from one segment to another on the way to a specified location. This data is primarily used for routing. |
column_name | column_type | Description |
---|---|---|
id | VARCHAR | A feature ID. This may be an ID associated with the Global Entity Reference System (GERS) if and only if the feature represents an entity that is part of GERS. |
geometry | BLOB | The point representation of the connector's location. The connector's geometry MUST be a Point as defined by the GeoJSON schema. |
bbox | STRUCT | Area defined by two longitudes and two latitudes: latitude is a decimal number between -90.0 and 90.0; longitude is a decimal number between -180.0 and 180.0. |
version | INTEGER | Version number of the feature, incremented in each Overture release where the geometry or attributes of this feature changed. |
sources | STRUCT[] | The array of source information for the properties of a given feature, with each entry being a source object which lists the property in JSON Pointer notation and the dataset that specific value came from. All features must have a root level source which is the default source if a specific property's source is not specified. |
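The fallback rule described for the sources column can be sketched in a few lines of Python. This is only an illustration, not part of any Overture tooling: the `source_for` helper, the `property` and `dataset` field names, the empty string used as the root-level pointer, and the sample values are all assumptions made for the example.

```python
def source_for(sources, pointer):
    """Return the dataset recorded for a property pointer.

    Falls back to the root-level entry (assumed here to use an empty
    pointer string) when no entry names the property explicitly.
    """
    root = None
    for entry in sources:
        prop = entry.get("property", "")
        if prop == pointer:
            return entry["dataset"]
        if prop == "":
            root = entry["dataset"]
    return root


# Hypothetical sources array for a single feature.
example_sources = [
    {"property": "", "dataset": "OpenStreetMap"},
    {"property": "/properties/speed_limits", "dataset": "ExampleProvider"},
]

print(source_for(example_sources, "/properties/speed_limits"))
print(source_for(example_sources, "/properties/names"))
```

The second lookup has no dedicated entry, so it resolves to the root-level source.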
Subtypes
Transportation segments are divided into three subtypes: rail, water, and road. The road subtype is then further divided into a variety of different classes based on usage captured in the table below.
Class and subclass feature counts
subtype | class | subclass | Feature count, November 2024 release |
---|---|---|---|
rail | | | 1,572,582 |
road | bridleway | | 95,598 |
road | cycleway | cycle_crossing | 47,189 |
road | cycleway | | 1,198,943 |
road | footway | crosswalk | 1,839,806 |
road | footway | sidewalk | 2,961,562 |
road | footway | | 14,555,896 |
road | living_street | | 3,053,898 |
road | motorway | link | 626,752 |
road | motorway | | 428,359 |
road | path | | 12,727,710 |
road | pedestrian | | 448,737 |
road | primary | link | 470,293 |
road | primary | | 6,511,508 |
road | residential | | 123,774,708 |
road | secondary | link | 369,680 |
road | secondary | | 10,438,045 |
road | service | alley | 1,541,197 |
road | service | driveway | 15,160,138 |
road | service | parking_aisle | 5,978,042 |
road | service | | 31,689,642 |
road | steps | | 1,691,848 |
road | tertiary | link | 283,157 |
road | tertiary | | 19,199,059 |
road | track | | 23,956,033 |
road | trunk | link | 535,332 |
road | trunk | | 3,342,592 |
road | unclassified | | 28,763,594 |
road | unknown | | 546,181 |
water | | | 27,337 |
Data access and retrieval
The latest transportation data can be obtained from AWS or Azure as GeoParquet files at the following locations.
- Segment
- Connector
Provider | Location |
---|---|
Amazon S3 |
|
Azure Blob Storage |
|
Provider | Location |
---|---|
Amazon S3 |
|
Azure Blob Storage |
|
Data usage guidelines
We recommend downloading only the Overture data you need. If you have a particular geographic area of interest, there are several options for using a simple bounding box to extract transportation data and write it to a file.
- Python Command-line Tool
- DuckDB
First, follow the setup guide for the Python Command-line Tool.
Set type to either segment or connector and alter the bbox value to download a particular area.
overturemaps download --bbox=12.46,41.89,12.48,41.91 -f geojson --type=segment -o rome_segments.geojson
First, follow the setup guide for DuckDB.
Set the Parquet path to either the connector or segment URL depending on your needs.
Replace the bbox.xmin, bbox.xmax, bbox.ymin, and bbox.ymax values with a new bounding box to run the query for a different area.
LOAD spatial; -- noqa
LOAD httpfs; -- noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';
COPY (
SELECT
*
FROM read_parquet('s3://overturemaps-us-west-2/release/2024-11-13.0/theme=transportation/type=segment/*')
WHERE
bbox.xmin > 12.46 AND bbox.xmax < 12.48 AND
bbox.ymin > 41.89 AND bbox.ymax < 41.91
)
TO 'rome_segments.parquet';
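The WHERE clause above keeps only features whose bounding box lies entirely inside the query box, so a segment that crosses the edge of the area is dropped. The Python sketch below mirrors that predicate and contrasts it with an overlap test. The helper names and the two sample boxes are hypothetical and exist only for illustration.

```python
def parse_bbox(text):
    """Parse 'xmin,ymin,xmax,ymax' (the order used by --bbox above) into floats."""
    xmin, ymin, xmax, ymax = (float(part) for part in text.split(","))
    if xmin > xmax or ymin > ymax:
        raise ValueError("expected xmin,ymin,xmax,ymax")
    return xmin, ymin, xmax, ymax


def contained(feature_bbox, query):
    """Mirror the WHERE clause: the feature bbox must lie strictly inside the query box."""
    xmin, ymin, xmax, ymax = query
    return (feature_bbox["xmin"] > xmin and feature_bbox["xmax"] < xmax
            and feature_bbox["ymin"] > ymin and feature_bbox["ymax"] < ymax)


def intersects(feature_bbox, query):
    """Alternative predicate: keep any feature whose bbox overlaps the query box."""
    xmin, ymin, xmax, ymax = query
    return (feature_bbox["xmin"] <= xmax and feature_bbox["xmax"] >= xmin
            and feature_bbox["ymin"] <= ymax and feature_bbox["ymax"] >= ymin)


rome = parse_bbox("12.46,41.89,12.48,41.91")
# Hypothetical feature bounding boxes.
inside = {"xmin": 12.465, "ymin": 41.895, "xmax": 12.470, "ymax": 41.900}
crossing = {"xmin": 12.455, "ymin": 41.895, "xmax": 12.465, "ymax": 41.900}

print(contained(inside, rome), contained(crossing, rome), intersects(crossing, rome))
```

If segments that straddle the boundary matter for your use case, the overlap form of the comparison is the one to translate back into SQL.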
Data manipulation and analysis
Querying by properties in DuckDB
These examples use data properties to filter the data in useful ways using DuckDB.
- Query by class
- Query by routes
The class and subclass columns can be used to pull out subsets of the road data. Similarly, you could use subtype to select only water, rail, or road features. This example extracts only the parking_aisle features within the bounding box. Because parking_aisle is a subclass of the service class, the filter is on the subclass column.
LOAD spatial; -- noqa
LOAD httpfs; -- noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';
COPY (
SELECT
*
FROM read_parquet('s3://overturemaps-us-west-2/release/2024-11-13.0/theme=transportation/type=segment/*')
WHERE
subclass = 'parking_aisle' AND
bbox.xmin > 13.0897 AND bbox.xmax < 13.6976 AND
bbox.ymin > 52.3100 AND bbox.ymax < 52.7086
)
TO 'berlin_parking_aisles.parquet';
You might be interested in a network of roads, such as a US Interstate. These can be extracted using the routes column, identifying the route by its network and ref properties, its wikidata property, or both.
This example extracts all the roads that are part of US I-5. To get all US Interstates, remove AND routes[1].ref = '5' from the query.
LOAD spatial; -- noqa
LOAD httpfs; -- noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';
COPY (
SELECT
*
FROM read_parquet('s3://overturemaps-us-west-2/release/2024-11-13.0/theme=transportation/type=segment/*')
WHERE
routes[1].network = 'US:I'
AND routes[1].ref = '5'
)
TO 'US_I_5.parquet';
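DuckDB lists are 1-indexed, so routes[1] in the query above inspects only the first route recorded on each segment. A segment that carries several routes could list the Interstate in a later position and would then be missed. The Python sketch below shows the "match any entry" logic as an illustration; the `on_route` helper and the sample routes value are hypothetical.

```python
def on_route(routes, network, ref=None):
    """True if any entry in a segment's routes list matches the network (and ref, if given)."""
    for route in routes or []:
        if route.get("network") != network:
            continue
        if ref is None or route.get("ref") == ref:
            return True
    return False


# Hypothetical segment that carries two routes; the Interstate is listed second.
concurrent = [
    {"network": "US:US", "ref": "99"},
    {"network": "US:I", "ref": "5"},
]

first_only = concurrent[0]["network"] == "US:I" and concurrent[0]["ref"] == "5"
print(first_only, on_route(concurrent, "US:I", "5"))
```

Checking only the first entry reports no match here, while scanning every entry finds the route.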
Querying by properties in Athena
Athena can allow for faster querying of the transportation layer than DuckDB given the size of the data. These examples are designed for Athena, but could be reworked for DuckDB with some tweaking.
- Query by speed limit
- Select connecting segments
To properly return a linearly referenced feature like a speed limit, we need to check every value in the array, because the queried value may apply to only one portion of the line. In this example, we extract roads with any speed limit that has a max_speed value of 27 and a unit of mph, using the any_match function.
The same general query also works for similar columns such as prohibited_transitions and access_restrictions.
SELECT
id,
speed_limits,
ST_GEOMFROMBINARY(geometry) AS geometry
FROM v2024_11_13_0
WHERE type = 'segment'
AND ANY_MATCH(
speed_limits,
speed_limit->speed_limit.max_speed.value = 27
AND speed_limit.max_speed.unit = 'mph'
)
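For readers less familiar with lambda predicates over arrays, the matching logic of the query above can be sketched in Python. This is an illustration of the idea rather than Athena's implementation; the sample segments and the shape of each speed limit record (a max_speed struct plus a between range) are assumptions made for the example.

```python
def any_match(items, predicate):
    """Return True if the predicate holds for at least one item (empty or missing means False)."""
    return any(predicate(item) for item in items or [])


def has_speed_limit(segment, value, unit):
    """True if any speed limit rule on the segment has the given max_speed value and unit."""
    return any_match(
        segment.get("speed_limits"),
        lambda rule: (rule.get("max_speed") or {}).get("value") == value
        and (rule.get("max_speed") or {}).get("unit") == unit,
    )


# Hypothetical segments; the 27 mph limit covers only part of the first one.
partly_27 = {
    "id": "seg-a",
    "speed_limits": [
        {"max_speed": {"value": 35, "unit": "mph"}, "between": [0.0, 0.4]},
        {"max_speed": {"value": 27, "unit": "mph"}, "between": [0.4, 1.0]},
    ],
}
metric_27 = {"id": "seg-b", "speed_limits": [{"max_speed": {"value": 27, "unit": "km/h"}}]}
no_limits = {"id": "seg-c", "speed_limits": None}

print([has_speed_limit(s, 27, "mph") for s in (partly_27, metric_27, no_limits)])
```

Both conditions are evaluated against the same array element, which is why the second sample (27, but in km/h) does not match.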
With the connectors column, it is simple to query for all features that connect with a particular segment without the need for a spatial query.
This example selects all the segments that connect to the example id.
WITH input AS (
SELECT id AS input_id,
connector_id
FROM v2024_11_13_0
-- Each connectors entry is a struct; UNNEST expands its fields
-- (assumed here to be the connector ID and its position) into columns
CROSS JOIN UNNEST(connectors) AS t(connector_id, connector_at)
WHERE type = 'segment'
AND id = '08628d5437ffffff0473ffc36df547db'
)
SELECT
*,
ST_GEOMFROMBINARY(geometry) AS geometry
FROM v2024_11_13_0,
input
WHERE type = 'segment'
AND id != input_id
AND ANY_MATCH(
connectors,
connector->connector.connector_id = input.connector_id
)
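The same lookup can be sketched outside SQL as an index from connector ID to the segments that reference it. The Python below is a minimal illustration under the assumption that each connectors entry exposes a connector_id field; the segment and connector IDs are made up.

```python
from collections import defaultdict


def build_connector_index(segments):
    """Map each connector ID to the set of segment IDs that reference it."""
    index = defaultdict(set)
    for segment in segments:
        for connector in segment.get("connectors") or []:
            index[connector["connector_id"]].add(segment["id"])
    return index


def connected_segments(segments, segment_id):
    """Return the IDs of all other segments sharing at least one connector with segment_id."""
    index = build_connector_index(segments)
    target = next(s for s in segments if s["id"] == segment_id)
    neighbors = set()
    for connector in target.get("connectors") or []:
        neighbors |= index[connector["connector_id"]]
    neighbors.discard(segment_id)
    return sorted(neighbors)


# Hypothetical network: seg-a meets seg-b at c2 and seg-c at c1; seg-d is isolated.
network = [
    {"id": "seg-a", "connectors": [{"connector_id": "c1"}, {"connector_id": "c2"}]},
    {"id": "seg-b", "connectors": [{"connector_id": "c2"}, {"connector_id": "c3"}]},
    {"id": "seg-c", "connectors": [{"connector_id": "c1"}]},
    {"id": "seg-d", "connectors": [{"connector_id": "c9"}]},
]

print(connected_segments(network, "seg-a"))
```

No geometry is touched at any point, which is the property the connectors column is designed to provide.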
Tools and libraries
transportation-splitter
Conceptual diagram of the splitter tool output. The numbers following 1234@ represent start_lr and end_lr values.
The transportation-splitter tool transforms Overture road data into simpler sub-segments. Depending on configuration, it will optionally divide features at each connector point and at each change in a scoped property. Depending on your needs and map stack, the resulting dataset may be easier to manipulate than the original Overture data, because each segment has connections only at its two ends and one set of properties for its entire length.
Because a GERS ID is no longer unique in this output, the resulting data has two additional columns, start_lr and end_lr, which are linear references describing which section of the original feature the new segment comes from.
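To make the start_lr and end_lr idea concrete, here is a small Python sketch of how cut points along a segment could be turned into sub-segment ranges. It is not the transportation-splitter's actual implementation (that tool runs on Spark); the function names, connector positions, and restriction range are hypothetical values chosen for illustration.

```python
def split_ranges(connector_positions, scoped_ranges):
    """Return (start_lr, end_lr) pairs covering [0, 1], cut at every connector and scope boundary."""
    cuts = {0.0, 1.0}
    cuts.update(p for p in connector_positions if 0.0 < p < 1.0)
    for start, end in scoped_ranges:
        cuts.update(p for p in (start, end) if 0.0 < p < 1.0)
    ordered = sorted(cuts)
    return list(zip(ordered, ordered[1:]))


def applies(piece, between):
    """True if a scoped property's range fully covers the sub-segment."""
    start_lr, end_lr = piece
    return between[0] <= start_lr and end_lr <= between[1]


# Hypothetical segment: connectors at both ends plus two mid-segment connectors,
# and one access restriction scoped to the second half.
connectors_at = [0.0, 0.25, 0.75, 1.0]
restriction_between = (0.5, 1.0)

pieces = split_ranges(connectors_at, [restriction_between])
restricted = [piece for piece in pieces if applies(piece, restriction_between)]

print(pieces)
print(restricted)
```

Each resulting piece has connectors only at its ends and a single set of properties, which is why the pair (start_lr, end_lr) is needed alongside the shared GERS ID to tell the pieces apart.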
Splitter Example
To help visualize this process better, here is a real world example of a residential street in OpenStreetMap, Overture, and after being run through the splitter tool.
- OpenStreetMap: In OpenStreetMap, this residential road is represented by two different features with the same tags, with feature 1 having an additional restricted access tag.
- Overture: In Overture, the two segments have been combined into one feature, and the restricted access tag has been stored as this linear reference in access_restrictions:
[{'access_type': allowed, 'when': {'during': NULL, 'heading': NULL, 'using': NULL, 'recognized': [as_private], 'mode': NULL, 'vehicle': NULL}, 'between': [0.521962729, 1.0]}]
- Transportation Splitter: The splitter has sliced the Overture feature at each connector point for the driveways as well as at the point where the access restriction begins. This results in six unique features in the output, all still sharing the same GERS ID.
More Information and Feedback
The tool requires a Spark environment to run and has been tested using Azure Databricks and AWS Glue. Because the tool is still in active development, the transportation-splitter GitHub repository has the most up-to-date setup information.
If you have feedback or questions about the tool, you can create an issue on GitHub.