Addresses
Overview
Note: This theme is currently in Alpha and we anticipate significant changes over the next several releases. We invite the Overture community to test the addresses schema and data and offer feedback via the data and schema repos.
Overture maintains nearly 215 million address point entities. The address theme is derived from a variety of sources, mostly distributed through OpenAddresses.
For licensing information, please see the attribution page.
Address data can be used for a variety of purposes, which can include:
- Mapping: Addresses may be displayed on the map for reference purposes.
- Geocoding: Addresses are a primary component of high-accuracy geocoding services (i.e. converting text for an address to a complete address with a location).
- Conflation: Addresses can be used to conflate to other data themes (e.g. places, buildings) where appropriate for mapping or other use cases (e.g. refining search).
- Standardization: Parsing an input address into address components based on an existing schema or address model.
- Normalization: Adhering to standard and consistent forms of address components.
- Validation and Verification: Confirming an address exists within a known list of addresses.
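To make the last two uses concrete, here is a minimal Python sketch of normalization followed by validation against a known list. It is an illustration only: the abbreviation table, the `normalize_street` and `address_key` helpers, and the sample address are all hypothetical and are not part of any Overture tooling. Production normalization relies on much larger, locale-specific rules.

```python
import re

# Hypothetical abbreviation table, for illustration only.
STREET_SUFFIXES = {"st": "street", "ave": "avenue", "rd": "road", "blvd": "boulevard"}


def normalize_street(street: str) -> str:
    """Lowercase, drop periods and commas, collapse whitespace,
    and expand a trailing suffix abbreviation if it is in the table."""
    tokens = re.sub(r"[.,]", "", street).lower().split()
    if tokens and tokens[-1] in STREET_SUFFIXES:
        tokens[-1] = STREET_SUFFIXES[tokens[-1]]
    return " ".join(tokens)


def address_key(number: str, street: str, postcode: str) -> tuple[str, str, str]:
    """Build a comparison key from normalized address components."""
    return (number.strip().lower(), normalize_street(street), postcode.strip().upper())


# A tiny "known list" of addresses (made-up sample data).
known = {address_key("74B", "Main Street", "02108")}


def is_known(number: str, street: str, postcode: str) -> bool:
    """Validation: does the normalized input match an entry in the known list?"""
    return address_key(number, street, postcode) in known
```

With this sketch, `is_known("74b", "Main St.", "02108")` matches the stored entry because both sides are reduced to the same key before comparison.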
Overture address data, styled by country: countries with address coverage in color.
Dataset description
Feature type descriptions
An address is a feature type that represents a physical place through a series of attributes: street number, street name, unit, address_levels, postcode and/or country. Each address also has a Point geometry, which provides an approximate location of the position most commonly associated with the feature. We encourage you to consult the schema reference documentation for the address feature type.
Counts, per country, of the address feature type
Country | Address Count |
---|---|
US | 78,078,341 |
MX | 30,278,896 |
FR | 26,041,499 |
CA | 16,457,790 |
AU | 15,608,317 |
NL | 9,750,341 |
CO | 7,786,046 |
PT | 5,911,139 |
CL | 4,042,071 |
DK | 3,920,253 |
FI | 3,897,449 |
CH | 3,147,964 |
AT | 2,492,155 |
NZ | 2,351,461 |
EE | 2,210,423 |
LT | 1,036,251 |
LU | 164,415 |
Data columns
The addresses GeoParquet file contains the following properties:
Schema for the GeoParquet files in the addresses theme
Property Name | Type | Description |
---|---|---|
id | string | A feature ID. This may be an ID associated with the Global Entity Reference System (GERS) if and only if the feature represents an entity that is part of GERS. |
geometry | blob | A WKB representation of the entity's geometry. Across Overture themes this can be a Point, Polygon, MultiPolygon, or LineString; address features use Point geometry. |
bbox | array | The bounding box of an entity's geometry, represented with float values, in a xmin, xmax, ymin, ymax format. |
country | string | ISO 3166-1 alpha-2 country code of the country or country-like entity that this address represents or belongs to. |
postcode | string | The postcode for the address. |
street | string | The street name associated with this address. The street name can include the street "type" or street suffix, e.g., Main Street. Ideally the name is fully spelled out rather than abbreviated, but abbreviated street names are acceptable because many source address datasets abbreviate them. |
number | string | The house number for this address. This field may not strictly be a number. Values such as "74B", "189 1/2", "208.5" are common as the number part of an address and they are not part of the "unit" of this address. |
unit | string | The suite/unit/apartment/floor number. |
address_levels | array | The administrative levels present in an address. The number of values in this list and their meaning is country-dependent. For example, in the United States we expect two values: the state and the municipality. In other countries there might be only one. Other countries could have three or more. The array is ordered with the highest levels first. |
version | integer | Version number of the feature, incremented in each Overture release where the geometry or attributes of this feature changed. |
sources | array | The array of source information for the properties of a given feature, with each entry being a source object which lists the property in JSON Pointer notation and the dataset that specific value came from. All features must have a root level source which is the default source if a specific property's source is not specified. |
filename | string | Name of the S3 file being queried. |
theme | string | Name of the Overture theme being queried. |
type | string | Name of the Overture feature type being queried. |
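As a rough illustration of how these columns fit together, the sketch below models one address row as a Python dataclass. It makes several simplifying assumptions: only a subset of columns is included, `address_levels` is treated as a plain list of strings ordered highest level first, and the `label` formatting order is a hypothetical choice, not an Overture convention. The sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Address:
    """A simplified view of one row of the addresses table (subset of columns)."""
    id: str
    country: str  # ISO 3166-1 alpha-2 code
    address_levels: list[str] = field(default_factory=list)  # highest level first
    postcode: Optional[str] = None
    street: Optional[str] = None
    number: Optional[str] = None  # a string: values like "74B" or "189 1/2" occur
    unit: Optional[str] = None

    def label(self) -> str:
        """Join the non-empty components into a single display string.
        The ordering used here is an assumption made for illustration."""
        street_line = " ".join(p for p in (self.number, self.street) if p)
        # address_levels is ordered highest first, so reverse it for display.
        parts = [street_line, self.unit, *reversed(self.address_levels),
                 self.postcode, self.country]
        return ", ".join(p for p in parts if p)


example = Address(
    id="example-id",
    country="US",
    address_levels=["MA", "Boston"],
    postcode="02108",
    street="Main Street",
    number="74B",
)
print(example.label())
```

Keeping `number` as a string rather than an integer mirrors the schema note that house numbers are not always strictly numeric.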
Data access and retrieval
Overture's data themes, including addresses, are freely available on both Amazon S3 and Microsoft Azure Blob Storage at these locations:
Provider | Location |
---|---|
Amazon S3 | s3://overturemaps-us-west-2/release/ |
Azure Blob Storage | https://overturemapswestus2.blob.core.windows.net/release/ |
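The DuckDB queries on this page read from paths under the S3 location that follow a `release/<version>/theme=<theme>/type=<type>/` layout. The helper below is a small sketch of assembling such a path. The function name `release_glob` is hypothetical, and the sketch only covers the S3 layout that appears in this page's queries.

```python
S3_ROOT = "s3://overturemaps-us-west-2/release/"


def release_glob(release: str, theme: str, feature_type: str = "*") -> str:
    """Build a glob path for one theme (and optionally one feature type)
    of a given release, matching the pattern used in the queries on this page."""
    return f"{S3_ROOT}{release}/theme={theme}/type={feature_type}/*"


# The release string below is the one used in this page's example queries.
print(release_glob("2024-09-18.0", "addresses"))
print(release_glob("2024-09-18.0", "divisions", "division_area"))
```

Centralizing the release string in one place makes it easier to move every query to a newer release at once.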
Overture distributes its datasets as GeoParquet, a column-oriented spatial data format that is a backwards-compatible extension of Apache Parquet. Parquet (and GeoParquet) is optimized for "cloud-native" queries, which means you can use many developer-friendly tools to efficiently fetch column "chunks" of cloud-hosted data. We encourage users who are new to GeoParquet to consult this guide.
The Getting Data section of this documentation offers instructions for using several tools to access Overture data, including DuckDB and Overture's Python command-line tool. See examples below for addresses.
We recommend querying and downloading only the Overture data you need. If you have a particular geographic area of interest, there are several options for using a simple bounding box to extract address data.
- DuckDB
- Python Command-line Tool
First, follow the setup guide for DuckDB.
DuckDB allows you to pass a bounding box in your query to select features in a specified geographic area.
This example returns address data for Calgary, Canada and the surrounding area:
LOAD spatial; --noqa
LOAD httpfs; --noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';
SELECT
*
FROM
read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
WHERE
bbox.xmin > -114.305
AND bbox.xmax < -113.784
AND bbox.ymin > 50.854
AND bbox.ymax < 51.219;
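The `WHERE` clause above keeps a feature only when its bounding box lies entirely inside the query box. The Python sketch below restates that predicate outside of SQL. The `BBox` type and helper names are hypothetical, and the test coordinates are approximate city-center values chosen for illustration.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box with the same field names as the bbox column."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float


def inside(feature: BBox, query: BBox) -> bool:
    """Mirror of the SQL filter: the feature's box must fall strictly within the query box."""
    return (
        feature.xmin > query.xmin
        and feature.xmax < query.xmax
        and feature.ymin > query.ymin
        and feature.ymax < query.ymax
    )


def point_bbox(lon: float, lat: float) -> BBox:
    """For a Point geometry the bounding box collapses to the point itself."""
    return BBox(xmin=lon, xmax=lon, ymin=lat, ymax=lat)


# Query box taken from the DuckDB example above.
CALGARY = BBox(xmin=-114.305, xmax=-113.784, ymin=50.854, ymax=51.219)

print(inside(point_bbox(-114.07, 51.05), CALGARY))  # a point in central Calgary (approximate)
print(inside(point_bbox(-113.49, 53.55), CALGARY))  # a point near Edmonton (approximate)
```

Because addresses are points, this containment test and an intersection test select the same features, apart from points lying exactly on the box edge.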
You can find Overture's official Python command-line tool here.
This tool helps to download Overture data within a region of interest and converts it to a few different file formats. In this example, a bounding box is passed to obtain all address data around Boston, MA:
overturemaps download --bbox=-71.068,42.353,-71.058,42.363 -f geojson --type=address -o boston.geojson
This command results in the following address points, displayed in QGIS:
The overturemaps utility currently has a single command: download. It downloads Overture Maps data, optionally limited to a bounding box, into the specified file format. When a bounding box is specified, only the minimum data is transferred. The result is streamed out, so the tool can handle arbitrarily large bounding boxes.
Command-line options:
- --bbox (optional): west, south, east, north longitude and latitude coordinates. When omitted, the entire dataset for the specified type will be downloaded.
- -f (required: one of "geojson", "geojsonseq", "geoparquet"): output format
- --output/-o (optional): Location of output file. When omitted output will be written to stdout.
- --type/-t (required): The Overture map data type to be downloaded. Examples of types are building for building footprints, place for POI places data, etc. Run overturemaps download --help for the complete list of allowed types.
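If you call the tool from a script, it can help to assemble the argument list programmatically. The sketch below builds the same command as the Boston example using only the options listed above. The `download_command` helper is hypothetical, and the sketch only constructs the command; it does not run it.

```python
import shlex
from typing import Optional

ALLOWED_FORMATS = {"geojson", "geojsonseq", "geoparquet"}


def download_command(
    feature_type: str,
    output_format: str,
    bbox: Optional[tuple[float, float, float, float]] = None,
    output: Optional[str] = None,
) -> list[str]:
    """Build an argument list for `overturemaps download`.
    bbox is (west, south, east, north), matching the --bbox option."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {output_format}")
    args = ["overturemaps", "download"]
    if bbox is not None:
        west, south, east, north = bbox
        args.append(f"--bbox={west},{south},{east},{north}")
    args += ["-f", output_format, f"--type={feature_type}"]
    if output is not None:
        args += ["-o", output]
    return args


cmd = download_command(
    "address", "geojson",
    bbox=(-71.068, 42.353, -71.058, 42.363),
    output="boston.geojson",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the tool is installed. Note that `--bbox` uses west, south, east, north order, which differs from the xmin, xmax, ymin, ymax order of the bbox column.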
This downloads data directly from Overture's S3 bucket without interacting with any other servers. By including bounding box extents on each row in the Overture distribution, the underlying Parquet readers use the Parquet summary statistics to download the minimum amount of data necessary to extract data from the desired region.
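The idea behind this statistics-based skipping can be sketched in a few lines of Python. This is a conceptual model only, not how any particular Parquet reader is implemented: the `RowGroupStats` type, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Summary statistics for the bbox columns of one chunk of rows (hypothetical model)."""
    name: str
    xmin_min: float  # smallest bbox.xmin in the group
    xmax_max: float  # largest bbox.xmax in the group
    ymin_min: float  # smallest bbox.ymin in the group
    ymax_max: float  # largest bbox.ymax in the group


def may_contain(stats: RowGroupStats, west: float, south: float,
                east: float, north: float) -> bool:
    """Return False only when the statistics prove that no row can touch the query box."""
    entirely_west = stats.xmax_max < west
    entirely_east = stats.xmin_min > east
    entirely_south = stats.ymax_max < south
    entirely_north = stats.ymin_min > north
    return not (entirely_west or entirely_east or entirely_south or entirely_north)


# Made-up statistics for two chunks of data.
groups = [
    RowGroupStats("group-a", xmin_min=-71.2, xmax_max=-70.9, ymin_min=42.2, ymax_max=42.5),
    RowGroupStats("group-b", xmin_min=-114.4, xmax_max=-113.7, ymin_min=50.8, ymax_max=51.3),
]

# Only chunks that might overlap the Boston query box need to be fetched.
to_fetch = [g.name for g in groups if may_contain(g, -71.068, 42.353, -71.058, 42.363)]
print(to_fetch)
```

The check is conservative: a chunk that passes may still contain no matching rows, so the row-level bounding box filter is still applied after the chunk is read.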
Data manipulation and analysis
Using this query, you can get a count of addresses per country:
Query
LOAD spatial; --noqa
LOAD httpfs; --noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';
SELECT
count(*),
country
FROM
read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
GROUP BY country;
This query will create a shapefile of address data in New Zealand, with limited attributes:
Query
LOAD spatial; --noqa
LOAD httpfs; --noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';
COPY (
SELECT
id,
number,
street,
unit,
postcode,
geometry -- DuckDB v.1.1.0 will autoload this as a `geometry` type
FROM
read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
WHERE
country = 'NZ'
)
TO
'NZaddresses.shp'
WITH (
FORMAT GDAL,
DRIVER 'ESRI Shapefile',
SRS 'EPSG:4326'
);
This query will create a CSV file of addresses within the State of Utah, using the divisions theme data in a spatial query:
Query
INSTALL spatial; -- noqa
LOAD spatial; -- noqa
-- Access the data on AWS in this example
SET s3_region='us-west-2';
COPY (
-- Define a common table expression (CTE) holding the geometry of the state of Utah
WITH utah AS (
SELECT
id AS utah_id,
geometry AS utah_geom -- DuckDB v.1.1.0 will autoload this as a `geometry` type
FROM
read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=divisions/type=division_area/*', filename=true, hive_partitioning=1)
WHERE
id = '085022383fffffff0167572d4665d6f9'
),
-- Use the geometry of Utah to filter addresses within the state's boundary
addresses AS (
SELECT
*,
geometry -- DuckDB v.1.1.0 will autoload this as a `geometry` type
FROM
read_parquet('s3://overturemaps-us-west-2/release/2024-09-18.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
INNER JOIN
utah
ON ST_WITHIN(geometry, utah.utah_geom)
WHERE
country = 'US'
)
-- Export the address selection to a CSV file
SELECT
id,
street,
number,
unit
FROM
addresses
)
TO
'utah_addresses.csv';
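The join condition in this query tests whether each address point falls within a polygon. As a rough illustration of what such a test involves, the sketch below implements the classic ray-casting check for a single polygon ring in plain Python. It is a simplification and not the algorithm DuckDB's spatial extension necessarily uses: it ignores holes and multipolygons, and it does not treat points exactly on the boundary in any particular way. The rectangle is a hypothetical stand-in, not a real state boundary, and the city coordinates are approximate.

```python
def point_in_ring(lon: float, lat: float, ring: list[tuple[float, float]]) -> bool:
    """Ray casting: count how many ring edges a ray going east from the point crosses.
    An odd number of crossings means the point is inside the ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Consider only edges that straddle the point's latitude.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular ring used purely for illustration.
ring = [(-114.0, 37.0), (-109.0, 37.0), (-109.0, 42.0), (-114.0, 42.0)]

print(point_in_ring(-111.89, 40.76, ring))  # approximate location of Salt Lake City
print(point_in_ring(-104.99, 39.74, ring))  # approximate location of Denver
```

In the SQL query, the `country = 'US'` filter narrows the candidate addresses before this comparatively expensive geometric test is applied to each remaining row.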
Revision history
Version info
You can find the most recent release notes here.
Support
Feedback
You can find a list of Overture repositories here.
Discussions are generally reserved for broader conversations around the addresses project as a whole (supporting a new workflow, adding a dataset, null attributes).
Issues are generally reserved for more specific concerns with specific entities in the dataset (geometry validation, missing entities, duplicate entities) or country-specific concerns.
Discussions
You can start and add to discussions in each of the public Overture repositories. Some examples:
- General Overture Discussions: https://github.com/orgs/OvertureMaps/discussions
- Data Discussions: https://github.com/OvertureMaps/data/discussions
- Schema Discussions: https://github.com/OvertureMaps/schema/discussions
Discussions around Overture's address data should be filed in the Data repository.
Issues
You can start and add to issues in each of the public Overture repositories, too. Some examples:
- Data Issues: https://github.com/OvertureMaps/data/issues
- Schema Issues: https://github.com/OvertureMaps/schema/issues
- Tiles Issues: https://github.com/OvertureMaps/overture-tiles/issues
Issues around Overture's address data should be filed in the data repository.