Addresses

Overview

Note: This theme is currently in Alpha and we anticipate significant changes over the next several releases. We invite the Overture community to test the addresses schema and data and offer feedback via the data and schema repos.

Overture maintains just over 205 million address point entities. The address theme is derived from a variety of sources, mostly distributed through OpenAddresses.

For licensing information, please see the attribution page.

Address data can be used for a variety of purposes, which can include:

  • Mapping: Addresses may be displayed on the map for reference purposes.
  • Geocoding: Addresses are a primary component of high-accuracy geocoding services (i.e. converting text for an address to a complete address with a location).
  • Conflation: Addresses can be used to conflate to other data themes (e.g. places, buildings) where appropriate for mapping or other use cases (e.g. refining search).
  • Standardization: Parsing an input address into address components based on an existing schema or address model.
  • Normalization: Adhering to standard and consistent forms of address components.
  • Validation and Verification: Confirming an address exists within a known list of addresses.

Figure: Overture address coverage. Address data is styled by country; countries with address coverage are shown in color.
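
To make the normalization and validation use cases above more concrete, here is a minimal Python sketch. It is not part of Overture's tooling: the abbreviation table, the `normalize_street`, `address_key`, and `is_known` helpers, and the sample address are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical abbreviation table for illustration only; a real system would
# use a much larger, locale-specific list.
STREET_SUFFIXES = {"st": "Street", "ave": "Avenue", "rd": "Road", "dr": "Drive"}


def normalize_street(street: str) -> str:
    """Collapse whitespace, capitalize each word, expand a trailing abbreviated suffix."""
    words = re.sub(r"\s+", " ", street.strip()).split(" ")
    if words == [""]:
        return ""
    last = words[-1].rstrip(".").lower()
    head = [w.capitalize() for w in words[:-1]]
    tail = STREET_SUFFIXES.get(last, words[-1].capitalize())
    return " ".join(head + [tail])


def address_key(number: str, street: str, postcode: str) -> tuple[str, str, str]:
    """Build a comparison key from normalized components."""
    return (number.strip().upper(), normalize_street(street), postcode.strip().upper())


# A tiny stand-in for a known address list (made-up values).
KNOWN = {address_key("74B", "Main Street", "T2P 1J9")}


def is_known(number: str, street: str, postcode: str) -> bool:
    """Validation in the sense used above: is the address in the known list?"""
    return address_key(number, street, postcode) in KNOWN


print(normalize_street("  main   st. "))      # Main Street
print(is_known("74b", "MAIN ST", "t2p 1j9"))  # True
```

Exact matching on a normalized key is the simplest possible design. Production geocoders usually add fuzzy matching and locale-aware rules, which this sketch leaves out.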

Dataset description

Feature type descriptions

An address is a feature type that represents a physical place through a series of attributes: street number, street name, unit, address_levels, postcode, and/or country. Each address also has a Point geometry, which provides an approximate location of the position most commonly associated with the feature. We encourage you to consult the schema reference documentation for the address feature type.

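Counts, per country, of the address feature type

| Country | Address Count |
| --- | --- |
| US | 78,078,341 |
| MX | 30,278,896 |
| FR | 26,041,499 |
| CA | 16,457,790 |
| AU | 15,608,317 |
| NL | 9,750,341 |
| CO | 7,786,046 |
| PT | 5,911,139 |
| DK | 3,920,253 |
| CH | 3,147,964 |
| AT | 2,492,155 |
| NZ | 2,351,461 |
| EE | 2,210,423 |
| LT | 1,036,251 |
| LU | 164,415 |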
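
As a quick arithmetic check, the per-country counts above can be summed and compared with the "just over 205 million" figure quoted in the overview. The sketch below simply copies the numbers from the table; the variable names are illustrative.

```python
# Per-country counts copied from the table above.
ADDRESS_COUNTS = {
    "US": 78_078_341,
    "MX": 30_278_896,
    "FR": 26_041_499,
    "CA": 16_457_790,
    "AU": 15_608_317,
    "NL": 9_750_341,
    "CO": 7_786_046,
    "PT": 5_911_139,
    "DK": 3_920_253,
    "CH": 3_147_964,
    "AT": 2_492_155,
    "NZ": 2_351_461,
    "EE": 2_210_423,
    "LT": 1_036_251,
    "LU": 164_415,
}

total = sum(ADDRESS_COUNTS.values())
print(f"total addresses: {total:,}")  # total addresses: 205,235,291

# Share of the total for the three largest countries.
largest = sorted(ADDRESS_COUNTS.items(), key=lambda kv: kv[1], reverse=True)[:3]
for country, count in largest:
    print(f"{country}: {count / total:.1%}")
```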

Data columns

The addresses GeoParquet file contains the following properties:

Schema for the GeoParquet files in the addresses theme

| Property Name | Type | Description |
| --- | --- | --- |
| id | string | A feature ID. This may be an ID associated with the Global Entity Reference System (GERS) if and only if the feature represents an entity that is part of GERS. |
| geometry | blob | A WKB representation of the entity's geometry - a Point, Polygon, MultiPolygon, or LineString. |
| bbox | array | The bounding box of an entity's geometry, represented with float values, in a xmin, xmax, ymin, ymax format. |
| country | string | ISO 3166-1 alpha-2 country code of the country or country-like entity, that this address represents or belongs to. |
| postcode | string | The postcode for the address. |
| street | string | The street name associated with this address. The street name can include the street "type" or street suffix, e.g., Main Street. Ideally this is fully spelled out and not abbreviated but we acknowledge that many address datasets abbreviate the street name so it is acceptable. |
| number | string | The house number for this address. This field may not strictly be a number. Values such as "74B", "189 1/2", "208.5" are common as the number part of an address and they are not part of the "unit" of this address. |
| unit | string | The suite/unit/apartment/floor number. |
| address_levels | array | The administrative levels present in an address. The number of values in this list and their meaning is country-dependent. For example, in the United States we expect two values: the state and the municipality. In other countries there might be only one. Other countries could have three or more. The array is ordered with the highest levels first. |
| version | integer | Version number of the feature, incremented in each Overture release where the geometry or attributes of this feature changed. |
| sources | array | The array of source information for the properties of a given feature, with each entry being a source object which lists the property in JSON Pointer notation and the dataset that specific value came from. All features must have a root level source which is the default source if a specific property's source is not specified. |
| filename | string | Name of the S3 file being queried. |
| theme | string | Name of the Overture theme being queried. |
| type | string | Name of the Overture feature type being queried. |
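
The `sources` and `address_levels` descriptions above are easier to follow with a concrete record. The Python sketch below uses an invented record: the values, the dataset names, the `property` and `dataset` keys on each source entry, and the representation of `address_levels` as a plain list of strings are all assumptions made for illustration, so check the schema reference for the exact encoding. The only fact relied on is that in JSON Pointer notation the empty string refers to the whole document, which is used here for the root-level source.

```python
# A made-up record shaped like the columns described above. The values, the
# dataset names, and the exact layout of each source entry are assumptions.
record = {
    "id": "example-id",
    "country": "US",
    "postcode": "84101",
    "street": "Main Street",
    "number": "189 1/2",  # not strictly numeric, as the schema allows
    "unit": None,
    "address_levels": ["UT", "Salt Lake City"],  # highest level first
    "sources": [
        {"property": "", "dataset": "ExampleSourceA"},           # root-level default
        {"property": "/postcode", "dataset": "ExampleSourceB"},  # override for one property
    ],
}


def dataset_for(rec: dict, pointer: str) -> str:
    """Return the dataset for a JSON Pointer, falling back to the root-level source."""
    by_pointer = {s["property"]: s["dataset"] for s in rec["sources"]}
    return by_pointer.get(pointer, by_pointer[""])


def us_levels(rec: dict) -> dict:
    """Interpret address_levels for a US record: state first, then municipality."""
    state, municipality = rec["address_levels"]
    return {"state": state, "municipality": municipality}


print(dataset_for(record, "/postcode"))  # ExampleSourceB
print(dataset_for(record, "/street"))    # ExampleSourceA (root-level default)
print(us_levels(record))
```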

Data access and retrieval

Overture's data themes, including addresses, are freely available on both Amazon S3 and Microsoft Azure Blob Storage at these locations:

| Provider | Location |
| --- | --- |
| Amazon S3 | s3://overturemaps-us-west-2/release/ |
| Azure Blob Storage | https://overturemapswestus2.blob.core.windows.net/release/ |

Overture distributes its datasets as GeoParquet, a column-oriented spatial data format that is a backwards-compatible extension of Apache Parquet. Parquet (and GeoParquet) is optimized for "cloud-native" queries, which means you can use many developer-friendly tools to efficiently fetch column "chunks" of cloud-hosted data. We encourage users who are new to GeoParquet to consult this guide.

The Getting Data section of this documentation offers instructions for using several tools to access Overture data, including DuckDB and Overture's Python command-line tool. See examples below for addresses.

We recommend querying and downloading only the Overture data you need. If you have a particular geographic area of interest, there are several options for using a simple bounding box to extract address data.

First, follow the setup guide for DuckDB.

DuckDB allows you to pass a bounding box in your query to select features in a specified geographic area.

This example returns address data for Calgary, Canada and the surrounding area:

LOAD spatial;
LOAD httpfs;
-- Access the data on AWS in this example
SET s3_region='us-west-2';

SELECT
    *
FROM
    read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
WHERE
    bbox.xmin > -114.305
    AND bbox.xmax < -113.784
    AND bbox.ymin > 50.854
    AND bbox.ymax < 51.219;
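
The WHERE clause in the query above keeps a feature only when its whole bounding box falls strictly inside the search area. The Python sketch below restates that predicate outside of SQL. The `BBox` class and helper names are hypothetical, and the two test coordinates are rough city-center locations chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    xmin: float
    xmax: float
    ymin: float
    ymax: float


# Same bounds as the WHERE clause in the query above.
CALGARY = BBox(xmin=-114.305, xmax=-113.784, ymin=50.854, ymax=51.219)


def strictly_within(feature: BBox, area: BBox) -> bool:
    """Mirror the SQL predicate: every edge of the feature's bbox lies inside the area."""
    return (
        feature.xmin > area.xmin
        and feature.xmax < area.xmax
        and feature.ymin > area.ymin
        and feature.ymax < area.ymax
    )


def point_bbox(lon: float, lat: float) -> BBox:
    """A point's bounding box is degenerate: min and max coincide."""
    return BBox(lon, lon, lat, lat)


print(strictly_within(point_bbox(-114.07, 51.05), CALGARY))  # True (central Calgary, roughly)
print(strictly_within(point_bbox(-113.49, 53.55), CALGARY))  # False (Edmonton, roughly)
```

Because addresses are points, comparing the bbox columns is equivalent to testing the point itself, and it lets the Parquet reader skip row groups without decoding any geometry.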

Data manipulation and analysis

Using this query, you can get a count of addresses per country:

LOAD spatial;
LOAD httpfs;
-- Access the data on AWS in this example
SET s3_region='us-west-2';

SELECT
    count(*),
    country
FROM
    read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
GROUP BY country;

This query will create a shapefile of address data in New Zealand, with limited attributes:

LOAD spatial;
LOAD httpfs;
-- Access the data on AWS in this example
SET s3_region='us-west-2';

COPY (
    SELECT
        id,
        number,
        street,
        unit,
        postcode,
        ST_GeomFromWKB(geometry) AS geometry
    FROM
        read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
    WHERE
        country = 'NZ'
)
TO
    'NZaddresses.shp'
WITH (
    FORMAT GDAL,
    DRIVER 'ESRI Shapefile',
    SRS 'EPSG:4326'
);

This query will create a CSV file of addresses within the State of Utah, using the divisions theme data in a spatial query:

INSTALL spatial;
LOAD spatial;

-- Access the data on AWS in this example
SET s3_region='us-west-2';

COPY (
    -- Define a CTE holding the division area for the State of Utah
    WITH utah AS (
        SELECT
            id AS utah_id,
            ST_GeomFromWKB(geometry) AS utah_geom
        FROM
            read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=divisions/type=division_area/*', filename=true, hive_partitioning=1)
        WHERE
            id = '085022383fffffff0167572d4665d6f9'
    ),

    -- Use the geometry of Utah to filter addresses within the state's boundary
    addresses AS (
        SELECT
            *,
            ST_GeomFromWKB(geometry) AS geometry
        FROM
            read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=addresses/type=*/*', filename=true, hive_partitioning=1)
        INNER JOIN
            utah
            ON ST_Within(ST_GeomFromWKB(geometry), utah.utah_geom)
        WHERE
            country = 'US'
    )

    -- Export the selected addresses to a CSV file
    SELECT
        id,
        street,
        number,
        unit
    FROM
        addresses
)
TO
    'utah_addresses.csv';
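
The join in the query above keeps an address only when its point lies within the Utah polygon. As a rough illustration of what a point-in-polygon test does, the Python sketch below uses the classic ray-casting approach. This is not how DuckDB implements `ST_Within`, and the `UTAH_OUTLINE` ring is a crude hand-drawn approximation of the state, not the Overture division geometry. The sketch also ignores holes and points that fall exactly on the boundary.

```python
def point_in_polygon(lon: float, lat: float, ring: list[tuple[float, float]]) -> bool:
    """Ray casting: count how often a ray heading east from the point crosses the ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical, heavily simplified outline (lon, lat): a rectangle with the
# northeast corner notched out. Not suitable for real analysis.
UTAH_OUTLINE = [
    (-114.05, 37.0),
    (-109.05, 37.0),
    (-109.05, 41.0),
    (-111.05, 41.0),
    (-111.05, 42.0),
    (-114.05, 42.0),
]

print(point_in_polygon(-111.89, 40.76, UTAH_OUTLINE))  # True (near Salt Lake City)
print(point_in_polygon(-110.00, 41.50, UTAH_OUTLINE))  # False (inside the notch)
print(point_in_polygon(-104.99, 39.74, UTAH_OUTLINE))  # False (near Denver)
```

In the SQL version, the `country = 'US'` filter plays a similar role to a cheap pre-check: it narrows the candidate rows before the more expensive geometric test runs.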

Revision history

Version info

You can find the most recent release notes here.

Support

Feedback

You can find a list of Overture repositories here.

Discussions are generally reserved for broader conversations around the addresses project as a whole (supporting a new workflow, adding a dataset, null attributes).

Issues are generally reserved for more specific concerns with specific entities in the dataset (geometry validation, missing entities, duplicate entities) or country-specific concerns.

Discussions

You can start and add to discussions in each of the public Overture repositories. Some examples:

Discussions around Overture's address data should be filed in the data repository.

Issues

You can start and add to issues in each of the public Overture repositories, too. Some examples:

Issues around Overture's address data should be filed in the data repository.