Skip to main content

Global Entity Reference System

Overview of GERS

Overture's Global Entity Reference System (GERS) is a framework for structuring, encoding and matching map data to a shared universal reference. Overture is building this common ID system to solve the difficult problem of integrating geospatial datasets; to make it easier to share and sell map data; and to enable cooperation among companies, non-profits, government agencies and other organizations.

GERS provides a mechanism to conflate datasets, matching one or more features via a unique GERS ID. For example, two geospatial datasets each containing a polygon that represents the footprint of the Empire State Building can be easily matched because both polygons will contain the same GERS ID. The resulting dataset will have a single polygon feature, with one GERS ID, that represents the real-world entity known as "New York's Empire State Building."

What Is An Entity?

Overture defines an entity as a physical thing or concept in the real world, represented as a feature in a geospatial dataset. It could be a segment of road, a city boundary, a grocery store, a building or a park. Overture assigns unique IDs, called GERS IDs, to these features. In most cases it's helpful to think of an entity and a feature as the same thing, but in practice it can be more complicated. An entity in the world could be represented by multiple features in a dataset, and a feature in a dataset might be a representation of multiple entities in the world. For example, a school building and its entrances and exits might be considered a single entity in the world but could be represented as multiple features in a dataset, each feature with its own GERS ID.

GERS ID

GERS IDs have the following structure:

H3 index + data theme identifier + 56-bit value dependent on theme

Here is an example query that outputs a short list of GERS IDs from the buildings theme in Overture's January 2024 data release.

D SET s3_region='us-west-2';
D SELECT id FROM read_parquet('s3://overturemaps-us-west-2/release/2024-01-17-alpha.0/theme=buildings/type=*/*', filename=true, hive_partitioning=1) limit 5;
┌──────────────────────────────────┐
│ id │
varchar
├──────────────────────────────────┤
08ba0085884a4fff0200ab2e3ad3f37c │
08ba00858a164fff020008cf146d28e6 │
08ba00858a16cfff0200271c60f0a5ea │
08ba00858a161fff02008633c8f27bf7 │
08ba00858a163fff0200bf648279e290 │
├──────────────────────────────────┤
5 rows
└──────────────────────────────────┘

Generally, Overture discourages users from attempting to glean information about a feature from its GERS ID. The ID should be treated as a unique identifier and nothing more. It is worth going into more detail about Overture's use of the hexagonal hierarchical geospatial indexing system (H3) in GERS. H3 is used primarily to prevent hashing conflicts among features with similar attributes. The H3 index component of a GERS ID reflects a feature's location in the H3 grid system when Overture first assigns it an ID. Different data themes use different H3 resolutions. Because GERS IDs are intended to be stable over time, Overture will not generate a new ID for a feature if its moves within the H3 cell it was originally assigned.

  • a single road segment is bisected by a new road and becomes two road segments: 1 GERS ID → 2 new GERS IDs
  • one large building footprint on the map is determined to be four smaller buildings when a higher resolution dataset becomes available: 1 GERS ID → 4 new GERS IDs
  • a building is shifted 10m west when higher resolution imagery is made available: GERS ID is preserved for that feature

Stability and Traceability

GERS IDs are intended to be stable (within a reasonable tolerance of error) across multiple versions of Overture data. When stability is not possible, Overture will attempt to provide traceability. For example:

  • a single road segment is bisected by a new road and becomes two road segments: 1 GERS ID → 2 new GERS IDs
  • a large building footprint is determined to be four smaller buildings when a higher resolution dataset becomes available: 1 GERS ID → 4 new GERS IDs
  • a building is shifted 10 meters to the west when higher resolution imagery is made available: GERS ID is preserved for that feature

Using GERS

Currently, Overture's partners are using GERS in two ways:

  1. Contributing data. The Overture engineering team generates new GERS IDs or matches existing IDs for contributed datasets as they relate to current Overture datasets. The contributed data becomes part of the Overture corpus.
  2. Associating data. Users independently conflate their own data with a current Overture dataset, identifying matches among features in both datasets. Only the features in the associated dataset that match an existing feature in the Overture data corpus can be assigned a GERS ID. The associated data will not become part of the Overture corpus, but it will become GERS-enabled data, ready to be matched to any of the available datasets in the Overture data ecosystem.

See the scenarios section of this documentation for more detailed examples of GERS use cases. Feedback on GERS is welcome on GitHub.

Licensing

Overture's data themes and schema are licensed under CC-BY 4.0. GERS IDs are licensed under CDLA Permissive v 2.0. Data that are associated with the themes through GERS can carry different licenses. However, if the data associated though GERS with an ODbL licensed theme constitutes a derivative dataset, it must be licensed under ODbL. See OpenStreetMap Community Guidelines for definitions of a derivative database.