Refining the Meaning of "Location" Across the United States Through Building-Based Geocoding
Gaining a solid understanding of location is a prerequisite for many decision-making applications - this all starts with geocoding. Learn how Ecopia achieved the gold standard of Building-Based Geocoding.
An understanding of the coordinates related to an address (the “geocode”) is critical for many decision-making applications. This is particularly true for organizations seeking to make remote assessments of properties (e.g. insurance risk assessment, broadband network planning, real estate property analytics), or improve the efficiency of their operations (e.g. field maintenance, last-mile navigation). However, just as most consumers have had the experience of entering an address into their GPS and being led to the wrong location, organizations that rely on location data for decision-making experience this challenge in an amplified manner at scale.
This blog post will focus on the challenges of creating a national geocoding solution (linking each address to the correct building across the country), and how Ecopia AI has overcome these challenges to create and maintain the preeminent building-based geocoding solution for the United States.
The gold standard for geocoding is the ability to enter an address and receive the coordinates of the building (or, where applicable, part of the building) related to that address. This is described as “building-based geocoding”, as seen in Figure 1(a). Historically, this was not possible to achieve at a national scale because there was no comprehensive map of buildings across the United States. As an alternative, organizations have relied over the years on various proxies as best guesses of the coordinates related to addresses across the US, most commonly street-segment interpolation or parcel centroids. Figure 1(b) shows examples of these two methods: the red oval highlights street-segment geocoding, and the orange oval highlights parcel centroid geocoding.
- Street-segments: Geocodes are distributed evenly across the length of the road based on the address numbers related to that street (e.g. the geocode for address 50 is placed in the middle of a road segment that has addresses 1 to 100). Street-segment geocoding was first commonly used for vehicle navigation: it gets the driver close to the location of interest and relies on their judgement to independently navigate the last mile. This method often results in coordinates that are on the wrong parcel or down the street.
- Parcel centroids: Geocodes are placed in the center of the parcel related to each address. With the digitization of parcel data, this method became popular as a proxy for building location and was widely adopted as a standard for geocoding over the past decade.
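The two proxy methods above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: it assumes a straight road segment, planar coordinates, and a simple (non-self-intersecting) parcel polygon, and the function names `street_segment_geocode` and `polygon_centroid` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def street_segment_geocode(house_number: int, low: int, high: int,
                           start: Point, end: Point) -> Point:
    """Linearly interpolate a house number along a straight road segment.

    `low` and `high` are the address range of the segment, `start` and
    `end` are its endpoints (assumed planar coordinates).
    """
    if high == low:
        t = 0.5  # single-address segment: fall back to the midpoint
    else:
        t = (house_number - low) / (high - low)
    t = min(max(t, 0.0), 1.0)  # clamp numbers outside the range
    return (start[0] + t * (end[0] - start[0]),
            start[1] + t * (end[1] - start[1]))


def polygon_centroid(ring: List[Point]) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon")
    return (cx / (3.0 * area2), cy / (3.0 * area2))


# Address 50 on a segment numbered 1 to 100 lands near the middle of the road.
on_street = street_segment_geocode(50, 1, 100, (0.0, 0.0), (99.0, 0.0))

# A 40 x 100 parcel: the centroid is the middle of the lot,
# regardless of where the building actually sits on it.
parcel = [(0.0, 0.0), (40.0, 0.0), (40.0, 100.0), (0.0, 100.0)]
mid_lot = polygon_centroid(parcel)
```

Neither function looks at a building at any point, which is why both methods can return a coordinate on the street or on an empty part of the lot.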
While many geocoding vendors claim to have “rooftop” or “building-based” geocoding, many of these databases still rely on parcel-based or street segment information, and have not been validated against real-world information (such as a comprehensive map of buildings). By leveraging Ecopia’s building footprints (the most comprehensive map of buildings in the United States), we were able to confirm that many third-party geocodes fall on the street, the wrong building, or in the wrong parcel.
Over the past months, thorough testing was conducted comparing third-party “industry-leading” geocoding services to those of Ecopia. The test results illustrated shortcomings of third-party claims surrounding “rooftop” or “building-based” accuracy which, as revealed by our tests, are often actually based on parcel-centroids or even street segments.
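A validation of this kind can be sketched as a point-in-polygon check of each geocode against the building footprints and the parcel. The sketch below is an illustration under stated assumptions, not the test procedure Ecopia used: coordinates are planar, polygons are simple rings without holes, and `point_in_polygon` and `classify_geocode` are hypothetical names.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def point_in_polygon(pt: Point, ring: Ring) -> bool:
    """Ray-casting test: count how many edges a ray to the right crosses."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def classify_geocode(pt: Point, expected_building_id: str,
                     buildings: Dict[str, Ring], parcel: Ring) -> str:
    """Label a geocode by where it falls relative to footprints and parcel."""
    for building_id, ring in buildings.items():
        if point_in_polygon(pt, ring):
            if building_id == expected_building_id:
                return "correct_building"
            return "wrong_building"
    if point_in_polygon(pt, parcel):
        return "in_parcel_off_building"
    return "outside_parcel"  # e.g. on the street


# Hypothetical test data: one parcel, its building, and a neighbour's building.
parcel = [(0, 0), (40, 0), (40, 100), (0, 100)]
buildings = {
    "b1": [(5, 70), (35, 70), (35, 95), (5, 95)],
    "b2": [(45, 70), (75, 70), (75, 95), (45, 95)],
}
for candidate in [(20, 80), (20, 50), (60, 80), (20, -5)]:
    print(candidate, classify_geocode(candidate, "b1", buildings, parcel))
```

Aggregating these labels over a sample of addresses gives the share of a vendor's geocodes that actually land on the intended building.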
[Graphic: average accuracy of third-party geocoders compared with accuracy of Ecopia's geocoder]
These geocoding inaccuracies result in categorical underwriting errors. For example, Figure 2(a) shows how a third-party parcel centroid geocoder could result in a “safe” flood zone determination, while the actual buildings at those addresses are well within the flood zone – resulting in underpricing. Conversely, Figure 2(b) shows how a third-party parcel centroid geocoder could result in an “unsafe” flood zone determination while the actual building is well outside of the flood zone – resulting in overpricing.
Unfortunately, this is not an isolated problem. Ecopia conducted a country-wide analysis comparing the flood risk rating that would be produced by a third-party parcel centroid geocode relative to that produced by a building-based geocode from Ecopia. The results were stark: over 1,000,000 buildings across the US would be underpriced for flood risk due to parcel-based geocoding, and an additional 600,000+ properties would be overpriced. When extrapolating based on the average flood claim reported by FEMA over the past 10 years, the underpricing presents up to $43B of risk that is unaccounted for. Figure 3 shows a heat map of where this risk resides across the USA. For carriers, this could translate into millions or even billions in potential claims across their portfolios – due in part to inaccurate geocoding data. Below we will talk about the challenges surrounding high-precision building-based geocoding, and how Ecopia is able to offer a uniquely accurate solution across the US.
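The underpricing and overpricing cases described above reduce to a comparison of two flood zone determinations per property: one from the geocode and one from the building itself. The sketch below assumes those determinations are already available as booleans; the names `pricing_outcome`, `tally`, and `unaccounted_exposure` are hypothetical, and the sample records are made up for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def pricing_outcome(geocode_in_flood_zone: bool,
                    building_in_flood_zone: bool) -> str:
    """Compare the geocode's flood determination with the building's."""
    if building_in_flood_zone and not geocode_in_flood_zone:
        return "underpriced"  # geocode looks safe, building is not
    if geocode_in_flood_zone and not building_in_flood_zone:
        return "overpriced"   # geocode looks at risk, building is not
    return "consistent"


def tally(records: Iterable[Tuple[bool, bool]]) -> Counter:
    """Count outcomes over (geocode_in_zone, building_in_zone) pairs."""
    return Counter(pricing_outcome(g, b) for g, b in records)


def unaccounted_exposure(underpriced_buildings: int,
                         average_claim_usd: float) -> float:
    """Extrapolate exposure as count times an assumed average claim."""
    return underpriced_buildings * average_claim_usd


sample = [(False, True), (True, False), (True, True), (False, False)]
print(tally(sample))

# Arithmetic on the post's own round figures, not a FEMA statistic:
# $43B spread over 1,000,000 buildings implies about $43,000 per building.
implied_average_claim = 43e9 / 1_000_000
print(implied_average_claim)
```

Because the post gives "over 1,000,000" buildings and "up to $43B", the implied per-building figure is only a rough bound.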
Challenges of high-precision building-based geocoding
The process of creating high-precision building-based geocoding requires overcoming the following three substantial challenges:
- Sourcing a comprehensive database of building footprints: A high-precision building-based geocoding solution cannot be created without a foundation of comprehensive, accurate, and up-to-date building footprints. Unfortunately, many third-party building footprint databases are not suitable for this purpose due to a combination of data collection methods such as scraping fragmented open data, or using naïve AI algorithms to mine open-source imagery without any quality assurance. These methods result in building footprints that are incomplete, of inconsistent accuracy, and often out-of-date. Figure 4(a) below shows an example comparing Ecopia’s building footprints (green) to building footprints from a third-party provider (red) leveraging open source imagery (outdated, lower-resolution), inhibiting their ability to offer comprehensive coverage.
- Linking addresses and footprints: Assuming a reliable foundation of building footprints can be sourced, there are then many complex address-to-building linking relationships that need to be properly handled to ensure that the right addresses are assigned to the right building across the country. The address details first need to be parsed correctly (i.e. street name and number, city, state, and zip code), and then those addresses must be associated with the correct building footprints. Two challenging examples are highlighted in Figure 4(b) below: many buildings and many addresses within one parcel (highlighted by the blue circle), and one large building that spans multiple parcels, each containing a different address (highlighted by the orange circle).
- Updating the data: Maintaining high-precision building-based geocoding first requires the ability to regularly update building footprints to capture new construction, demolition, and modified buildings at a national scale. Most third-party vendors cannot offer stable updating because the building footprints were either: scraped from municipal sources (reliant on unstable budgeting, image capture, and production cycles of each municipality); or mined from open-source imagery (reliant on image providers who predominantly offer outdated reduced-quality imagery to minimize revenue cannibalization risk). Figure 4(c) shows an example comparing Ecopia’s building footprints (green) to out-of-date building footprints (red) from a third-party provider that leverages openly available imagery.
In addition to updated building footprints, the most up-to-date addresses must be sourced to accurately reflect any changes in addresses from quarter-to-quarter. These addresses must be accurately matched against the previous database, which is non-trivial when considering hundreds of millions of addresses including alias addresses (alternative address names for a primary address). The validity of each address change then needs to be carefully evaluated against factors such as building footprints and other context to determine whether to include it as a new address, update an existing address, or remove the address from the database.
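The release-to-release matching described above can be sketched as a diff over canonical address keys. This is a simplified illustration, not Ecopia's process: it assumes each release is a mapping from address string to a building identifier, that aliases are supplied as a lookup from normalized alias to normalized primary address, and the names `normalize`, `canonical`, and `diff_releases` are hypothetical. The output is a list of candidates only; each would still need to be checked against footprints and other context.

```python
from typing import Dict, List


def normalize(address: str) -> str:
    """Uppercase, drop commas, and collapse whitespace."""
    return " ".join(address.upper().replace(",", " ").split())


def canonical(address: str, aliases: Dict[str, str]) -> str:
    """Map an address to its primary form if it is a known alias."""
    key = normalize(address)
    return aliases.get(key, key)


def diff_releases(previous: Dict[str, str], current: Dict[str, str],
                  aliases: Dict[str, str]) -> Dict[str, List[str]]:
    """Return candidate additions, removals, and updates between releases."""
    prev = {canonical(a, aliases): b for a, b in previous.items()}
    curr = {canonical(a, aliases): b for a, b in current.items()}
    return {
        "add": sorted(curr.keys() - prev.keys()),
        "remove": sorted(prev.keys() - curr.keys()),
        "update": sorted(a for a in curr.keys() & prev.keys()
                         if curr[a] != prev[a]),
    }


# Hypothetical data: "14 Old Mill Rd" is an alias of "14 Main St".
previous = {"12 Main St": "b1", "14 Main St": "b2",
            "16 Main St": "b3", "20 Main St": "b5"}
current = {"12 MAIN ST": "b1", "14 Old Mill Rd": "b2",
           "16 Main St": "b9", "18 Main St": "b4"}
aliases = {"14 OLD MILL RD": "14 MAIN ST"}
print(diff_releases(previous, current, aliases))
```

Resolving aliases before diffing keeps a renamed address from showing up as one removal plus one spurious addition.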
The solution: Ecopia's Building-Based Geocoding - the most comprehensive, accurate, up-to-date building-based geocoding solution for the United States
Ecopia has nearly a decade of experience leveraging artificial intelligence and geospatial imagery to create and maintain geospatial data products for use by governments and large corporations around the world. Based on this expertise, we embarked on a journey to create the most comprehensive, accurate, and up-to-date building-based geocoding solution for the United States (resulting in the Ecopia product: “Building-Based Geocoding”). Specifically, Ecopia overcame the previously mentioned three challenges as follows:
- Ecopia created & maintains the most comprehensive, accurate and up-to-date building footprints as a foundational layer: Ecopia's Building-Based Geocoding is built on the foundation of our proprietary building footprint database comprised of 176M+ building footprints - including every structure greater than 100 square feet across 100% of the contiguous United States. This dataset was generated directly from high-resolution aerial and satellite imagery, is updated annually, and comes with contractually guaranteed 95%+ accuracy specifications. For more information on how Ecopia created the most comprehensive map of building footprints in the USA, please see our recent blog post here.
- Ecopia’s proprietary geocoding engine is leveraged to assign the right address to the right building at a country-scale: To tackle the complex task of assigning addresses to the right building footprints at a country-scale, Ecopia built a proprietary geocoding engine specifically for this task. This proprietary process uses a unique machine-learning based address parsing system to match each address to the correct building at scale. This process allows the effective matching of 270M+ addresses (primary + secondary) to 176M+ building footprints across the US – resulting in the most comprehensive building-based geocoding dataset available today.
- Ecopia’s partner network and AI systems are leveraged to update the database annually: Lastly, Ecopia leverages partnerships with leading geospatial imagery providers to source fresh high-resolution geospatial imagery of the US each year. This imagery is mined with Ecopia’s AI-based systems, outputting an annual update to the building footprints. These updated building footprints are then further enhanced with the most up-to-date address data available through the use of Ecopia’s proprietary geocoding engine.
Figure 5 provides an illustrative overview of the process of creating Ecopia’s Building-Based Geocoding product.
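The parse-then-link step in that process can be illustrated with a toy example. Ecopia's engine is proprietary and machine-learning based; the sketch below swaps in a plain regular expression as a stand-in and only shows the shape of the data: a parsed address with the components named earlier (street number and name, city, state, zip code) and a link table in which one building can carry several primary and secondary addresses. `ParsedAddress`, `parse_address`, and `build_links` are hypothetical names, and the building assignments are assumed to be given.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class ParsedAddress(NamedTuple):
    number: str
    street: str
    unit: Optional[str]  # set for secondary addresses (apartments, suites)
    city: str
    state: str
    zip_code: str


# Deliberately simple pattern for "<number> <street> [Unit x], <city>, <ST> <zip>".
_PATTERN = re.compile(
    r"^\s*(?P<number>\d+[A-Za-z]?)\s+(?P<street>[^,]+?)"
    r"(?:\s+(?:Apt|Unit|Suite|Ste|#)\s*(?P<unit>[\w-]+))?\s*,"
    r"\s*(?P<city>[^,]+?)\s*,\s*(?P<state>[A-Za-z]{2})\s+"
    r"(?P<zip>\d{5})(?:-\d{4})?\s*$",
    re.IGNORECASE,
)


def parse_address(raw: str) -> Optional[ParsedAddress]:
    """Split a one-line US address into components, or return None."""
    m = _PATTERN.match(raw)
    if m is None:
        return None
    return ParsedAddress(m.group("number"), m.group("street"),
                         m.group("unit"), m.group("city"),
                         m.group("state").upper(), m.group("zip"))


def build_links(
    assignments: Iterable[Tuple[str, str]]
) -> Dict[str, List[ParsedAddress]]:
    """Group parsed addresses by the building id they were assigned to."""
    by_building: Dict[str, List[ParsedAddress]] = defaultdict(list)
    for raw, building_id in assignments:
        parsed = parse_address(raw)
        if parsed is not None:  # unparseable input is skipped in this sketch
            by_building[building_id].append(parsed)
    return dict(by_building)


links = build_links([
    ("100 Main St, Springfield, IL 62701", "b1"),
    ("100 Main St Unit 4, Springfield, IL 62701", "b1"),
])
print(links["b1"])
```

A fixed pattern like this breaks on many real-world address formats, which is one reason a production system would need something more robust than a regular expression.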
Gaining a solid understanding of location is a prerequisite for many decision-making applications – this all starts with geocoding. We’ve learned that creating and maintaining a national building-based geocoding solution is a non-trivial task, and that true “building-based” accuracy cannot be achieved without a comprehensive, accurate, up-to-date map of building footprints.
Reach out to our team to learn more about our preeminent Building-Based Geocoding solution for the United States, and how it’s being swiftly adopted across industries such as insurance and telecommunications.