Cville Releases Master Address Point Dataset

Master Address Points

This past October, the City of Charlottesville released the Master Address Points dataset. This may be the most significant dataset released to the public since the start of the Charlottesville Open Data Portal and will be an invaluable resource to any users of city data.

The importance of address information cannot be overstated. According to the Federal Geographic Data Committee, “Street addresses are the location identifiers most widely used by state and local government and the public. Street addresses are critical information for administrative, emergency response, research, marketing, mapping, GIS, routing and navigation, and many other purposes.” Going even further, Charles Prescott says that “Without an address, you do not exist.”

The Charlottesville Open Data Portal notes that the Master Address Table, which has been available since May 2017, is a “comprehensive set of standardized addresses for the City of Charlottesville” listing detailed, authoritative address information. While the table provides valuable information to “municipalities, residences, businesses, and application developers,” it lacked the latitude and longitude information necessary to analyze and map the addresses.

Master Address Points is a geospatial version of the Master Address Table. The Master Address Points dataset allows the user to relate data to other Charlottesville open datasets using either the address as a key, i.e., a field with common values used to link datasets, or the address point location. Use of the address point location greatly increases the power of the dataset. Because it includes the location, the Master Address Points dataset can be combined with other spatial datasets in spatial joins to find all addresses within an area, like a neighborhood, or within a specified distance of other features, such as fire stations, schools, or rivers. In addition, the location can be used to identify the closest address to any other feature, such as a police station.
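As a rough illustration of the distance-based queries described above, here is a minimal Python sketch. It assumes each address point is a record with latitude and longitude; the field names (address, lat, lon), the sample addresses, and the coordinates are all made up for illustration and are not taken from the city's dataset. A GIS package would normally do this work; the haversine formula is used here only to keep the sketch self-contained.

```python
import math

def haversine_feet(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two latitude/longitude pairs."""
    radius_feet = 20_902_231  # mean Earth radius (about 6,371 km) expressed in feet
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_feet * math.asin(math.sqrt(a))

def addresses_within(points, lat, lon, max_feet):
    """All address points no farther than max_feet from the given location."""
    return [p for p in points if haversine_feet(p["lat"], p["lon"], lat, lon) <= max_feet]

def nearest_address(points, lat, lon):
    """The single address point closest to the given location."""
    return min(points, key=lambda p: haversine_feet(p["lat"], p["lon"], lat, lon))

# Hypothetical sample records; not real Charlottesville addresses.
address_points = [
    {"address": "100 EXAMPLE ST", "lat": 38.0300, "lon": -78.4800},
    {"address": "200 EXAMPLE ST", "lat": 38.0310, "lon": -78.4800},
    {"address": "900 SAMPLE AVE", "lat": 38.0500, "lon": -78.5000},
]
station_lat, station_lon = 38.0301, -78.4800  # hypothetical fire station location

nearby = addresses_within(address_points, station_lat, station_lon, max_feet=500)
closest = nearest_address(address_points, station_lat, station_lon)
print([p["address"] for p in nearby], closest["address"])
```

For a dataset of roughly 24,000 points, this brute-force scan is adequate; larger datasets would usually call for a spatial index.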

The Master Address Table was “geocoded” to obtain the address locations and display them on a map. Geocoding is the process of converting addresses or other text information to latitude and longitude coordinates that can be overlaid on a map.

In the past, users of city data had to geocode the data themselves. Geocoding is typically an imperfect process. There are many geocoders that can perform this process, but they use different assumptions and reference data, leading to different results. As users might select different geocoders, they would obtain different results for the same address. Having the City of Charlottesville geocode the data in the Master Address Points data provides a single, consistent, authoritative source for the locations of addresses.

Things to Consider

Overall, the Master Address Points dataset is of very high quality, especially in neighborhoods with single-family homes, where the points are placed at the front of the structure on the street side.

Figure 2

Most of the addresses fall into this category. A recent analysis of the distance between the Master Address Points and the Existing Structure Area datasets showed that 50% of the address points were inside a structure, 75% of the points were within 7 feet of a structure, and 95% of the points were within 34 feet of the structure. The largest distance of an address point from a structure was approximately 260 feet.
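A summary like the one above can be produced from a list of point-to-structure distances. The sketch below is only an assumption about how such percentiles might be computed, not the method used in the analysis: it uses the nearest-rank definition of a percentile, treats a distance of 0 as "inside a structure," and runs on made-up distances in feet.

```python
import math

def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct percent of the values are <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical distances (feet) from each address point to the nearest structure;
# 0 means the point falls inside a structure footprint.
distances_feet = [0, 0, 0, 0, 0, 2, 4, 7, 12, 30]

summary = {pct: percentile_nearest_rank(distances_feet, pct) for pct in (50, 75, 95)}
share_inside = sum(1 for d in distances_feet if d == 0) / len(distances_feet)
print(summary, max(distances_feet), share_inside)
```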

While the single-family home is the optimum scenario for geocoding, there are other situations that are not as straightforward. Geocoding in commercial and industrial areas, where the relationships between the buildings and the streets vary, will often locate the address points away from the associated buildings.

Figure 3

These situations, as well as the two situations listed below, reveal that the process requires an understanding of the geocoding results to avoid misinterpreting the data.

Multiple Points at Same Location

The geocoding of apartments and condominiums is especially challenging, as an individual building can have multiple units with the same basic street address but differing unit identifiers. For example, the Standard at Charlottesville is located at 853 West Main Street and has 220 different units, i.e., #139, #140, #141, etc. Current geocoders lack the ability to distinguish the locations of apartments or units within a single building, within a single floor, or across multiple floors, so all units are mapped to the same location.

When we map the address points as points, we see a single point on the building (Figure 4A). This is somewhat misleading, as it hides the fact that there are actually 220 separate points at the same location. One way to see whether there are multiple addresses at the same location is to aggregate and count all the addresses at a single location and plot the counts as symbols sized proportionally to the counts (Figure 4B).
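The aggregate-and-count step can be sketched in a few lines of Python. The records, field names, and coordinates below are hypothetical stand-ins; only the idea of counting points that share a location comes from the text. Coordinates are rounded before grouping so that tiny floating-point differences do not split one location into several.

```python
from collections import Counter

# Hypothetical address points; three units share one building location.
address_points = [
    {"address": "853 W MAIN ST #139", "lat": 38.032000, "lon": -78.495000},
    {"address": "853 W MAIN ST #140", "lat": 38.032000, "lon": -78.495000},
    {"address": "853 W MAIN ST #141", "lat": 38.032000, "lon": -78.495000},
    {"address": "100 EXAMPLE ST", "lat": 38.030000, "lon": -78.480000},
]

# Count how many address records fall on each (lat, lon) location.
counts = Counter((round(p["lat"], 6), round(p["lon"], 6)) for p in address_points)

# Locations that hide more than one address behind a single map symbol.
stacked = {location: n for location, n in counts.items() if n > 1}
print(stacked)
```

The resulting counts can then drive the symbol size in whatever mapping tool is used.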

Figure 4

Map makers need to take great care when interpreting geocoded maps where a single symbol, such as a dot, is shown for a feature that has multiple units. If the map shows different characteristics of the units by color, only the color for the last unit plotted will be displayed; all the other colors will be hidden beneath that symbol. For example, we can map the number of bedrooms in a unit using black for one bedroom, gray for two bedrooms, and white for three or more bedrooms. Although the number of bedrooms can vary within the building, if more than one unit is mapped at the same point, you will see only the color for the last unit plotted.

Data Synchronization Issues

Data synchronization issues should also be considered when using the geocoded data. When the data was checked in early January, address points for some buildings in the Existing Structure Area dataset were missing. Users would see buildings but no associated address points (see Figure 5). The data was recently updated, and the addresses are now geocoded.

Figure 5

This type of error may occur if the datasets are not synchronized. When checked on February 9, 2020, the Master Address Table had 24,191 entries, while the Master Address Points dataset had 24,119 records, a difference of 72 addresses. Information in the portal stated that the Master Address Table was updated on March 25, 2019. The Master Address Points dataset was last updated either on October 4, 2019, according to the metadata, or on February 9, 2020, according to the dataset overview page. The bottom line is that users should pay close attention to the update cycle and latest update of the Master Address Table, Master Address Points, and Existing Structure Area datasets when using the data.
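A quick consistency check along these lines can be sketched in Python. The two record counts are the ones reported above; everything else (the idea of comparing on a shared address string and the sample addresses) is an assumption made for illustration.

```python
# Record counts as reported on February 9, 2020.
table_count, points_count = 24_191, 24_119
count_difference = table_count - points_count
print(count_difference)  # 72

# Hypothetical address keys from each dataset; a real check would load the full files.
table_addresses = {"100 EXAMPLE ST", "200 EXAMPLE ST", "300 EXAMPLE ST"}
point_addresses = {"100 EXAMPLE ST", "300 EXAMPLE ST"}

# Addresses present in one dataset but not the other.
missing_from_points = sorted(table_addresses - point_addresses)
missing_from_table = sorted(point_addresses - table_addresses)
print(missing_from_points, missing_from_table)
```

Listing the unmatched addresses, rather than only comparing totals, shows which records are out of sync and guards against the case where additions and deletions cancel out in the counts.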

Attribute Differences

As of February 9, 2020, two key fields from the Master Address Table were missing from the Master Address Points file: the Building Identification Number and the Zip Code. These two fields would be helpful to users of the Master Address Points file, as they are geospatial identifiers or keys that can be used to link to other datasets, like the Existing Structure Area file, which includes a Building Identification Number, or census or postal data, which include a Zip Code. It is possible to access the Building Identification Number and Zip Code by joining the Master Address Points dataset and the Master Address Table, but this requires knowledge of manipulating datasets and creates extra work for the user.
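For readers unfamiliar with joins, the following Python sketch shows the general idea of attaching the two missing fields to the address points. It is a sketch under stated assumptions: the join key (a shared address string), the field names (address, bin, zip), and all sample values are hypothetical, and the actual column names in the city's datasets may differ.

```python
# Hypothetical rows from the Master Address Table (has the identifiers, no coordinates).
master_address_table = [
    {"address": "100 EXAMPLE ST", "bin": "B-0001", "zip": "22902"},
    {"address": "200 EXAMPLE ST", "bin": "B-0002", "zip": "22903"},
]

# Hypothetical rows from Master Address Points (has coordinates, no identifiers).
master_address_points = [
    {"address": "100 EXAMPLE ST", "lat": 38.0300, "lon": -78.4800},
    {"address": "200 EXAMPLE ST", "lat": 38.0310, "lon": -78.4800},
    {"address": "300 EXAMPLE ST", "lat": 38.0320, "lon": -78.4800},
]

# Index the table by the assumed join key for fast lookups.
table_by_address = {row["address"]: row for row in master_address_table}

def enrich(point):
    """Copy the point and add bin/zip from the table; None when there is no match."""
    match = table_by_address.get(point["address"], {})
    return {**point, "bin": match.get("bin"), "zip": match.get("zip")}

enriched_points = [enrich(p) for p in master_address_points]
print(enriched_points)
```

Points without a match keep empty identifier fields instead of being dropped, which makes synchronization gaps between the two datasets visible.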

In addition, some fields that are typical of geocoding software results would help the user evaluate the data. These include the name of the address locator, the status of the match, the type of match, and the match score.


The City of Charlottesville’s addition of a Master Address Points spatial dataset is a valuable contribution to Charlottesville’s open data. Addresses are our most important way of locating people and businesses. The overall quality of the geocoding is very good, and the dataset provides a single authoritative source for the locations of addresses. This allows the address to serve not only as a key for linking datasets, but also as a location that is suitable for analysis with other geospatial datasets. To prevent misleading analyses and deceptive maps, the analyst should be aware of imperfect geocoding in certain situations, multiple addresses at a single point, and the currency of the data.


