Attribute Transfer in PostGIS Using Spatial SQL

Data conflation is a meat-and-potatoes task in most GIS workflows. There are numerous reasons one might need to get data from one data set into another. Perhaps you want to attach a new geometry type to existing attributes or a table. Or maybe you need to pull attributes from one or more data sets into a single, “master” version. I have seen this latter use case referred to as “attribute transfer.” In an interactive, desktop setting, this can be tedious, but it’s a task at which spatial SQL excels.

Here is a simple example that uses just one line of spatial SQL (or two lines if you need to add the column) to do the heavy lifting. First, some table setting. This example takes the number of confirmed COVID-19 cases from the Johns Hopkins university county-level data (a point data set) and transfers it to a polygon data set of the US counties. There’s one caveat at the end of this post.

--Adds the column if it's not there. This syntax requires PostgreSQL 9.6 or later
ALTER TABLE public.us_counties ADD COLUMN IF NOT EXISTS confirmed integer;

--Updates polygon layer using spatial query to transfer from the point contained by the polygon
--Use the actual polygon, not the bounding box. Bounding box may return incorrect point(s).
UPDATE us_counties AS c 
SET confirmed = j.confirmed
FROM jhu_county AS j
WHERE ST_Contains(c.shape, j.wkb_geometry); --polygon contains point, assumes same coordinate system

You can check the results with the following:

SELECT c.confirmed AS poly_confirmed,  j.confirmed AS point_confirmed FROM us_counties c JOIN jhu_county j ON ST_Contains(c.shape, j.wkb_geometry);

You should see something like this, where the values in both tables are the same.


There’s one caveat to all of this: you don’t need to do it this way.

The idea of physically transferring data from one data set to another is a very GIS way of thinking about this problem. If we were working with some file-based data format, then we might need to physically transfer the data. In a database, we can do this logically by creating a view and letting the data physically reside in its own tables. The leading desktop GIS software tools all recognize database views and can map them, so you should explore that approach as well.

Note: For simplicity, this example uses raw numbers. This particular data should be normalized by population for subsequent analysis. SQL and views are also ideal for that task.