Using Dynamic Non-Spatial Data In GeoCommons

In my previous post, I described how I used a Python script to scrape power outage information from a local web site and convert it into an RSS feed. In this post, I’ll show how I used GeoCommons to visualize the changing information over time.

The process starts by creating a data set in GeoCommons based on the URL of the feed created in the previous post. The general process for doing that can be found in the GeoCommons documentation.

My feed is not a GeoRSS feed, so it has no location data of its own for GeoCommons to work with. During the upload process, I reached the screen shown below, which starts the process of attaching location to my data.

The feed summarizes power outages by ZIP code, so I chose “Join with a boundary dataset” in order to join it with ZIP code boundaries I had previously uploaded.

I selected the attribute in my feed (title) that was to be used to join with a corresponding attribute in the boundary data set (Zip) as shown below.
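
For context, here's a sketch of roughly what one item in my feed looks like, with the ZIP code carried in the title element that drives the join. The values and the string template are made up for illustration; the actual generator is described in the previous post:

[sourcecode language="python"]
# A made-up sketch of a single feed item. The title carries the ZIP code
# that GeoCommons joins against the "Zip" attribute of the boundary data.
item_template = """<item>
  <title>%(zip)s</title>
  <description>%(count)s customers without power</description>
</item>"""

print(item_template % {'zip': '20601', 'count': '154'})
[/sourcecode]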

You’ll notice that the success message indicates three features were matched. This is true for this version of the feed because ZIP codes with zero power outages are not reported. The join, however, updates itself as the feed updates, so more or fewer polygons may appear in the current version, depending upon the feed content.

After reviewing my data and providing some basic metadata, GeoCommons performed the join and my data set was ready to go.

In the image above, you’ll notice a link labeled “fetch latest.” That link, which is formatted as “http://geocommons.com/overlays/{overlayid}/fetch,” can be used to manually get the latest version of the feed, which is stored by GeoCommons. Essentially, GeoCommons stores the state of each feature in the data set each time the feed is fetched, so you build a “version history” of your data. As long as you have a date/time attribute, you can use GeoCommons to visualize the changes over time.
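
In my case, that attribute is the item's publication date/time. If you're building the feed yourself, Python's standard library can produce the RFC 822 date string that an RSS pubDate element expects; a quick sketch:

[sourcecode language="python"]
# A quick sketch: formatting the current time as the RFC 822 date string
# that an RSS pubDate element expects.
import time
from email.utils import formatdate

print(formatdate(time.time()))  # e.g. 'Sat, 27 Aug 2011 14:00:00 -0000'
[/sourcecode]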

In addition to the Python code from the previous post, I also used a variant of the script found at http://www.voidspace.org.uk/python/articles/authentication.shtml. The fetching capability requires authentication, so I modified the script to call the “fetch” URL using my GeoCommons user name and password. The script may be overkill for this purpose, but it worked perfectly without any further changes.
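
For reference, here's a minimal sketch of what my fetchlatest.py amounts to; the overlay ID and credentials below are placeholders:

[sourcecode language="python"]
# fetchlatest.py -- a minimal sketch; the overlay ID, user name, and
# password below are placeholders.
import urllib2

FETCH_URL = 'http://geocommons.com/overlays/99999/fetch'

# Build an opener that answers HTTP Basic Auth challenges with my
# GeoCommons credentials.
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, FETCH_URL, 'myusername', 'mypassword')
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(password_mgr))

# Calling the fetch URL tells GeoCommons to pull the current feed content.
response = opener.open(FETCH_URL)
print(response.read())
[/sourcecode]

With the credentials supplied this way, the script can run unattended from the scheduled task described below.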

On the server, I wrote a four-line batch file to act as a driver for the whole process. This batch file is what is called by a scheduled task in Windows.

[sourcecode language="text"]
del *.xml
del *.pickle
python SmecoFeedObj.py
python fetchlatest.py
[/sourcecode]

As you can see, the batch file is very simple. It deletes the old files, scrapes the latest data and writes new files (SmecoFeedObj.py), and then updates the GeoCommons data set (fetchlatest.py).

The server runs Windows, so I set up a scheduled task (How to: XP, Vista, Windows 7, Server 2003, Server 2008). I set the task to run once an hour so the latest data is scraped and pushed to GeoCommons hourly.

With the data set now created and being updated, it can be used to make maps in GeoCommons to visualize the changing data. I created two maps to demonstrate this. The first, using a filter, allows a user to filter the feed data to a time window of their choosing and map just the outage data for that time window.

The second map, shown below, uses GeoCommons’ animation capability to allow a user to “play through” the data based upon the publication date/time. A user can either drag the time slider manually or let it play automatically, and can also adjust the width of the slider to narrow or widen the time window. I’ve been told by GeoIQ that animation is under active improvement, so I’m interested to see how it evolves. This was my first attempt at using it with my own data, so I’m sure I’m not using it optimally. That said, I’m impressed with how easy it was to set up a time-based animation.

GeoCommons map animating power outage data

All in all, it took me about four hours to go from data embedded in an HTML page to a working map animation. That really speaks to the power of the tools available today, from programming languages like Python and open standards like RSS to online tools like GeoCommons, as well as a host of others I didn’t use for this work. It is becoming easier all the time to integrate and use spatial tools to exploit data from traditionally non-spatial sources and share the results widely. As traditional “GIS” fades into the background, the resulting fusion of more standard technologies is opening a wider world of possibilities.