Zombie Shapefile?

So I was walking to the Esri Dev Meetup in Arlington, VA with Kate Chapman and Adam Estrada and we were talking about SpatiaLite. Specifically, we were discussing James’ recent post and the lack of uptake of SpatiaLite. The really easy thing to do is to lay the blame at the feet of Esri for not supporting it. But then it occurred to me that Esri hasn’t succeeded in making the file geodatabase the next shapefile nor did they succeed in making the personal geodatabase the next shapefile. When I think about it, Esri has been trying really hard to kill the shapefile for a long time, without success.

A little historical perspective is in order. At the time ESRI published the shapefile format, they were under a lot of pressure from the community to open something. I recall most of the focus being on the ARC/INFO coverage, mainly because a lot of data resided in that format. Instead, they opened the shapefile, which had been introduced with ArcView 2.0. It took off like wildfire. Before long, every major competing platform could read and write shapefiles (heck, even Visio had a “maps” feature that could read it).

Keep in mind that most of us were dissatisfied with DBF files even back then. They were already long in the tooth. Also, the multiple-file structure of the shapefile was clunky from the get-go. The way I see it, the main reason for the success of the shapefile was the lack of anything else. The shapefile never had to present a “compelling reason to change” because there was nothing to change from. Sure, everyone still used their GIS-du-jour’s proprietary format (MapInfo TAB for example) but we could now pass around shapefiles freely.

The problem with any data format that seeks to succeed the shapefile is that it must first succeed the shapefile in the minds of users. Any subsequent pretender format must jump a hurdle that the shapefile simply did not. It must provide a compelling reason to change from the shapefile. As developers and system architects, we have plenty of those reasons (long column names, topology, better spatial reference support, object-oriented data, single-file structures, etc.). For whatever reason, those things have not resonated with the users. Even all of the compelling features of the geodatabase (which are well-summarized here), for those users committed to staying completely within the Esri framework, have not killed off the shapefile.
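To make one of those reasons concrete, take long column names. The DBF table that travels with a shapefile limits field names to 10 characters, so anything longer has to be shortened when data is written out. Here is a minimal Python sketch of what that does to a schema. The function name `to_dbf_field_names` and the `_1` numbering rule are hypothetical choices of mine for illustration; each GIS package applies its own renaming rules.

```python
# Illustrative only: the dBASE (.dbf) table in a shapefile limits field
# names to 10 characters, so longer names must be shortened on export.
DBF_MAX_FIELD_NAME = 10


def to_dbf_field_names(names):
    """Shorten names to fit the DBF limit while keeping them unique.

    The numbered-suffix rule for collisions is a made-up convention for
    this sketch, not the behavior of any particular GIS package.
    """
    used = set()
    result = []
    for name in names:
        candidate = name[:DBF_MAX_FIELD_NAME]
        counter = 1
        # Compare case-insensitively, a cautious choice for this sketch.
        while candidate.upper() in used:
            suffix = f"_{counter}"
            candidate = name[:DBF_MAX_FIELD_NAME - len(suffix)] + suffix
            counter += 1
        used.add(candidate.upper())
        result.append(candidate)
    return result


print(to_dbf_field_names(["population_2000", "population_2010", "name"]))
# prints ['population', 'populati_1', 'name']
```

Two perfectly readable column names collapse into `population` and `populati_1`, and the year that distinguished them is gone. Developers notice that kind of thing right away; a user who just wants lines on a map may never care.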

So I’m left to wonder if the shapefile is:

a) the brainless format that just won’t die; or
b) good enough?

  • OB

    I think you’ve missed one small fact out: while ESRI have given us the personal geodatabase and the file geodatabase, which really should transcend shapefiles, they haven’t made the format interoperable (haven’t used that word for a while!) even between their own versions of ArcGIS. Maybe now that newer versions of ArcGIS allow you to read different versions of personal/file geodatabases, the uptake will be greater, although I can’t help thinking they missed a trick here.

    I would still like to see SpatiaLite become the ‘new shapefile’, if only because a format that is supported but not owned by the GIS powerhouses sounds like a great prospect to me.

  • Crischan

    Well, the shapefile at least works across ArcGIS version borders… Can’t say the same thing for FGDBs – they are not backward compatible… No surprise it didn’t pick up.

    • And not only across ArcGIS version borders but also in MapInfo, GeoMedia, QGIS, uDig, WeoGeo, GeoCommons, etc., etc.

  • “Never trade the eight of diamonds for the nine of clubs.”

    The shapefile is good enough for now, because in the users’ minds (which I can read, obviously), all available alternatives are the nine of clubs.

    • I agree. I think it has a certain “at least you know what you’re gonna get” factor working in its favor.

  • Bill:

    It’s a question of audiences. Those who pine to rectify the short-comings of the shapefile are a small, small subset of those who are happy just to get the darn lines on their map.

    What about normal people? A public agency I work with recently got a request from a precinct captain (think “little old lady”) who requested political boundaries in KML “so I can look at it in Google Earth.” And the agency even had freely downloadable PDFs available…

    In short, there’s a sliding scale of ‘good enough’ and it’s not clear that a solution to a problem the unwashed masses don’t perceive as a problem is a great model for uptake.

    BT

    • “Those who pine to rectify the short-comings of the shapefile are a small, small subset of those who are happy just to get the darn lines on their map”

      [Free Square]

      “it’s not clear that a solution to a problem the unwashed masses don’t perceive as a problem is a great model for uptake”

      BINGO!

  • Nice post Bill! 😉
    I think that just as the file GDB won’t spread because it is a proprietary format, SpatiaLite won’t spread because ESRI is not going to adopt it.
    Every possible solution lies between these two extremes, so basically it is on ESRI now to give us a better alternative to the zombie 😀

    • Thanks, Paolo. I completely agree about the file geodatabase, although I know a few companies that are waiting for the mythical file geodatabase API (*cough*) so they can support the format.

      What I find more interesting about the FGDB is that, even just within the Esri “ecosystem,” it hasn’t supplanted the shapefile. The shapefile still remains a kind of lingua franca.

      But I also agree that some kind of support for SpatiaLite in ArcGIS, whether introduced by Esri or through some tool like zigGIS 🙂 , will be key to its wider use.

      But it’s not just format. I think Brian and Atanas make very good points about the mindset of the user base with regard to the shapefile. That’s a confidence issue that only time will solve.

  • I think I may be missing something here.

    Is there any particular reason (aside from simple malice) that we should all be chasing the shapefile around the mansion with a rope or candlestick? Certainly the format has its limitations, but how many of us just hang our heads and cry in our beer when we run up against those limitations? Aren’t there other tools at our disposal that we can (and do) use when the shapefile doesn’t cut the mustard?

    But why does this mean we have to throw the shapefile on the garbage heap? Should I throw away my iPod because my stereo has more features? If I get a cordless drill, should I toss out my screwdrivers?

    I use notepad all the time. Should I stop doing this because WordPad has more features? Or should I skip right on up to a full Office suite? Am I allowed to use my hammer to hang a picture? Or do I have to run the cords and hoses for the compressor and the nail gun?

    The shapefile is a tool. Despite its limitations, it’s a damn useful tool (as evidenced by its continued widespread use). So why can’t we keep using it? Whatever happened to “the right tool for the right job”?

    Why must a better tool murder the shapefile in order to establish its superiority? Is there some sort of Geo-Gladiatorial Arena I’m unaware of?

    • Terry,

      It looks like you’re onto me. Whatever makes all of those other formats “better” in the minds of the people who label such things (myself included) has not equated to “better” in the minds of people who are regularly choosing the shapefile.

      With a shapefile, a user doesn’t need to know:

      – where the server is
      – what his/her username and password are
      – where the MDB (or SQLite) file is
      – what feature dataset his/her layer is in
      – what schema holds his/her layer
      – etc.

      With a shapefile, a user simply needs to know where he/she left it on the file system (or wants to put it if creating it), then they point GIS-du-jour at it and start working. This ease-of-use seems to consistently outweigh all of the technical limitations of the shapefile in the minds of users.

      If the world needs a “better” format, it will need to approach that level of ease-of-use before we see the shapefile start to fade away (if it even needs to).

  • Google offers JPEG alternative for faster Web

    “Google plans to announce the new WebP graphics format today along with its research that indicates its use could cut image file sizes by 40 percent compared to today’s dominant JPEG file format. That translates to faster file transfers and lower network burden if Google can convince people to adopt WebP.”

    Here is an interesting tangent on the formats front. Will Google succeed in supplanting the ubiquitous JPEG format? Is 40% shrinkage in file size a compelling enough reason?

    http://news.cnet.com/8301-30685_3-20018146-264.html

    • Maybe. I’d go for it. Then again, with regard to the shapefile, a lot of the reasons I listed in my post make sense to me. And yet…

  • Jacob Blair

    Bill, I was promised by an ESRI employee at a seminar last week that the FGDB API will be out by the end of the year, but I wouldn’t hold my breath, either.

    My biggest beef with the “new shapefile” is that everyone is trying to leverage the DBMS concept for the next(current?)-gen file formats. I get it: unified tools, philosophy, structure, blah blah blah. The problem is that they generally all disrespect the file-object ecosystem of the OS and network. How can you mime type multiple files or a directory? I know the shapefile and FGDB have reasons for their structure, but so what? One file to rule them all; preferably fast and simple.

  • Let’s be clear. This post is not about any one format failing to knock off the shapefile. It’s about the continued success of the shapefile. Get to the bottom of why that happens, and you’ll have the beginning of a blueprint for how to build a “better” shapefile.

    • As you know, I am an archaeologist. I remember a day in the field, not too terribly long ago, watching my boss (Tim) taking photographs of a feature (common practice – archaeology is mad about taking records). I watched Tim take a few frames with a camera loaded with color slide film (that’s right – slide film), then with a camera loaded with black and white print film. He sized up the photos he wanted, then moved closer to the feature in order to achieve the shot he wanted. This was a behavior I had seen numerous times, and it was born of thousands (if not tens of thousands) of similar experiences Tim had had in the past.

      Then Tim pulled out the agency’s brand-spanking-new digital camera, and I watched with some amusement as he precisely repeated the rituals he had performed so many times with the old 35mm cameras. When he had finished, I felt compelled to speak up.

      “Tim,” I said. “You know you can control the zoom on that camera by pushing a button, right?”

      Tim looked at me, at the camera, back at me. “Yup,” he said.

      “Then why,” I asked, “do you put yourself through all those contortions? Why not just push the buttons to get the picture you want?”

      Tim looked at me, at the camera, back at me. He shrugged. “I like to do things the way I’m used to doing things,” he explained.

      And that, Bill, is my ‘Tim Theory’ about the continued success of the shapefile. In the end, most people pretty much want things to stay the way they are.

      If we really want to build a better shapefile (and I know I do), I don’t think the process should start by turning our backs on the shapefile. Like you’ve been saying – we should be spending our time examining why the shapefile is so damn persistent. At the end of the day, there is something about the shapefile that keeps us all coming back. And rather than turn our backs on it, maybe we should embrace it.

      Tim has since learned to use a digital camera. Mainly because it’s pretty much the same as the cameras he was used to, with some cool and needed added features. I think we can all learn something from Tim.

      Maybe we should stop trying to reinvent the wheel and instead focus our energy on improving it.

    • I completely agree. As a developer, I am saying that maybe we need to stop and listen to the archaeologists a little more. In another comment, Brian called this shapefile question a problem that’s not perceived as a problem (I paraphrase).

      This discussion can really be extended to any of the latest bells and whistles, but “improving” the shapefile is the most concrete example. Your Tim example is another good illustration.

      So I’m not advocating for or against the shapefile itself. I’m advocating paying a little more attention to the people who actually use shapefiles.

      Thank you for your comment. It is spot on.