Data Over Software

One of the first tasks I ever had in my then-new GIS career was doing AML development in ARC/INFO 6.x for a data production project. My code parsed DXF exported from AutoCAD R11 for DOS and then assigned attributes based on properties such as layer, color, line weight, and feature type. It also georeferenced the data based on tic marks captured in AutoCAD. The end result was multiple ARC/INFO coverages fully populated from data templates driven by those AutoCAD characteristics. From there, QA analysts adjusted the data from the defaults where necessary.

After that, I did a lot of work in AML to build a cartographic production system for a water utility. That had me building a GUI using ARC/INFO forms and developing customized editing tools with ArcEdit in ArcPlot mode.

As you can imagine, I dug deeply into AML. I learned a lot about GIS, a field in which I had no formal training. Because AML essentially batched the same commands the analysts used at the command line, all of this development also made me quite proficient with ARC/INFO. Those were fun times, and because I needed to learn GIS, the period had a lot of value for me.

As a software developer, however, there was a big drawback, one evident in the full name of AML: Arc Macro Language. All of the time and effort I was investing in building proficiency with AML was usable in exactly one place. The same was true when ArcView came along with its proprietary object-oriented language, Avenue.

AML itself wasn’t a problem. It was certainly more accessible than C or Fortran for daily processing tasks. But, as a recent graduate who’d been working in C, C++, Pascal, and LISP, among others, the more time I spent working in AML or Avenue, the further I felt I was drifting from my core strengths. (I did get to use LISP in AutoCAD.)

This eventually worked itself out for me as the Esri ecosystem gravitated towards standard programming languages and then as I worked more deeply with open-source tools.

All of that serves as a long-winded introduction to some thoughts I had about current-generation low-code solutions. I’ve recently been working with Power Automate to implement a data flow for a customer. Power Automate is itself a low-code solution, and I was using it to extract data from another proprietary low-code tool and push that data to PostgreSQL.

In the process, I did a lot of research to figure out which connectors to use, how to configure them, how to query the data I needed, and how to route the results to the necessary connector for output. Then I worked on formatting the output’s schema, transforming the content, and writing out a dynamically named file.

All of this research led me to some very familiar places: Google, Stack Overflow, Microsoft user fora, a little Reddit, and so on. Pretty much all the same places I find myself visiting to learn how to do something new in a programming language like Python.
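To make the contrast concrete, here is a rough sketch of what an equivalent flow might look like in Python: pull records from a source system’s API, write a timestamped snapshot file, and load the rows into PostgreSQL. This is not the actual Power Automate flow I built; the endpoint, connection string, field names, and table are placeholders.

```python
import json
from datetime import datetime, timezone

import requests
import psycopg2
from psycopg2.extras import execute_values

SOURCE_URL = "https://example.com/api/v1/records"    # placeholder source API
PG_DSN = "dbname=warehouse user=etl host=localhost"  # adjust for your environment


def run_flow():
    # Extract: query the upstream system for the records we need.
    response = requests.get(SOURCE_URL, timeout=30)
    response.raise_for_status()
    records = response.json()

    # Transform: keep only the fields the target schema expects.
    rows = [(r["id"], r["name"], r["updated_at"]) for r in records]

    # Write a dynamically named snapshot file for archival.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    with open(f"records_{stamp}.json", "w") as f:
        json.dump(records, f)

    # Load: push the rows into PostgreSQL.
    with psycopg2.connect(PG_DSN) as conn, conn.cursor() as cur:
        execute_values(
            cur,
            "INSERT INTO staging.records (id, name, updated_at) VALUES %s",
            rows,
        )


if __name__ == "__main__":
    run_flow()
```

The logic is the same as what the connectors were doing for me; the difference is that this version, and the skills behind it, move with me to any other project or platform.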

Power Automate does indeed seem powerful and, if I spent enough time to build proficiency in it, I could probably be very productive with it. But that productivity would be limited to the Microsoft ecosystem and any of its partner tools. I find myself wondering why I would spend that time learning Power Automate rather than a cloud-oriented language like Go or Rust, which would have far greater utility. 

Like AML, low-code tools aren’t inherently bad. I recognize that many people who have responsibility for customizing solutions are not programmers and don’t want to be. They are the audience for low-code tools, not me. I understand that.

Modern low-code tools are usually positioned as SaaS tools. Some SaaS tools respect that the data they contain is yours and make it easy to get your data out via APIs that let you use the tools of your choice. Others rely on “connectors” to partner platforms that exist to provide highly curated paths for data between prescribed systems. They are essentially tunnels that allow data to pass between walled gardens. That was the experience I had with Power Automate and, based on a lot of work I’ve done previously, it’s a fairly common (but not universal) pattern across the low-code/SaaS platform space. (The platform I worked with in my previous role is an exception: it provides a robust API and export tools to ensure that users can easily access their data.)

My recent experience with Power Automate gave me some AML flashbacks, but more from an organizational perspective. In Power Automate, like many other low-code workflow tools, you can implement fairly complex logic and data flows. Those data flows, and the business knowledge they represent, are not really portable to any other platform, unlike logic encoded in a standard programming language.

Because low-code SaaS tools do genuinely bring value, they should be considered seriously in your overall data and application strategy. I used to spend a lot of time reviewing SOC2 reports to assess the information security practices of platforms. Most do a pretty good job, and exceptions tend to be reasonable. As a general rule, any platform that has taken the time to get a SOC2 assessment has already put a lot of work into security, and a valid assessment (or ISO 27001 certification) is usually a good indicator of maturity. My advice is to avoid any platform that doesn’t have one.

Beyond security, however, you’ll want to consider barriers to exit. You may eventually want to leave the platform and take your data with you but, more likely, you’ll simply want to pull your data out to use elsewhere: in your BI system, in your ERP, or simply in an archive you control. These things will be less apparent in a SOC2 report and will require thorough probing on your part.

Data integration is almost always more reliable than API integration. APIs can be fragile or unavailable, and versions change, which can affect any of the blessed “connectors” you may be using in other platforms, including third-party “iPaaS” platforms that give the illusion of interoperability but are simply lock-in by proxy.

I always rated platforms more highly when they provided a simple API that allowed me to extract my data and put it into a data warehouse I could then use for analytics. While that upstream API can still be fragile or unavailable, I’m only dependent on it while I am pulling data, rather than expecting it to always be available to a daisy chain of other APIs from other systems. When I’m simply pulling data, I can build in some smart retry logic for times when the API hiccups.
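That retry logic doesn’t need to be elaborate. Here is a minimal sketch in Python, assuming a simple JSON-over-HTTP pull; the URL, attempt count, and delay are placeholders.

```python
import time

import requests


def pull_with_retry(url, attempts=5, base_delay=2.0):
    """Fetch JSON from an upstream API, retrying transient failures."""
    for attempt in range(attempts):
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.RequestException:
            if attempt == attempts - 1:
                raise  # out of retries; surface the error to the caller
            # Exponential backoff: wait 2s, 4s, 8s, ... before trying again.
            time.sleep(base_delay * (2 ** attempt))
```

A scheduled pull wrapped in something like this tolerates the occasional hiccup without making everything downstream depend on the upstream API being available at every moment.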

So, much like the old proprietary macro languages of workstation-based software, low-code SaaS platforms require a lot of time and effort to build proficiency that, in the end, is applicable mostly to those systems and not very portable. That’s not a bad thing if the platforms bring compelling value, which many do. For example, I am a huge fan of FME for designing ETL flows.

But, recognizing that such platforms come with some inherent lock-in, you can mitigate the risk by opting for those whose APIs are more standards-based and more open. All of the code we write and all of the software platforms we use exist to manipulate our data, which is the raw material of software. If we elevate data over software and keep our data at the center of our decision-making, we can defray a lot of the risk that comes with the convenience of modern software platforms.