I flew home from St. Louis yesterday, returning from the FOSS4G North America conference. I took a different approach to the conference this year than I did in 2023. Rather than moderate sessions and participate in a lot of on-site logistics, I sat in on more sessions so I got to see a lot more content and learn from the experts who came to share their experiences.
The three general components of any system are people, process, and technology. I’ll address my FOSS4GNA experiences within that framework, but I’ll offer some general observations first. I was on the organizing committee – primarily on steering and sponsorships. I did not participate in program selection at all this year. I thought the program was technically quite strong, especially in terms of AI and related technologies. The volunteer team worked the event flawlessly. I also really appreciated the enthusiasm of the sponsors.
The city of St. Louis was a great choice for location. The commitment of the community to growing a geospatial innovation hub is clear. It is also clear that there is a long way to go to achieve that goal. That’s okay. I have written elsewhere about my belief in the value of a geographically diverse tech workforce. The acceleration of remote work during the pandemic gave this a bit of a boost by allowing tech workers to move to smaller cities and towns, but investment by those localities is key to making the tech diaspora stick.
St. Louis sees this and is the kind of city with “good bones” that can elevate its importance in the late information age through investment. But cities like St. Louis are in a race against time to outrun return-to-office mandates that threaten to restart the engines of the domestic extraction economy that has disproportionately pulled tech talent into the orbits of San Francisco, New York, and DC. If the presence of events like FOSS4GNA and the USGIF’s GEOINT Symposium can help St. Louis move a little closer to its goals, then I’m happy we could play a part.
With all of that out of the way, I’ll talk about people, process, and technology – but not in that order.
Process
The farther I progress in my career, the more interested I am in the sustainability of open-source software, both at the macro level and at the micro level. At the macro level, that looks like sustainable resourcing of open-source projects and the alignment of the capabilities of the software to policies and problems to ensure that the projects stay relevant. At the micro level, that looks like best practices for integrating open-source into an organization’s operations. There were three talks – two macro, one micro – that resonated with me in this regard.
The first was the closing keynote by Mike Byrne called “broadband investment = geospatial investment.” As GIO at the FCC, Mike had a very public early success in using open-source geospatial software to deploy the first National Broadband Map. He continues to work in the broadband field in the private sector and is an expert on how good public policy has helped drive its expansion, especially in rural America.
His talk carried two main messages. First, that technology – open-source or otherwise – is relevant only if it is aligned with policy to solve a problem. He talked in terms of public policy but this is also true of corporate policy. Good policy focuses the development of technology and drives its continual improvement.
His second message was that policy without funding is not policy at all. He contrasted the evolution of funding for broadband expansion with the Geospatial Data Act, which has never received a single dollar of funding to drive the lofty goals in its text. Broadband expansion is an intensely geospatial process and it has probably directly generated more geospatial innovation and data than the Geospatial Data Act. In fairness, there has probably been a lot of indirect innovation in response to the existence of the Geospatial Data Act, but I think Mike’s point is generally accurate: Good public policy drives innovation and funding is a necessary component of good public policy.
The takeaway here is that the open-source community needs to be much more proactive about talking about money and the need for funding to make its projects sustainable.
This brings me to the second talk – Howard Butler’s talk on “Building the GDAL Sponsorship Program.” When I close my eyes and picture Howard’s work in this area, it looks a lot like the poster for the film “300” with Howard in the foreground. His work here is a reference implementation for building a funding stream for small projects that have a huge impact. GDAL is a foundational project and its sponsorship program enables private sector entities to ensure that it remains viable for them.
Howard’s slide listing the participants in the sponsorship program shows large corporations, including one that may surprise some: Esri. Yes, Esri supports GDAL financially. Esri is a billion-dollar company (we think – it’s privately held) and the amount it contributes is basically a rounding error for it, but that’s the point. In last year’s closing keynote, Paul Ramsey made the point that you can do a lot for very little with open-source. When I asked Howard about Esri’s contribution, he described the amount as material for the project.
That’s really the point. Financial support needs to be appropriately sized. Hypothetically speaking, if Esri were to suddenly dump some astronomical amount of money on the project, the project would probably be overwhelmed and not use the funds effectively, which could sour Esri on continued support.
So to Mike Byrne’s point, Howard’s willingness to talk money with corporations that derive value from GDAL has led to a sponsorship program that yields appropriately sized funding without placing a huge burden on the program sponsors. It is also a lot of work to keep it going. I think that’s important to note. The work on getting this one project funded will never be complete.
This talk prompted a thought. Some participants in this sponsorship program (think Esri here) have taken a lot of grief from the community for various reasons. If a company is shown a path to being a “good corporate citizen” (my words) and takes that path, even if it takes a lot of cajoling, we may need to give that company some breathing space so it can figure out how to support the other open-source projects on which it depends. Continuing to pound on such companies out of habit disincentivizes continued participation in the community.
The GDAL sponsorship program seems to be successfully structured to incentivize companies to mitigate the risk of their dependency on GDAL by supporting it rather than forking it or replacing it. There is almost no way that insourcing GDAL or developing a replacement capability can make business sense. The entry point for a gold-level sponsorship is $50,000 USD – a fraction of the salary of a senior developer. Yet, in aggregate, all of these sponsorships help keep GDAL going.
This last point, incentivizing support rather than forking, building, or buying, leads me to the last process-oriented talk that resonated with me. Dan Pilone, the CEO of Element 84, gave a talk called “Build vs. Buy … vs. Open-Source?” This was a micro-level talk that discussed Element 84’s experience with adding open-source as an option in the classic build vs. buy decision process.
Integrating open-source into your operations can be a challenge and it’s easy to make mistakes that paint you into a corner. Element 84 seems to have made a lot of them and Dan gamely shared many so we could learn from their experience. In the end, though, he shared a best-practices framework that has emerged from that experience. I sincerely hope that Dan’s slides are posted. Most organizations are dealing with hybrid environments where proprietary and open-source software live side by side, and Element 84’s experience is instructive. I would invite Dan back to give this exact talk for the next 10 years.
People
Open-source is about community and community is about people. FOSS4GNA drew a few hundred smart and dedicated people to learn about and discuss community-developed software. On a personal level, I reconnected with a lot of people I hadn’t seen in as long as 10 years. I also met a lot of new people – the team sent by sponsor T-Kartor were all fun and genuinely hospitable. Their presence elevated the conference and was a high point for me.
I have to take a moment to recognize the people who most made this a meaningful experience for me: the FOSS4GNA organizing committee. If I try to name everyone, I will miss someone, but I am grateful to have worked with each of you on this.
I met a lot of early-career professionals who are trying to learn about open-source, which speaks well for future sustainability. I was also happy to see people from open-source-adjacent organizations such as OGC in attendance and sitting in multiple sessions. Open standards have powered the development of open-source software by enabling a loose coupling of projects. For example, I’m not sure how much the developers of QGIS and GeoServer talk to each other, but OGC standards have enabled the integration of the two. In short, they have enabled this community.
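To make that loose coupling a little more concrete, here is a minimal sketch of what an OGC WMS 1.3.0 GetMap request looks like on the wire. The endpoint URL, layer name, and bounding box are all hypothetical placeholders I made up for illustration, and `build_getmap_url` is my own helper, not part of QGIS, GeoServer, or any library. The point is only that the client and server agree on a set of standard parameters rather than on each other’s code.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat). CRS:84 is used here
    because it keeps longitude/latitude axis order in WMS 1.3.0.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer; the bounding box is an arbitrary
# rectangle roughly around St. Louis.
url = build_getmap_url(
    "https://example.org/geoserver/wms",
    "demo:roads",
    (-90.4, 38.5, -90.1, 38.8),
)
print(url)

# Decode the query string to show the parameters a server would receive.
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(query["REQUEST"], query["BBOX"])
```

Any client that can produce a request like this can talk to any server that understands it, which is why the two development teams never need to coordinate directly.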
I have led two billing system migrations over the last few years. The experience of dealing with a total lack of standards in the same vertical and fighting with systems designed for lock-in has given me a greater appreciation of the effect OGC standards have had on our community, even if those standards are not perfect. Why am I talking about standards in a “people” section? Because standards are developed through the agreement of people.
In the end, I am grateful for every person who attended FOSS4GNA. After all of the work put in by the committee, it was gratifying to see such a positive response.
Technology
I sort of had a mission for attending FOSS4GNA in terms of technology. LLMs, Apache Spark (underneath Databricks), and PostGIS all figure prominently in my current work and there was a lot of good content on all of them.
The conference had a really strong program around AI and there were several interesting LLM use cases. None were exceptionally far along in maturity but innovation continues. I generally prefer starting with pre-trained models and the talks I saw bolstered that approach. It’s possible that a truly geospatial LLM may require training a neural network up from scratch but fine-tuning pre-trained models seems to be getting a lot of mileage.
On Monday I attended the Apache Sedona workshop put on by Wherobots. Broadly speaking, Sedona can be thought of as “PostGIS for Spark.” I am starting to work with it through Databricks but this workshop gave me a really good jumpstart. It also gave me hands-on experience with Wherobots, which I had not yet used. It’s a really nice data science platform and a compelling alternative to Databricks. It’s definitely worth creating a free account and checking it out. I think it’s probably not as far along on the “government clouds” as Databricks (if it’s there at all), but that’s only a matter of time.
I also attended a talk by Elizabeth Garrett Christensen of Crunchy Data on PostgreSQL optimizations. She is an expert on PostgreSQL performance and I learned a lot. PostgreSQL is a fairly mature and complex platform and her expertise in optimizing different deployment strategies was readily apparent.
Related to my interest in sustainability, discussed above, I’m interested in lifecycle support for open-source. Basically, that’s how an organization can ensure that its open-source assets remain secure, receive patches, and stay usable. There are numerous ways to accomplish this, but various forms of managed services are emerging to fill these gaps. FOSS4GNA sponsors Crunchy Data, GeoSolutions, and Tembo, among others, are examples of this concept. This approach provides an off-the-shelf way for organizations to acquire open-source assets and the support they require, and it provides a revenue stream for continued support of the underlying projects. Paired with a GSA schedule, this could make open-source more accessible to government users.