I few weeks ago, I described an integration I built to pull data from the Stripe via its API and load it into BigQuery. There were two main problems with this approach: First, it was incredibly hacky – a Wile E. Coyote approach to the problem involving cron jobs and EC2 instances and GCS uploads and scheduled jobs in BigQuery. It got the job done, but in a way that was slightly embarrassing. Second, everything else we did was leveraging FME and this stood outside of that pattern.
I really wanted to bring this process into the FME fold, but I needed to tackle Stripe’s API pagination in order to do it. Most of Stripe’s helper libraries have an auto-pagination feature that allows you to keep loading objects until you reach the end. This feature is not available if you are using the raw Stripe REST API, which is essentially all that is available to FME via the HTTPCaller transformer.
Stripe’s API uses cursor-based pagination, meaning that it points to the ID of the next object to read in the dataset. Luckily, Safe Software has an article that describes how to handle cursor-based pagination. This article is that it uses the Slack API as an example. The Slack API is well designed and returns everything you need to handle pagination in one place in the response document, but this isn’t the case with Stripe’s API so I had to modify the approach somewhat.