By: Caitlin


Here at Dwolla, one of our core beliefs is that money = data. But there is more to this than dollars and cents: it takes careful investigation from a dedicated team working to answer questions and measure products. Over the last year we’ve been building a data platform to facilitate this measurement, which meant we needed to collect and analyze data effectively.

Gathering atomic events

Database design has traditionally treated atomicity in terms of database transactions: operations that are all or nothing, indivisible or irreducible. For data to be semantically atomic, however, its meaning must be indivisible or irreducible. This goes beyond the current state of something to include timing and how that state has transitioned or changed. We need semantic atomicity because:

Events are a natural fit for the flexibility and data we need. They describe what happened through their identity and structure, record when it happened, and, as time-based facts, are a concept that others have also embraced. Because these events are atomic, the smallest pieces of data, we can compose them to answer future questions.

Once we’ve defined these specific and immutable (unchanging) events, we simply need to gather them at scale and make them queryable. This allows us to quickly transform the data to answer new questions.
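To make the idea concrete, here is a minimal sketch of atomic, immutable events being composed to answer a question after the fact. The event types, field names, and helper function are illustrative assumptions, not Dwolla’s actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# A hypothetical atomic event: its type and structure say *what*
# happened, and occurred_at says *when*. frozen=True makes instances
# immutable, mirroring the "events never change" principle.
@dataclass(frozen=True)
class TransferCreated:
    transfer_id: str
    amount_cents: int
    occurred_at: datetime

@dataclass(frozen=True)
class TransferCompleted:
    transfer_id: str
    occurred_at: datetime

def completed_volume(events):
    """Compose atomic events to answer a question we never planned for:
    the total value of transfers that eventually completed."""
    completed = {e.transfer_id for e in events
                 if isinstance(e, TransferCompleted)}
    return sum(e.amount_cents for e in events
               if isinstance(e, TransferCreated)
               and e.transfer_id in completed)
```

Because each event is a self-contained fact, new questions only require new compositions over the same stored events, not changes to how the data was captured.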

Analyzing events at scale

We now have a proliferation of data: a metaphorical explosion of events, “big data.” Because we have built our data platform on Amazon Web Services, we are able to leverage the following infrastructure:

This suite of tools allows us to then analyze data across the data structure spectrum.
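As a small illustration of what “making events queryable” looks like, the sketch below loads events into a SQL engine and aggregates them. Here the standard-library sqlite3 stands in for a warehouse such as Amazon Redshift; the table and column names are assumptions for the example:

```python
import sqlite3

# An in-memory database stands in for the analytical store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (type TEXT, amount_cents INTEGER, occurred_at TEXT)"
)

# Gathered events land as rows; each row is one immutable fact.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("transfer_created", 500, "2015-09-01T12:00:00Z"),
        ("transfer_created", 250, "2015-09-01T12:05:00Z"),
    ],
)

# Questions become ordinary SQL over the event log.
total, = conn.execute(
    "SELECT SUM(amount_cents) FROM events WHERE type = 'transfer_created'"
).fetchone()
# total is 750
```

The same query shape scales from a laptop to a columnar warehouse, which is what lets a suite of managed tools cover the whole data structure spectrum.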

[Figure: the data tool spectrum]

In designing our data architecture we strive to adhere to these three principles:

Events are immutable, and data transformation is a one-way street. Because of this, we can archive, tear down, replace, and recalculate values derived from our events. As long as we have the original atomic events, data at its source is never lost.
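That replayability can be sketched in a few lines: any derived view is just a fold over the ordered event log, so it can be discarded and rebuilt at will. The event shapes and the balance example are illustrative assumptions:

```python
def replay(events, apply, initial):
    """Rebuild derived state from scratch by replaying immutable
    events in time order. The derived value is disposable; the
    events are the source of truth."""
    state = initial
    for event in sorted(events, key=lambda e: e["occurred_at"]):
        state = apply(state, event)
    return state

def apply_balance(balance, event):
    # One-way transformation: events update the derived value,
    # but nothing ever flows back to modify the events.
    if event["type"] == "funds_added":
        return balance + event["amount_cents"]
    if event["type"] == "funds_withdrawn":
        return balance - event["amount_cents"]
    return balance

events = [
    {"type": "funds_added", "amount_cents": 500, "occurred_at": 1},
    {"type": "funds_withdrawn", "amount_cents": 200, "occurred_at": 2},
]

# Tear down and recompute whenever the derivation logic changes.
balance = replay(events, apply_balance, 0)  # 300
```

If a bug is found in the derivation, the fix is to correct `apply_balance` and replay; no stored aggregate has to be trusted or repaired by hand.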

We’re especially excited to release parts of our data stack as open source and will be presenting some of our experiences at upcoming events like Tableau’s TC15 conference: “Dwolla, Tableau, & AWS: Building Scalable Event-Driven Data Pipelines for Payments”.

We’ll take a deep dive into some of the concepts I’ve shared, as well as the nuts and bolts behind our data architecture, in particular how we’ve automated our pipelines with a soon-to-be-released open source project, Arbalest. Finally, we’ll show how applications (like Tableau’s business intelligence platform) can leverage our data platform.


This blog post shares insights from Fredrick Galoso, a software developer and technical lead for the data and analytics team here at Dwolla. Fred has led the creation of our data platform, including its data pipeline, predictive analytics, and business intelligence platform. In his free time he contributes to a number of open source software projects, including Gauss, a statistics and data analytics library.



Financial institutions play an important role in the Dwolla network.

Dwolla, Inc. is an agent of Veridian Credit Union and Compass Bank, and all funds associated with your account in the Dwolla network are held in pooled accounts at Veridian Credit Union and Compass Bank. These funds are not eligible for individual insurance, including FDIC insurance, and may not be eligible for share insurance by the National Credit Union Share Insurance Fund. Dwolla, Inc. is the operator of a software platform that communicates user instructions for funds transfers to Veridian Credit Union and Compass Bank.