How much of your data practice should you decentralize to different teams? What controls do you keep in a central team? How do you organize decentralized data ownership? These are questions our clients bring to our consultants every week. For example, at our last CDO dinner, executives preferred discussing self-service organization structures over GenAI, even though GenAI is booming! For them, the real data issues lie in the organizational transformation, and this is precisely where the data mesh concept can shed some light.
If you have similar questions, keep reading to find out what the data mesh can do for your company. And sign up for our webinar Data Mesh – Where Are We Now? on September 16.
Data Mesh - Where are we now?
The Data Mesh is an architecture paradigm that advocates decentralizing your data practice to make it as scalable as possible. The idea is to remove the bottleneck of a central, hyper-specialized engineering team by giving more autonomy to decentralized teams. Zhamak Dehghani introduced it back in 2019, and there have been many developments since.
Many companies have implemented the data mesh organization structure over the past five years, with mixed results, and many are still struggling to implement it. Here are four key findings we have put together based on our client experience.
First, almost nobody has implemented the concept as it was originally conceived, that is, doing everything in a decentralized way (scenario 5 in Figure 1). We have seen all kinds of variations that depend on the size of the organization, its vision of self-service, and the maturity of the data users. While inexperienced users might be content with ready-to-use dashboards offered by a central team, more mature users need to be able to combine and transform data themselves.
Figure 1: Different decentralization scenarios.
Only big corporations with high data maturity and substantial investment in data successfully implement the full data mesh (scenario 5). For smaller organizations, it is more advisable to adopt a variation of the data mesh that fits their needs. If your organization is really small, it’s probably better not to implement it at all. Ultimately, your decision to implement the data mesh concept depends on how many teams need self-service and how mature these teams are.
Second, some companies have favored a hub-and-spoke model over the data mesh. In terms of infrastructure, these two models are very similar. However, in the hub-and-spoke model, granting access and sharing data between departments is still very much centralized. In the data mesh, departments decide for themselves where, when, and with whom to share data.
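To make the difference in access control concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `DataProduct` class, the `grant_access` function, and the team names are assumptions, not part of any specific platform. It simply encodes the rule that in a hub-and-spoke setup only the central team may approve a share, while in a data mesh the owning department approves it.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical data product owned by one department."""
    name: str
    owner: str
    readers: set = field(default_factory=set)


def grant_access(product, requester, approver, model,
                 central_team="central-data-team"):
    """Grant read access if `approver` has authority under `model`.

    model == "hub_and_spoke": only the central team may approve.
    model == "data_mesh":     only the owning department may approve.
    Returns True if access was granted, False otherwise.
    """
    if model == "hub_and_spoke":
        allowed = approver == central_team
    elif model == "data_mesh":
        allowed = approver == product.owner
    else:
        raise ValueError(f"unknown model: {model}")
    if allowed:
        product.readers.add(requester)
    return allowed


if __name__ == "__main__":
    orders = DataProduct(name="orders", owner="sales")
    # In a data mesh, the owning department can share its product directly.
    print(grant_access(orders, "marketing", "sales", "data_mesh"))
    # In hub-and-spoke, the same approval by the owner is not sufficient.
    print(grant_access(orders, "finance", "sales", "hub_and_spoke"))
```

In practice such rules would live in your platform's access management rather than in application code; the sketch only shows where the decision authority sits in each model.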
Third, the tech stack has come a long way in supporting decentralized architectures. To stay in control of a decentralized data mesh, collecting and analyzing metadata is crucial. Among the tools we like to use to implement the data mesh are dbt, which tracks the lineage of data through transformations, and a catalog such as data.world, which increases the observability of your data products by visualizing metadata centrally.
Fourth, data literacy is key to implementing the data mesh. Rolling out self-service across the organization requires advanced training for users, developers, and core engineers. In addition, an understanding of data management and governance practices is essential so that people know how to deal with data quality issues and ownership. Everyone must be on the same page about what it means to manage a data product.
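As a rough illustration of why lineage metadata helps you stay in control, the following Python sketch walks a small, hand-written lineage graph to find everything a data product depends on. The `LINEAGE` dictionary and its model names are made up for this example, and the `upstream` function is a hypothetical helper; this is not dbt's or data.world's actual output format or API.

```python
def upstream(lineage, node):
    """Return every direct and indirect upstream dependency of `node`.

    `lineage` maps each dataset name to the list of datasets it is
    built from. Datasets without an entry are treated as raw sources.
    """
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical lineage metadata, as a central catalog might collect it
# from several decentralized teams.
LINEAGE = {
    "sales_dashboard": ["orders_clean", "customers_clean"],
    "orders_clean": ["raw_orders"],
    "customers_clean": ["raw_customers"],
}

if __name__ == "__main__":
    # Which datasets could affect the sales dashboard if they break?
    print(sorted(upstream(LINEAGE, "sales_dashboard")))
```

With this kind of metadata collected in one place, a central team can answer impact questions without owning the transformations themselves.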
So, where do I start?
As you can see, implementing the data mesh isn’t easy. We always recommend starting with a solid data strategy to decide which structure suits your company and in which way. Then, the right investments in people, processes, data, and technology will help you implement the model successfully. Do not underestimate how significant a change in ways of working this is. Executive sponsorship and clear communication are key.
If you want to learn more about how Data Mesh is implemented in practice, watch our on-demand webinar Data Mesh – Where Are We Now? In this webinar, we explained how the Data Mesh concept has evolved over the last five years and how it can benefit your organization. Three experienced practitioners shared their insights and answered all your questions. We hope to see you there!
Co-author: Juan Vegenas