As modern analytics continues its exciting evolution, it’s more important than ever to be mindful of the foundational elements that allow an organization’s analytics strategies to mature and evolve. Today, advanced analytics, data lakes, and cloud-based solutions dominate our industry headlines, and while these things will no doubt prove indispensable to any serious analytics practice, many of us are still faced with learning to walk before we can learn to run. The time-tested data warehouse remains just as relevant a cornerstone of a budding analytics practice today as it was a decade ago.
Why a Data Warehouse?
There’s just no getting around it: before stakeholders can turn their data into insights, that data needs to be structured in a way that lends itself to being easily understood – a central prerequisite to any analytics solution. Turning data into insights is, and should be, the ultimate goal, but it is difficult to build effective and advanced solutions without having a strong model to stand on.
Of the many data models designed to help organizations build a durable analytics platform, the Kimball dimensional model is widely regarded as the gold standard. It organizes data into fact tables of measurements surrounded by descriptive dimension tables, and it pushes data modelers to focus on building something that is fast and, more importantly, something that is easily understood.
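To make the idea concrete, here is a minimal sketch of a dimensional (star schema) layout using Python's built-in sqlite3 module. The table names, column names, and sample rows (dim_date, dim_product, fact_sales) are hypothetical and chosen only for illustration; a real model would come out of the discovery conversations described in this article, not from a template.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by two dimension tables (hypothetical names)."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key       INTEGER PRIMARY KEY,
            calendar_date  TEXT NOT NULL,
            fiscal_quarter TEXT NOT NULL
        );
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL,
            category     TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            date_key     INTEGER NOT NULL REFERENCES dim_date(date_key),
            product_key  INTEGER NOT NULL REFERENCES dim_product(product_key),
            quantity     INTEGER NOT NULL,
            sales_amount REAL NOT NULL
        );
    """)


def load_sample_rows(conn):
    """Insert a handful of made-up rows so the query below has something to read."""
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
        (20240115, "2024-01-15", "Q1"),
        (20240420, "2024-04-20", "Q2"),
    ])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
        (1, "Widget", "Hardware"),
        (2, "Support Plan", "Services"),
    ])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
        (20240115, 1, 10, 250.0),
        (20240115, 2, 1, 100.0),
        (20240420, 1, 4, 100.0),
    ])


def sales_by_category(conn):
    """Join the fact table to a dimension and total sales per product category."""
    rows = conn.execute("""
        SELECT p.category, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
load_sample_rows(conn)
print(sales_by_category(conn))
```

The business question ("sales by category") maps to a single join between the fact table and one dimension, which is a large part of why this style of model tends to be easy for stakeholders to read.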
After all, if an analytics solution does not impart understanding and insight, then what is it doing? While the technical requirements for building a data warehouse are generally well known and understood by the industry at this point, lurking behind IT’s requirements checklist is a far more critical (and almost completely non-technical) prerequisite: organizational consensus.
As consultants, we have a saying in the Business Intelligence/Analytics space when it comes to centralizing data as part of an analytics solution, and that is: “We want one version of the truth.” In other words, for the audience served by your analytics solution – and by extension the data warehouse within that solution – we want to use one set of terminology, one set of “rules,” and/or one set of agreed-upon use cases. How else do we arrive at “truth” if we are not using a common language to describe an agreed-upon definition of the process or organization that we seek to analyze? Quite simply, we cannot.
In fact, as the scope of a solution grows in size and in complexity, it becomes more critical to have a single truth in order to be successful. Indeed, simply deciding the scope of the solution itself should help organizations come to a consensus, but it isn’t always that simple when there are questions such as:
- Should the data warehouse merely be a massive structured repository of as much atomic-level data as we can collect?
- Should we seek to instead build data “marts,” defined as collections of use cases and intended to serve well-defined target audiences?
There is no one-size-fits-all answer here. Rather, these questions can only really be answered through meaningful discovery and conversation, ultimately allowing teams to arrive at a point of mutual agreement.
Arriving at Consensus
Somewhat unique to analytics solutions is the unvarnished way in which data can paint a picture of a process, or an organization, or a way of doing business. Much as the weather doesn’t particularly care what the forecasters are calling for, raw data begins to tell a story that is – especially as we continue to procure it from more and more sources – largely agnostic to any one interpretation or point of view.
For example, data generated by a process of selling goods or services may ultimately end up in the hands of both sales personnel and accounting staff – often undergoing quite a bit of disparate conditioning and massaging along the way in order to suit the particular needs of each division. When tracking toward a data warehouse as a singular repository of data, what will be the agreed-upon structure, naming, and rule sets that the data warehouse should incorporate?
The point here is not to answer a hypothetical question like the one above, but rather to emphasize that it’s still up to people (the stakeholders and the consumers), and the processes they establish, to determine these answers before any sort of computer architecture can be expected to. Know your audience. Know the use cases being presented. Bring to the table those with both relevant knowledge of the processes being discussed and an open mind to change. Give them a chance to be heard. And only then worry about knowing your source systems, or what keys your tables should use. After all, improved insight often brings about change, as we very rarely open our eyes only to discover that we’ve been doing things in the dark perfectly all along.
Do you want to learn more about building and maintaining your data warehouse? Contact us today! We are always happy to provide ideas and insights for you and your organization.