Scrum for Enterprise Data Warehouse and Business Intelligence

Scrum is difficult to master. It even says so in the official Scrum Guide. But the lessons you learn on your journey to mastery are very important. With 17+ years in BI and 8 years on my path to Scrum mastery, I’ve been through IT’s failures to deliver insightful and timely “Business Intelligence”. BI, and especially Data Warehousing, do not scream Scrum when you think of them. But there are certainly ways to tame the beast that is Scrum for EDW/BI.

Let’s start with the known issues an EDW/BI project brings to the table for Scrum.

  • Complex requirements – Shifting requirements, many data sources, varied data grains and load cycles by source.
  • Many users from all areas of the business – Everyone uses BI in some way, from back office workers to the CEO, all with different needs and levels of information access. If there ever was a monolithic application, EDW/BI typically comes in first.
  • Myth: Releasing less than 100% of everything needed is not useful – This generates the demand to build the entire DB first, which traditionally means your EDW is delivered late because the business keeps changing in the meantime. You could try to keep up with the change as it happens, but you might never deliver anything useful because you’re in perpetual change hell and you can’t deliver until the EDW is “finished”.
  • Desire to create the perfect schema – Due to the myth above, architects follow suit and are uncomfortable with delivering anything less than everything.
  • Automated Testing – Most automated tools aren’t set up to handle EDW/BI testing, so manual testing is used, which only adds to the delivery delay.
  • Cross functional teams – You can make a Scrum Team that contains all the skills needed to be cross-functional, but you most likely still won’t finish the majority of your stories in a given two-week Sprint. Most developers know either Data Modeling/ETL or BI tools, typically not both. Siloed skills mean more dependencies and hand-offs, which waste time within your Sprint.

So how do things change for EDW/BI projects when applying Scrum? Typically, the bottleneck in any sizable project is the database design and ETL work. It is therefore the part that needs the most change to fit into Scrum. Consider the picture below.

[Image: Scrum for Enterprise Data Warehouse and Business Intelligence Projects, image 2]

The Enterprise Business Intelligence (EBI) portion of the project is typically straightforward if you are building a proper EDW, one where most of the business logic is handled by the Enterprise Information Management (EIM) portion of the stack. Doing so takes advantage of the horsepower of the DB/ETL servers, and the front-end servers can instead focus on their typical job, mainly generating visualizations and caching commonly used data sets for quick retrieval.

With so much of the project work reliant on the EIM developers, you might ask how you can possibly create functioning slices of an analytical/enterprise reporting system incrementally.

The trick is simple: visualize the end result first. Doing so will focus the EIM design and development on small batches of work, specifically the work required to create the desired front-end result. If that vertical slice of functionality can’t be completed through a typical “Definition of Done” (design, development, unit testing, deployment to QA, QA itself, and Product Owner approval) within your Sprint, then think smaller.

Below are some ideas to break up the work and increase throughput:

  • Wireframe the End Result – This can be accomplished on a whiteboard, with tools like Excel, Visio or Gliffy, or even by something as simple as drawing on paper.
  • Use Components – If your dashboard or report is still too large to complete, discuss how you can break up the deliverable into parts. This might be more difficult for a single report, but it works great for dashboards. Dashboards usually have multiple charts, screens, and navigation, all of which are good candidates for vertically sliced user stories.
  • Use Business Concepts – Usually the business has specific categories of questions it would like to gain insights from, and those questions can be grouped into themes or “Business Concepts”. Each concept would be a single star schema in the DB with conformed dimensions. This way dimensions can be reused from concept to concept where applicable. These groupings of facts and dimensions can support enterprise-level analytics, as well as self-service uses such as data discovery. Each Business Concept contains all the tables needed to answer the questions from that theme.
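To make the Business Concepts idea a little more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. Everything in it is a hypothetical illustration rather than a recommended schema: the table names (dim_customer, fact_sales, fact_support_ticket), the columns, and the sample rows are all assumptions. It shows two concepts, Sales and Support, each with its own fact table, both reusing one conformed customer dimension.

```python
import sqlite3


def build_concepts(conn):
    """Create two hypothetical Business Concepts sharing one conformed dimension."""
    conn.executescript("""
        -- Conformed dimension, reused by both concepts
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY,
            region TEXT
        );
        -- Business Concept 1: Sales
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            amount REAL
        );
        -- Business Concept 2: Support
        CREATE TABLE fact_support_ticket (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            minutes_to_resolve INTEGER
        );
        INSERT INTO dim_customer VALUES (1, 'East'), (2, 'West');
        INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 75.0);
        INSERT INTO fact_support_ticket VALUES (1, 30), (2, 10), (2, 20);
    """)


def sales_by_region(conn):
    """Answer a Sales question through the shared customer dimension."""
    rows = conn.execute("""
        SELECT d.region, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS d ON d.customer_key = f.customer_key
        GROUP BY d.region
    """).fetchall()
    return dict(rows)


def tickets_by_region(conn):
    """Answer a Support question through the same customer dimension."""
    rows = conn.execute("""
        SELECT d.region, COUNT(*)
        FROM fact_support_ticket AS f
        JOIN dim_customer AS d ON d.customer_key = f.customer_key
        GROUP BY d.region
    """).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_concepts(conn)
    print(sales_by_region(conn))
    print(tickets_by_region(conn))
```

Because both fact tables join to the same dim_customer, a second concept can be delivered in a later Sprint without reworking the dimension that the first concept already shipped with.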

[Image: Scrum for Enterprise Data Warehouse and Business Intelligence Projects, image 3]

Next, let’s tackle how Scrum changes for EDW/BI projects. Surprisingly little, actually. Let’s have a look by role:

Product Owner for BI

  • Complex requirements – Preferably the PO has a BI background. Someone like this might be hard to find, so you might be better off finding the person within the business who is most passionate about the company’s data and teaching them to be the PO. Handling the myriad of ever-changing requirements coming in from all directions will be challenging otherwise. Teach the PO to lean on their data SMEs from within the business. They might be one themselves, but they most likely don’t know everything. Backlog Refinement will be key for the project’s success, as the PO and Data SMEs will need to collaborate often on the content and priority of the backlog. The quest of the PO will be to find quick value wins within what might seem like a mountain of complex requirements.
  • Many users from all areas of the business – Since there will be a large cross-section of end users, you will most likely want to mimic that with your end-user testing. Finding the right mix and size will be challenging: too many people will slow you down, but the wrong people can also hurt your quality and value. Find people who can be reliably available on the cadence of your sprint cycle. Quicker feedback is an important reason for doing Scrum in the first place. If we’re going to stumble, let’s get it over with and get back on the right track as quickly as possible. Therefore, Sprint Reviews will be key to your progress. Make sure the PO knows how to orchestrate this highly visible meeting, and make sure they are proficient with the front-end tool.
  • Myth: Releasing less than 100% of everything needed is not useful – Data is becoming more and more important to a company’s success. However, unsuccessful implementations have plagued our industry for years and have spawned an industry trend of enabling the business to circumvent IT. So, if you have read this far and you’re still not convinced there are ways to iteratively and incrementally deliver data, then I hope you can tread water. In this “Age of Acceleration”, business moves at a pace that is hard to keep up with. Not releasing early and often will eventually become your sinking ship. Use Business Concepts and components to your advantage and write User Stories around them. I’ve been using these for 10 years, and they haven’t failed me yet.

Scrum for BI Development Team

  • Desire to create the perfect schema – Data Warehousing has been done one way for a long time. Teaching someone a new way is hard; it takes time and patience. The “perfect schema” is the carrot on the stick: architects should be constantly chasing it, but it should not become the destination. Visualizing the schema design from the front-end requirements backwards, and doing that iteratively, makes some people very uncomfortable. Remember, though, that there is going to be rework regardless, because the business is going to change its mind often. You can build it wrong now and know it sooner, or you can build it wrong now and know a few months from now. The “perfect schema” will never become reality until the backlog is empty.
  • Automated Testing – This is the single most difficult hurdle to overcome with EDW/BI projects and Scrum. There is no way of being successful with Scrum unless you are using automated testing tools and you have a clean, known, and small data set to test against. Maybe you take a production copy of data and pare it down to known test cases that can be run repeatedly as the code is enhanced, or maybe you mock up data to fit those scenarios. Either way, the person who typically does your EDW/BI testing will need to change their testing skill set to incorporate automated testing tools. The best one I’ve found is MotioCI; unfortunately, it only works with IBM Cognos. I would love to see this tool work for other BI platforms like SAP Business Objects, MicroStrategy, and Microsoft Power BI. Maybe you can find an equivalent that is already available.
  • Cross functional teams – This is probably the second most difficult thing to overcome. By definition, a cross-functional team is a group of people who cover the skills required to take any given set of work from a backlog to a point of “Done”. However, that alone does not completely solve the problem, as siloed skills will still largely exist. The ultimate goal here is to cross-train the team as quickly as possible while keeping an average velocity that is still respectable to the PO. That is sometimes easier said than done. So maybe instead you focus on finding ways to speed up the hand-offs between the various silos. Applying Kanban to Scrum can certainly help with this approach. You might find that development tools well suited to Scrum development help decrease your cycle time. I’m looking at you, Embarcadero and WhereScape. These tools in the hands of experts are amazing!
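As an illustration of the “clean, known, and small data set” idea from the Automated Testing point above, here is a minimal sketch in plain Python. It is not tied to any BI platform or testing tool; the rows, the expected totals, and the function names (transform_daily_sales, find_mismatches) are all hypothetical. The known data deliberately includes a duplicate order and a missing amount, so the same check can be rerun every time the transformation logic changes.

```python
from collections import defaultdict

# Small, known source data set (hypothetical), pared down to specific test cases.
KNOWN_SOURCE_ROWS = [
    {"order_id": 1, "order_date": "2024-01-01", "amount": 100.0},
    {"order_id": 2, "order_date": "2024-01-01", "amount": 50.0},
    {"order_id": 2, "order_date": "2024-01-01", "amount": 50.0},  # duplicate order
    {"order_id": 3, "order_date": "2024-01-02", "amount": None},  # missing amount
    {"order_id": 4, "order_date": "2024-01-02", "amount": 25.0},
]

# Expected result for the known data, agreed on ahead of time with the data SMEs.
EXPECTED_DAILY_SALES = {"2024-01-01": 150.0, "2024-01-02": 25.0}


def transform_daily_sales(rows):
    """Aggregate orders to a daily grain, skipping duplicates and null amounts."""
    seen_order_ids = set()
    totals = defaultdict(float)
    for row in rows:
        if row["order_id"] in seen_order_ids or row["amount"] is None:
            continue
        seen_order_ids.add(row["order_id"])
        totals[row["order_date"]] += row["amount"]
    return dict(totals)


def find_mismatches(rows, expected):
    """Return {day: (expected, actual)} for every day where the totals differ."""
    actual = transform_daily_sales(rows)
    days = set(expected) | set(actual)
    return {
        day: (expected.get(day), actual.get(day))
        for day in days
        if expected.get(day) != actual.get(day)
    }


if __name__ == "__main__":
    mismatches = find_mismatches(KNOWN_SOURCE_ROWS, EXPECTED_DAILY_SALES)
    print("PASS" if not mismatches else f"FAIL: {mismatches}")
```

An empty result means the transformation still produces the agreed numbers. In a real project the same pattern would run against your actual ETL output rather than an in-memory function, ideally as part of every build.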

Scrum Master for BI

Being a Scrum Master on an EDW/BI project is fun if you like challenges. It’s hard to find an experienced person who knows enough to understand the conversations that surround these complex projects. Your typical SM might fail miserably if they are coming from application development teams. It takes a rare breed to understand data modeling, ETL/ELT, semantic modeling, BI development, automated testing, and release management, and also be proficient enough in Scrum to help a team stay sane on projects with the size, speed, and complexity of your typical EDW/BI project.

Some things you’ll want to keep in mind:

  • Make sure each sprint, even the first one, has at least one user-facing story. Work with the Scrum Team to find that diamond in the rough. Each new sprint should come closer to being completely full of vertically sliced, business-facing value stories.
  • Keep the Scrum Team honest about sticking to vertical slices of functionality. They will readily gravitate back to the status quo and want to create horizontal slices.
  • Keep the PO and Data SMEs engaged. A healthy backlog of work that is readily available for the team will keep everyone happy.

I hope you found this article helpful. If you are interested in incorporating Scrum into your next BI project, contact a Data and Analytics professional.