Agile for Data Warehousing

From AgileOpenNorthwest


This session was focused specifically on Data Warehouse projects within an Agile development environment.

A group of us agreed to meet again, at a time and place yet to be determined, to continue the discussion. If you are interested in joining the discussion and/or extending it to form an Agile Data Warehouse user group, please contact Lynne Meddaugh (lmeddaugh@gmail.com).

Some of the challenges discussed include:

  • Slicing stories
  • How to fit in analysis on source data and schema designs
  • The balance between small vertical stories and the need for a larger vision to create an optimal warehouse design

Suggestions for vertical stories:

  • Front-end only (BI report): get feedback on the report first, then build the backend design in pieces that support it.
  • Stage data from source as is; no transformation. Slowly transform data as needed in subsequent sprints.
  • Focus on one aggregate at a time
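
The "stage as is, transform later" suggestion above can be sketched as follows. This is only an illustration under stated assumptions: the table name `stg_orders`, the view `orders_clean`, the columns, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source rows, kept exactly as received (all text, untrimmed).
SOURCE_ROWS = [
    ("1001", " 2024-01-05 ", "19.99"),
    ("1002", "2024-01-06", "5.00"),
]


def stage_as_is(conn, rows):
    """Early sprint: copy source rows into staging with no transformation."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stg_orders "
        "(order_id TEXT, order_date TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", rows)


def add_typed_view(conn):
    """Later sprint: add only the transformations the current story needs."""
    conn.execute(
        """
        CREATE VIEW IF NOT EXISTS orders_clean AS
        SELECT CAST(order_id AS INTEGER) AS order_id,
               TRIM(order_date)          AS order_date,
               CAST(amount AS REAL)      AS amount
        FROM stg_orders
        """
    )


conn = sqlite3.connect(":memory:")  # stand-in for the real warehouse
stage_as_is(conn, SOURCE_ROWS)
add_typed_view(conn)

staged = conn.execute("SELECT * FROM stg_orders ORDER BY order_id").fetchall()
clean = conn.execute("SELECT * FROM orders_clean ORDER BY order_id").fetchall()
```

Because the staging table is left untouched, each later story can add or change a transformation without reloading the source.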

Instead of doing vertical cuts from source to BI, two suggestions were made for structuring sprints:

  • Data source 1 > ETL > Report/Visualization > Backfill. Follow up with Data source 2 (and beyond).
  • Analyze source data > Design and create schema > Source to stage > Stage to dimensions > Fact load > Aggregates > Backfill
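
To make the second sequence concrete, here is a minimal sketch of the load steps (source to stage, stage to dimensions, fact load, aggregates) as small functions, each of which could be its own story. The data, field names, and function names are hypothetical, and plain Python structures stand in for warehouse tables.

```python
from collections import defaultdict

# Hypothetical source extract.
source = [
    {"customer": "acme", "day": "2024-01-05", "amount": 20.0},
    {"customer": "acme", "day": "2024-01-06", "amount": 5.0},
    {"customer": "zenith", "day": "2024-01-06", "amount": 7.5},
]


def source_to_stage(rows):
    """Copy source rows into staging unchanged."""
    return [dict(row) for row in rows]


def stage_to_dimension(stage):
    """Assign a surrogate key to each distinct customer, in order of arrival."""
    dim = {}
    for row in stage:
        dim.setdefault(row["customer"], len(dim) + 1)
    return dim


def load_facts(stage, customer_dim):
    """Replace the natural customer key with the dimension's surrogate key."""
    return [
        {
            "customer_key": customer_dim[row["customer"]],
            "day": row["day"],
            "amount": row["amount"],
        }
        for row in stage
    ]


def build_aggregate(facts):
    """Total amount per customer key."""
    totals = defaultdict(float)
    for fact in facts:
        totals[fact["customer_key"]] += fact["amount"]
    return dict(totals)


stage = source_to_stage(source)
customer_dim = stage_to_dimension(stage)
facts = load_facts(stage, customer_dim)
totals_by_customer = build_aggregate(facts)
```

Each function takes only the previous step's output, so a sprint can deliver and test one step before the next one exists.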

Questions:

  • Can warehouse design really be cut vertically when it could take extensive backend work to create one report?
  • How do you create stories around the much-needed source analysis and schema design?
  • How do you analyze and design in small pieces when good warehouse design is so critical and so difficult to correct once implemented?

Methods of getting to smaller stories:

  • Each story should ask a specific question (e.g., how many repeat visitors do we get on our sites within a 30-day window?).
  • If someone requests a report, ask how the report will be used.
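
A question as specific as the repeat-visitor example above is small enough to prototype directly. The sketch below assumes one possible definition, which the session did not specify: a repeat visitor is anyone with visits on at least two distinct days in the 30 days ending on a given date. The visit data and function name are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical visit log: (visitor_id, visit_day).
visits = [
    ("v1", date(2024, 3, 5)),
    ("v1", date(2024, 3, 20)),
    ("v2", date(2024, 3, 10)),
    ("v3", date(2024, 1, 15)),
    ("v3", date(2024, 3, 25)),
]


def count_repeat_visitors(visits, window_end, window_days=30):
    """Count visitors seen on two or more distinct days in the window.

    The window is the `window_days` days ending on `window_end`, inclusive.
    """
    window_start = window_end - timedelta(days=window_days)  # exclusive bound
    days_seen = {}
    for visitor_id, visit_day in visits:
        if window_start < visit_day <= window_end:
            days_seen.setdefault(visitor_id, set()).add(visit_day)
    return sum(1 for days in days_seen.values() if len(days) >= 2)


repeat_count = count_repeat_visitors(visits, window_end=date(2024, 3, 31))
```

Writing the question down this way forces the definition of "repeat" and of the window to be agreed with the requester before any warehouse design is done.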

