Thursday, August 20, 2009

Reports
In a DW environment, there are various ways to create reports. A dedicated report-writing tool such as Crystal Reports may be used, and many BI suites (such as Business Objects) include tools like Web Intelligence (WebI) that enable users to create their own reports.
I will discuss report testing in general terms, without getting into the details of any single tool.
Every report has a front end and a back end. The back end is, of course, the DW database; most reports read their data from fact tables.
The front end is the body of the report that the user actually sees: a chart (bar, line, pie, donut, etc.), a diagram (Venn, etc.), or just a two-dimensional table.

Each report has an associated query (or set of queries) that runs against the database and retrieves rows. The logic for processing and displaying the retrieved data is another aspect of a report. In a Business Objects environment, it is the Universe that contains all this logic.
Based on the discussion so far, the following are possible test scenarios for reports in a DW environment:
- Queries: As with any software, a report will (or should) have a specification document. The document specifies which information is to be displayed, and in which format.
Verify that the query (or queries) associated with the report actually retrieves from the database the data it is supposed to. This test targets the functional correctness of the report query.
Additionally, if the scope provides for it, the query may also be tested for performance (examining joins, access paths, the EXPLAIN plan, etc.).

- Output: The output of a report should be tested for the following:
. Report inputs
. Report layout (bar, pie, donut, tabular, etc.)
. Filter conditions (corresponding to the WHERE clause)
. Grouping conditions (corresponding to the GROUP BY clause)

- Report Contents: This is the most important part of reports testing. As a user, I run a report, I see the data, and I am happy. But how do I know that what I see is what it should be? In other words, how do I know that the data displayed in the report is accurate? That is the billion-dollar question.

The answer goes back to the ETL process that populated the DW database. If the ETL is designed properly, and it runs error-free, then we can have some level of confidence in the accuracy of the report. That is why it is almost always advisable to perform ETL QA before reports QA. I would go one step further and say that reports should not even be developed before the ETL is signed off. This may sound like paranoia, and it may not always be possible in the real world.
There are two aspects to this issue: one, whether the data retrieved by the query is accurate; two, whether the data-processing logic built into the report is accurate. Hence, two sets of tests are required.
- First, the associated query is run on the DW database. Remember, the data retrieved by the query is raw data; it has not yet been processed. This data needs to be compared with the source data (that is, the source of the ETL). This can be a little tricky, since the source can be almost any data structure. If the source is also a database, then the report query can be 'translated' to run on the source database, and the data it retrieves can be compared with the data retrieved by the report query.
Again, remember that the two data sets (source and target) may not be identical, since the whole ETL logic sits between them. So, some level of manual judgment is involved here.

If there are any discrepancies at this step, they point to the ETL design (assuming that the report query itself is accurate).

- Once we are sure that the data we retrieved is accurate, we need to know how it is processed to produce the final output. For example, if we have a bar chart that shows percentage revenue growth across regions, then we should compute the required percentages from the raw data ourselves and compare them with what we see in the report. This way, we can verify the report output.
