Want to build a kind of a scientific fact search and analytical engine for a lot of different types of measurements hosted in a lot of source tables and columns.
A star join schema is built and data is copied (ETL) from the sources as usual - but all source column values are stored into a single fact-table value column and each column name is represented by a surrogate foreign key referencing a dimension table where the source table and column names are prestored.
There are several other dimensions as well, as usual. In this design ER-structures in the source database are deliberately abandoned, the only structure that is preserved is repeating groups through a simple serial number dimesion, so that e.g. time series values and depth intervals (stratigraphy) are chained. One might have a second fact value column for alphanumeric codes.
There will be a lot of rows in the fact table, but rather short ones. The star join database can be searched with SQL and simple statistics may be presented or one may take resulting datasets to other analytical tools, e.g. Excel. The SQL in queries may sometimes be tricky in this design. One might create (with SQL-scripts) smaller and specialized data marts on different topics from this data warehouse.
An interesting quality is that space coordinates in the fact table open up for unlimited possibilities to select and present data on digital maps from a GIS tool.
/RWg, Geological Survey of Sweden.
Oracle SQL*Plus, ArcView, MapInfo
November 25, 2009 12:07 PM
December 22, 2009 10:33 AM