Data warehouse

In 1990, Bill Inmon coined the term “data warehouse”. According to Inmon, a data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data. This data helps in evaluating an enterprise so that it can make informed decisions.

A data warehouse gives us structured, integrated data viewed from a multidimensional perspective.

A data warehouse also provides Online Analytical Processing (OLAP) tools along with a simplified and integrated view of the results. These tools help us analyze data in a multidimensional space in an interactive and effective way. This analysis leads to data generalization and data mining.

  • A data warehouse is a database that is kept separate from the organization's operational database.
  • A data warehouse is not updated frequently.
  • It holds consolidated historical data, which helps the organization analyze its business.
  • A data warehouse helps executives organize, understand and use their data to make strategic decisions.
  • Data warehouse systems help integrate a diversity of application systems.

Why is a data warehouse kept separate from operational databases?

A data warehouse is kept separate from operational databases for the following reasons:

  • An operational database is built for well-known tasks and workloads, such as searching for particular records and indexing. In contrast, data warehouse queries are often complex and present data in a general, summarized form.
  • Operational databases support concurrent processing of multiple transactions, so they require concurrency control and recovery mechanisms to ensure robustness and consistency. Long-running analytical queries on the same database would compete with those transactions.
  • An operational database query performs both read and modify operations, while an OLAP query needs only read-only access to stored data.
  • An operational database maintains current data, whereas a data warehouse maintains historical data.
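
The contrast between the two query styles can be sketched with Python's built-in sqlite3 module. This is only an illustration under simplifying assumptions: the `orders` table, its columns, and both function names are hypothetical, and a single in-memory database stands in for what would normally be two separate systems.

```python
import sqlite3

# Hypothetical in-memory database; real systems would keep these workloads apart.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, "
    "order_year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "alice", 2022, 120.0),
        (2, "bob", 2022, 80.0),
        (3, "alice", 2023, 200.0),
        (4, "carol", 2023, 50.0),
    ],
)


def oltp_update_amount(order_id, new_amount):
    """Operational-style work: read and modify one known record."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE orders SET amount = ? WHERE order_id = ?",
            (new_amount, order_id),
        )
    return conn.execute(
        "SELECT amount FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()[0]


def olap_revenue_by_year():
    """Warehouse-style work: a read-only query that summarizes many records."""
    rows = conn.execute(
        "SELECT order_year, SUM(amount) FROM orders "
        "GROUP BY order_year ORDER BY order_year"
    )
    return dict(rows.fetchall())
```

The first function touches a single row inside a short transaction, while the second only reads and aggregates across the whole table, which is the kind of access pattern a warehouse is tuned for.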

Data Warehouse Features

  • Subject-oriented – A data warehouse is subject-oriented because it provides information around a subject rather than the organization's ongoing operations. These subjects can be products, customers, suppliers, sales, revenue, etc. A data warehouse does not focus on ongoing operations; rather, it focuses on modeling and analysis of data for decision-making.
  • Integrated – A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc. This integration makes analysis of the data more effective.
  • Time-variant – The data collected in a data warehouse is identified with a particular time period. The data in a data warehouse provides information from a historical point of view.
  • Non-volatile – Non-volatile means that previous data is not erased when new data is added. A data warehouse is kept separate from the operational database, and therefore frequent changes in the operational database are not reflected in the data warehouse.
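
As a rough illustration of the integrated, time-variant and non-volatile properties, the sketch below merges two differently shaped sources into one schema, stamps every row with a snapshot date, and only ever appends. The sample data, the field names, and the functions `integrate`, `load` and `history` are all assumptions made up for this example.

```python
import csv
import io
from datetime import date

# Hypothetical source 1: a flat file with its own column names.
FLAT_FILE = "cust,total\nalice,120\nbob,80\n"

# Hypothetical source 2: rows from a relational system with different names.
RELATIONAL_ROWS = [{"customer_name": "carol", "amount": 50.0}]


def integrate(flat_file_text, relational_rows, snapshot_date):
    """Integrated + time-variant: one schema, each row stamped with a date."""
    unified = []
    for row in csv.DictReader(io.StringIO(flat_file_text)):
        unified.append({"customer": row["cust"],
                        "amount": float(row["total"]),
                        "snapshot_date": snapshot_date})
    for row in relational_rows:
        unified.append({"customer": row["customer_name"],
                        "amount": float(row["amount"]),
                        "snapshot_date": snapshot_date})
    return unified


def load(warehouse, new_rows):
    """Non-volatile: append new rows, never update or delete earlier ones."""
    return warehouse + new_rows


def history(warehouse, customer):
    """Every recorded value for one customer, in load order."""
    return [(r["snapshot_date"], r["amount"])
            for r in warehouse if r["customer"] == customer]


warehouse = []
warehouse = load(warehouse,
                 integrate(FLAT_FILE, RELATIONAL_ROWS, date(2023, 1, 31)))
# A later snapshot is stored alongside the earlier one instead of replacing it.
warehouse = load(warehouse,
                 integrate("cust,total\nalice,150\n", [], date(2023, 2, 28)))
```

Calling `history(warehouse, "alice")` returns both the January and the February value, since the later load did not overwrite the earlier one.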

Types of data warehouse applications

Information processing, analytical processing and data mining are the three types of data warehouse applications. They are discussed below:

  • Information processing – A data warehouse allows the data stored in it to be processed. The data can be processed by means of querying, basic statistical analysis, and reporting using crosstabs, tables, charts, or graphs.
  • Analytical processing – A data warehouse supports analytical processing of the information stored in it. The data can be analyzed by means of OLAP operations, including slice-and-dice, drill up, and pivoting.
  • Data mining – Data mining supports knowledge discovery by finding hidden patterns and associations, constructing analytical models, and performing classification and prediction. The mining results can be presented using visualization tools.
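
The OLAP operations named above can be sketched in plain Python on a tiny in-memory cube. This is a minimal illustration, not how an OLAP engine is implemented; the sample data and the helper names `slice_`, `dice`, `roll_up` and `pivot` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales cube: each cell is (product, region, year, amount).
SALES = [
    ("laptop", "east", 2022, 100),
    ("laptop", "west", 2022, 150),
    ("phone", "east", 2022, 80),
    ("laptop", "east", 2023, 120),
    ("phone", "west", 2023, 90),
]
DIMS = {"product": 0, "region": 1, "year": 2}


def slice_(cells, dim, value):
    """Slice: fix one dimension to a single value."""
    return [c for c in cells if c[DIMS[dim]] == value]


def dice(cells, **allowed):
    """Dice: keep cells whose values fall in the allowed sets per dimension."""
    return [c for c in cells
            if all(c[DIMS[d]] in vals for d, vals in allowed.items())]


def roll_up(cells, *keep):
    """Drill up: sum amounts while keeping only the named dimensions."""
    totals = defaultdict(int)
    for c in cells:
        totals[tuple(c[DIMS[d]] for d in keep)] += c[3]
    return dict(totals)


def pivot(cells, row_dim, col_dim):
    """Pivot: lay out totals with one dimension as rows, another as columns."""
    table = defaultdict(dict)
    for (r, c), total in roll_up(cells, row_dim, col_dim).items():
        table[r][c] = total
    return dict(table)
```

For example, `slice_(SALES, "year", 2023)` keeps only the 2023 cells, and `pivot(SALES, "region", "year")` produces region rows with one total per year.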

The table below compares a data warehouse (OLAP) with an operational database (OLTP).

| S.No | Data Warehouse (OLAP) | Operational Database (OLTP) |
|------|-----------------------|-----------------------------|
| 1 | It involves historical processing of information. | It involves day-to-day processing. |
| 2 | It is used by knowledge workers such as executives, managers, and analysts. | It is used by clerks, DBAs, or database professionals. |
| 3 | It is used to analyze the business. | It is used to run the business and is application oriented. |
| 4 | It focuses on information out. | It focuses on data in. |
| 5 | It is based on the star schema, snowflake schema, and fact constellation schema. | It is based on the entity-relationship model. |
| 6 | It contains historical data. | It contains current data. |
| 7 | It provides summarized and consolidated data. | It provides primitive and highly detailed data. |
| 8 | The number of users is in the hundreds. | The number of users is in the thousands. |
| 9 | The number of records accessed is in the millions. | The number of records accessed is in the tens. |
| 10 | The database size is from 100 GB to 100 TB. | The database size is from 100 MB to 100 GB. |
| 11 | It is highly flexible. | It provides high performance. |
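
Row 5 mentions the star schema. As a rough sketch of what that looks like, the example below builds one fact table surrounded by two dimension tables in an in-memory SQLite database and summarizes the facts along both dimensions. All table names, column names and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the subjects (hypothetical names).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
    -- The central fact table holds a measure plus a key into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'hardware'), (2, 'software');
    INSERT INTO dim_date VALUES (10, 2022), (11, 2023);
    INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 150.0),
                                  (2, 10, 40.0), (2, 11, 60.0), (1, 11, 50.0);
""")


def sales_by_category_and_year():
    """Join the fact table to its dimensions and total the measure."""
    return conn.execute("""
        SELECT p.category, d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """).fetchall()
```

The query reads many fact rows and returns a handful of summarized ones, which matches the "summarized and consolidated data" row of the table.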
