Data warehouse (DW or DWH) also known as an enterprise data warehouse (EDW)
What is Data warehouse (DW or DWH):
There is a lot of different definitions for a data warehouse. The simple and popular definition was first defined by Bill Inmon in 1990.“A data warehouse is a subject-oriented, integrated, time-varying, non-volatile collection of data in support of the management's decision-making process."
-- Bill Inmon
Subject-oriented: Used to analyze a particular subject area. For example, Sales, Customers, Patients, Students, Products etc.
Integrated: A data warehouse integrates, consistent naming conventions, formats, encoding structures from multiple data sources into a consistent format. Data should be free of issue such as naming conflicts and variation among units of measure.
Time-Variant: Historical data is stored in a data warehouse i.g. trends and changes for reporting. For example, Query of data can be from any time of period. OLTP may hold the most recent records, whereas data warehouse can hold all records.
Non-volatile: Data once entered into the warehouse, data is never change or delete. The data is fixed, read-only, and retained to generate report and it enables you to analyze what has occurred.
A single-subject data warehouse is typically referred to as a data mart, while data warehouses are generally enterprise in scope. Also, data warehouses can be volatile. Due to the large amount of storage required for a data warehouse, (multi-terabyte data warehouses are not uncommon), only a certain number of periods of history are kept in the warehouse. For instance, if three years of data are decided on and loaded into the warehouse, every month the oldest month will be "rolled off" the database, and the newest month added.
-- (Source: "What is a Data Warehouse?" W.H. Inmon, Prism, Volume 1, Number 1, 1995)
A data warehouse is "a copy of transaction data specifically structured for query and analysis".
-- Ralph Kimball, book-"The Data Warehouse Toolkit, Page 310"
“A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context.”
-- Barry Devlin, IBM Consultant
Below is the list of point to remember:
- A data warehouse is a centralized by integrating data from multiple heterogeneous sources.
- It stores the Large volume of data (GB/TB).
- Updates infrequent and append-only.
- Queries are very long and complex.
- More number of users connect to it (e.g. decision-makers, analysts).
- It is used to analytical reporting, ad hoc or user querying and analysis for decision making.
- A data warehouse is a relational database that contains historical data for analysis.
- In Data warehouse update or delete doesn't happens and it is kept separate from OLTP system.
- Data warehouse environment includes an ETL process (Extract, Transport, and Load), an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data, transforming it into useful, actionable information, and delivering it to business users .
- Maintaining historical records
- Various data mart is used that is a simple form of a data warehouse that is focused on a single subject (or functional area), such as sales, finance or marketing. Data marts are often built and controlled by a single department within an organization.
- Click on the link to know about Data Warehouse Architecture / Diagram / Components
This information is impressive; I am inspired with your post writing style & how continuously you describe this topic. After reading ,
ReplyDeleteyour post, thanks for taking the time to discuss this, I feel happy about it and I love learning more about this topic.
Regards
Informatica courses in Chennai|Informatica institutes in Chennai