Sunday, June 7, 2015

BI DWH OLTP DRAFT

Introduction to Basic BI Concepts
What is Business Intelligence
Definitions
•Howard Dresner’s definition - business intelligence is an umbrella term to describe ‘concepts and methods to improve business decision making by using fact-based support systems’
•Business intelligence (BI) is the set of techniques and tools that transform raw data into meaningful and useful information for the purpose of business analysis

What can we understand from the Definitions?
•concepts and methods / techniques and tools
–BI is more than a set of tools and technologies
•business decision / business analysis
–BI is driven by the need of business to take informed decisions
•fact-based / raw data  useful information
–Analysis is based on raw data / facts

What is data / “raw facts”
Data / Raw Facts
•Facts generated by operational workflow of a business process
–This training has generated data
»Topic, trainer name, date, duration, info about participants …
•Facts generated by non-business events
–Weather information
–Information gathered by traffic sensor
•Can be digital, paper, structured, unstructured, semi-structured
•Can be recorded somewhere or just in a person’s head


Where are raw facts / data created?

OLTP System
•An application or modules of application that automates a specific area of business operations workflow, such as pre-sales, sales, procurement, r&d, accounting etc.
•Implements current business rules that govern operations
•Focuses on managing current data slice rather than historical data
–historical data may be purged / archived to improve database performance or to reduce data storage costs or implement new business rules that earlier data does not comply with
•Focuses on managing large volumes of simultaneous transactions, involving low volumes of data per transaction
•May support operational reporting, but full-fledged analysis is not the core purpose
•Data repository usually RDBMS (exceptions are mainframe / COBOL applications etc. & non-transaction data such as ATM logs, network signals, traffic signal data, etc.). Data model is usually normalized


Saturday, June 6, 2015

OLTP - On-line Transaction Processing

Introduction to OLTP:


What is OLTP?

Definition:
OLTP stand for "On-line Transaction Processing", it is an operational database that supports or facilitates high transaction-oriented applications. Operational data means those data occupied in the daily transactions from a particular system.

It is used in traditional database that includes insert, update, and delete operations, while also perform to obtain information using on inquiry requirement.

Example of OLTP:
Banks, Airlines, Retailers, Railways etc.

Data warehouse (DWH) Introduction - Design & Concepts

Data warehouse (DW or DWH) also known as an enterprise data warehouse (EDW)


What is Data warehouse (DW or DWH):

There is a lot of different definitions for a data warehouse. The simple and popular definition was first defined by Bill Inmon in 1990.

“A data warehouse is a subject-oriented, integrated, time-varying, non-volatile collection of data in support of the management's decision-making process."
-- Bill Inmon

Subject-oriented: Used to analyze a particular subject area. For example, Sales, Customers, Patients, Students, Products etc.

Integrated: A data warehouse integrates, consistent naming conventions, formats, encoding structures from multiple data sources into a consistent format. Data should be free of issue such as naming conflicts and variation among units of measure.

Time-Variant: Historical data is stored in a data warehouse i.g. trends and changes for reporting. For example, Query of data can be from any time of period. OLTP may hold the most recent records, whereas data warehouse can hold all records.

Non-volatile: Data once entered into the warehouse, data is never change or delete. The data is fixed, read-only, and retained to generate report and it enables you to analyze what has occurred.

A single-subject data warehouse is typically referred to as a data mart, while data warehouses are generally enterprise in scope. Also, data warehouses can be volatile. Due to the large amount of storage required for a data warehouse, (multi-terabyte data warehouses are not uncommon), only a certain number of periods of history are kept in the warehouse. For instance, if three years of data are decided on and loaded into the warehouse, every month the oldest month will be "rolled off" the database, and the newest month added.
-- (Source: "What is a Data Warehouse?" W.H. Inmon, Prism, Volume 1, Number 1, 1995)

A data warehouse is "a copy of transaction data specifically structured for query and analysis".
-- Ralph Kimball, book-"The Data Warehouse Toolkit, Page 310"

“A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context.”
-- Barry Devlin, IBM Consultant

Below is the list of point to remember:
  • A data warehouse is a centralized by integrating data from multiple heterogeneous sources.
  • It stores the Large volume of data (GB/TB).
  • Updates infrequent and append-only.
  • Queries are very long and complex.
  • More number of users connect to it (e.g. decision-makers, analysts).
  • It is used to analytical reporting, ad hoc or user querying and analysis for decision making. 
  • A data warehouse is a relational database that contains historical data for analysis.
  • In Data warehouse update or delete doesn't happens and it is kept separate from OLTP system.
  • Data warehouse environment includes an ETL process (Extract, Transport, and Load), an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data, transforming it into useful, actionable information, and delivering it to business users .
  • Maintaining historical records
  • Various data mart is used that is a simple form of a data warehouse that is focused on a single subject (or functional area), such as sales, finance or marketing. Data marts are often built and controlled by a single department within an organization.
  • Click on the link to know about Data Warehouse Architecture / Diagram / Components

+Vivek Kumar