User
- Operational Databases
- Data Warehouses
- Preprocessed data
They run fundamental business tasks
The operation is On-Line Transaction Processing (OLTP). For example, banking or sales transactions, course registration, etc.
Driven by the need for decision support
They help with planning, problem solving, and decision-making
On-Line Analytical Processing (OLAP). For example, data summarization, data aggregation, etc.
Decision makers (managers, analysts, executives, etc.)
Driven by the need for complex decision support functions
To help with more complicated decision making and prediction
Data mining. For example, association, classification, clustering, etc.
Decision makers
1) Save time building reports
2) Slice and dice in ways you could not do before
- It is a data store (e.g., a database of some sort) that integrates data from different sources throughout an organization for decision-support purposes
- A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process
- Subject-oriented
- Integrated
- Time-variant
- Nonvolatile
- Organized around major subjects, such as customer, product, sales.
- Excluding data that are not useful in the decision support process.
Constructed by integrating multiple, heterogeneous data sources
- Stores historical data (such as the past 5 or 10 years)
- The "time" dimension is important
- Does not store the current data as in operational databases
- A physically separate store of data transformed from the operational environment
- Operational update of data does not occur in the data warehouse environment
- Does not require transaction processing, recovery, and concurrency control mechanisms
- May require periodic updates of data
- Much less frequent than operational update of data
- It is the process of combining data from multiple sources into a large, central repository called a data warehouse
- Data warehouse systems use back-end tools and utilities to populate and refresh their data
- Data extraction
- Data Cleaning
- Data Transformation
- Load
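The four back-end steps above can be sketched as a tiny pipeline. This is a minimal illustration, not a real ETL tool: the two in-memory "sources", the `sales` table, and the function names `extract`, `clean`, `transform`, and `load` are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw rows from two operational sources with inconsistent formats.
source_a = [
    {"city": " Toronto ", "amount": "100.50", "date": "2023-01-15"},
    {"city": "Ottawa", "amount": "n/a", "date": "2023-01-16"},
]
source_b = [
    {"CITY": "toronto", "AMOUNT": "20", "DATE": "2023-02-01"},
]


def extract():
    """Extraction: gather rows from every source under one key convention."""
    rows = list(source_a)
    rows += [{key.lower(): value for key, value in row.items()} for row in source_b]
    return rows


def clean(rows):
    """Cleaning: drop rows whose amount is not a valid number."""
    cleaned = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue  # skip the unparsable record
        cleaned.append(row)
    return cleaned


def transform(rows):
    """Transformation: normalize city names and convert amounts to floats."""
    return [
        (row["city"].strip().title(), row["date"], float(row["amount"]))
        for row in rows
    ]


def load(rows, conn):
    """Load: append the prepared rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (city TEXT, sale_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
load(transform(clean(extract())), conn)
warehouse_rows = conn.execute(
    "SELECT city, sale_date, amount FROM sales ORDER BY sale_date"
).fetchall()
print(warehouse_rows)
```

The row with the invalid amount is dropped during cleaning, and the two differently formatted "Toronto" records end up under one consistent city name, which is the point of integrating heterogeneous sources.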
It is based on a multidimensional data model
- Dimensions. Examples include item, time, and location
- Fact (about a subject). Examples include numerical measures like sales in dollars
- Different dimensions
- Single Subject or multiple subjects
False. A data warehouse may contain data cubes of different subjects
To aggregate a measure over one or more dimensions
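As a rough sketch of what aggregating a measure over dimensions means, the snippet below sums a sales measure for every combination of three dimensions. The fact rows, the `DIMENSIONS` tuple, and the `aggregate` helper are invented for illustration; a real warehouse would do this with its own storage and query engine.

```python
from collections import defaultdict
from itertools import combinations

# Assumed layout: three dimension columns followed by one measure (sales in dollars).
DIMENSIONS = ("item", "quarter", "city")
facts = [
    ("TV", "Q1", "Toronto", 400.0),
    ("TV", "Q1", "Ottawa", 300.0),
    ("Phone", "Q1", "Toronto", 250.0),
    ("TV", "Q2", "Toronto", 500.0),
]


def aggregate(facts, group_by):
    """Sum the measure for each combination of values of the group_by dimensions."""
    indexes = [DIMENSIONS.index(dim) for dim in group_by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in indexes)
        totals[key] += row[-1]  # the measure is the last column
    return dict(totals)


# One aggregated table per subset of dimensions, from the grand total
# (no dimensions) up to the full-detail grouping (all three dimensions).
cube = {
    dims: aggregate(facts, dims)
    for n in range(len(DIMENSIONS) + 1)
    for dims in combinations(DIMENSIONS, n)
}

print(cube[()])                 # grand total
print(cube[("item",)])          # sales per item
print(cube[("quarter", "city")])  # sales per quarter and city
```

With three dimensions there are eight such groupings, and each one answers a different summary question about the same facts.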
- It stands for On-Line Analytical Processing
- It allows for fast analysis of multi-dimensional data
- It lets users view multi-dimensional data from different perspectives
- Roll-up
- Drill-down
- Slice
- Dice
- Pivot
- It summarizes the data by climbing up a hierarchy of dimensions or by dimension reduction
- For example, given sales by city, quarter and product, we can roll-up to get sales by province, quarter and product
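The roll-up from city to province can be sketched as follows. The city-to-province mapping, the sample figures, and both helper functions are assumptions made up for this example.

```python
from collections import defaultdict

# Assumed location hierarchy: city -> province.
CITY_TO_PROVINCE = {"Toronto": "Ontario", "Ottawa": "Ontario", "Montreal": "Quebec"}

# Sales keyed by (city, quarter, product).
sales_by_city = {
    ("Toronto", "Q1", "TV"): 400.0,
    ("Ottawa", "Q1", "TV"): 300.0,
    ("Montreal", "Q1", "TV"): 200.0,
    ("Toronto", "Q2", "TV"): 500.0,
}


def roll_up_to_province(sales):
    """Climb the location hierarchy: replace each city by its province and re-sum."""
    totals = defaultdict(float)
    for (city, quarter, product), amount in sales.items():
        totals[(CITY_TO_PROVINCE[city], quarter, product)] += amount
    return dict(totals)


def roll_up_drop_location(sales):
    """Dimension reduction: remove the location dimension entirely and re-sum."""
    totals = defaultdict(float)
    for (_city, quarter, product), amount in sales.items():
        totals[(quarter, product)] += amount
    return dict(totals)


sales_by_province = roll_up_to_province(sales_by_city)
sales_without_location = roll_up_drop_location(sales_by_city)
print(sales_by_province)
print(sales_without_location)
```

Both variants produce fewer, coarser cells than the input: the first by moving up one level of the hierarchy, the second by dropping a dimension.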
- It is the inverse of a roll-up
- From a higher-level summary to a lower-level summary or detailed data, or introducing new dimensions
- For example, given sales by province, quarter and product, get sales by city, quarter and product. Another example: given total sales per quarter, get total sales per quarter for each product.
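The second drill-down example (introducing the product dimension) might look like this in code. A drill-down cannot be computed from the summary alone, so this sketch assumes the detailed facts are still available; the data and the `summarize` helper are hypothetical.

```python
from collections import defaultdict

# Assumed detailed facts: (quarter, product, sales).
facts = [
    ("Q1", "TV", 400.0),
    ("Q1", "Phone", 250.0),
    ("Q2", "TV", 500.0),
    ("Q2", "Phone", 100.0),
]


def summarize(facts, keep):
    """Sum sales grouped by the dimension columns whose indexes are in `keep`."""
    totals = defaultdict(float)
    for row in facts:
        totals[tuple(row[i] for i in keep)] += row[-1]
    return dict(totals)


# Higher-level summary: total sales per quarter.
per_quarter = summarize(facts, keep=(0,))

# Drill-down: introduce the product dimension to see what makes up each total.
per_quarter_product = summarize(facts, keep=(0, 1))

print(per_quarter)
print(per_quarter_product)
```

Summing the drilled-down cells for a quarter gives back that quarter's total, which is why drill-down is described as the inverse of roll-up.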
- Equality selections on one or more dimensions
- Selection on one dimension.
- Range selections on all dimensions.
- Selection on two or more dimensions
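Slice and dice can both be sketched as filters over the fact rows. The version below assumes the "selection on one dimension" reading for slice and the "selection on two or more dimensions" reading for dice; the data and the function names `slice_cube` and `dice_cube` are illustrative only.

```python
# Assumed fact rows: three dimensions plus a sales measure.
facts = [
    {"city": "Toronto", "quarter": "Q1", "product": "TV", "sales": 400.0},
    {"city": "Ottawa", "quarter": "Q1", "product": "TV", "sales": 300.0},
    {"city": "Toronto", "quarter": "Q2", "product": "Phone", "sales": 250.0},
    {"city": "Montreal", "quarter": "Q3", "product": "TV", "sales": 200.0},
]


def slice_cube(facts, dimension, value):
    """Slice: fix a single dimension to one value."""
    return [row for row in facts if row[dimension] == value]


def dice_cube(facts, allowed):
    """Dice: keep rows whose value on every listed dimension is in its allowed set."""
    return [
        row
        for row in facts
        if all(row[dim] in values for dim, values in allowed.items())
    ]


q1_slice = slice_cube(facts, "quarter", "Q1")
sub_cube = dice_cube(
    facts, {"city": {"Toronto", "Ottawa"}, "quarter": {"Q1", "Q2"}}
)
print(len(q1_slice), len(sub_cube))
```

The slice keeps only the Q1 rows, while the dice carves out a smaller sub-cube restricted on both city and quarter.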
- It reorients the multi-dimensional view (the cube) so that you can view the data from a different angle for visualization purposes
- For example, a financial analyst might want to view or "pivot" data in various ways, such as displaying all the cities down the page and all the products across a page.
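A pivot only changes which dimension runs down the page and which runs across it; the numbers stay the same. The sketch below assumes sales keyed by (city, product), and `pivot_table` is a hypothetical helper written for this example, not a library function.

```python
# Assumed sales keyed by (city, product).
sales = {
    ("Toronto", "TV"): 400.0,
    ("Toronto", "Phone"): 250.0,
    ("Ottawa", "TV"): 300.0,
}


def pivot_table(sales, row_axis, col_axis):
    """Lay the cells out as a grid with one key position as rows, the other as columns."""
    row_labels = sorted({key[row_axis] for key in sales})
    col_labels = sorted({key[col_axis] for key in sales})
    grid = [[0.0] * len(col_labels) for _ in row_labels]
    for key, amount in sales.items():
        r = row_labels.index(key[row_axis])
        c = col_labels.index(key[col_axis])
        grid[r][c] += amount
    return row_labels, col_labels, grid


# Cities down the page, products across.
cities_down = pivot_table(sales, row_axis=0, col_axis=1)

# Pivoted view: products down the page, cities across.
products_down = pivot_table(sales, row_axis=1, col_axis=0)

for labels_down, labels_across, grid in (cities_down, products_down):
    print(labels_across)
    for label, values in zip(labels_down, grid):
        print(label, values)
```

The second grid is the transpose of the first: same cells, different orientation. Missing combinations (here, phones in Ottawa) show up as 0.0 in this sketch.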
- OLTP is the major task of a traditional relational DBMS. The day-to-day operations include purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
- OLAP is the major task of a data warehouse system. It is used for data analysis and aggregation for decision making
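The contrast between the two workloads can be illustrated with Python's built-in `sqlite3` module. This is only a sketch: the `accounts` table and its values are invented, and both queries run against one in-memory database for brevity, whereas a real data warehouse is a physically separate store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, branch TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "North", 100.0), (2, "North", 50.0), (3, "South", 75.0)],
)
conn.commit()

# OLTP-style work: a short transaction that touches a few rows
# (a transfer of 30 from account 1 to account 3).
with conn:  # commits on success, rolls back if an exception is raised
    conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
    conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 3")

# OLAP-style work: a read-only query that scans and aggregates many rows.
branch_totals = conn.execute(
    "SELECT branch, SUM(balance) FROM accounts GROUP BY branch ORDER BY branch"
).fetchall()
print(branch_totals)
```

The transfer needs transaction guarantees because it changes current data; the aggregate query only reads and summarizes, which is the kind of access a warehouse is organized around.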