Basic Core Data Concepts
Data
A collection of facts, numbers, descriptions, and observations: in short, any useful information.
Different types of data
- Structured data is typically tabular data that is represented by rows and columns in a database
Eg – Data warehouse, CRM, ERP
- Semi-structured data has some organizational framework but lacks the complete structure required to fit in a relational database.
Eg – CSV, XML, JSON
- Unstructured data is data that is not organized in any pre-defined manner
Eg – Audio and video files, binary data files
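The difference between the three types can be seen in a few lines of Python. This is only an illustrative sketch: the sample records, the `blob` bytes, and the helper `same_columns` are invented for this example and are not part of any Azure service.

```python
import json

# Structured: every row has the same columns, like a table in a database.
customers = [
    {"id": 1, "name": "Asha", "city": "Pune"},
    {"id": 2, "name": "Ben", "city": "Leeds"},
]

# Semi-structured: each JSON document carries its own field names, and
# documents in the same collection may have different fields.
documents = json.loads("""
[
  {"id": 1, "name": "Asha", "phones": ["555-0100", "555-0101"]},
  {"id": 2, "name": "Ben", "address": {"city": "Leeds"}}
]
""")

# Unstructured: raw bytes with no predefined schema (audio, video, ...).
blob = bytes([0x52, 0x49, 0x46, 0x46])  # hypothetical first bytes of a media file


def same_columns(rows):
    """Return True when every record has exactly the same set of fields."""
    return len({frozenset(row) for row in rows}) == 1


print(same_columns(customers))  # True
print(same_columns(documents))  # False
```

The structured rows share one fixed set of columns, while the JSON documents do not, which is why they do not fit directly into a relational table.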
Data storage in the cloud
- Structured data – Azure SQL Database
- Semi-structured data – Azure Cosmos DB
- Unstructured data – Azure Blob Storage
Data Processing Solutions
OLTP (Online Transaction Processing) systems are called transactional systems: they record transactions as they occur, handling the day-to-day transactions that result from an enterprise's operations.
- Small, discrete units of work
- Often high in volume
- Data is processed quickly
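A small, discrete unit of work can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a production banking design: the `accounts` table, the sample balances, and the `transfer` and `balances` helpers are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit of work; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; nothing was changed.
        return False


def balances(conn):
    """Return the current balances as {account_id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Calling `transfer(conn, 1, 2, 30)` moves the money and commits both updates together, while `transfer(conn, 1, 2, 500)` fails the balance check and leaves both accounts untouched.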
OLAP (Online Analytical Processing) systems are called analytical systems. Here we analyze the information that was collected in the OLTP databases.
- Analysis of information in a database for the purpose of making management decisions
- Big picture view of the information held in a database
- Generate insights to make business decisions
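An analytical query typically summarizes many transactions instead of recording one. The sketch below uses Python's built-in `sqlite3` module for convenience; the `transactions` table and its sample rows are hypothetical, and a real OLAP system would use a dedicated analytical store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (branch TEXT, channel TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("North", "ATM", 200),
        ("North", "Mobile", 50),
        ("South", "ATM", 120),
        ("South", "Banker", 900),
        ("North", "ATM", 80),
    ],
)

# Big-picture view: number of transactions and total amount per branch.
summary = conn.execute(
    "SELECT branch, COUNT(*), SUM(amount) "
    "FROM transactions GROUP BY branch ORDER BY branch"
).fetchall()

print(summary)  # [('North', 3, 330), ('South', 2, 1020)]
```

A report like this gives management one row per branch, not one row per transaction.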
How does data flow from the transactional to the analytical system?
Consider the banking industry as an example. Data comes from various sources such as bankers, mobile apps, and ATMs. Data from these sources is gathered in one place: we ingest it into a single repository and then transform it.
Data from an ATM may be in one format and data from a banker in another, so we reformat the data to make the sources match. Invalid data is cleaned and duplicated records are merged. Once the data has been cleaned and processed, we load it into the OLAP system.
On the left-hand side are the transactional systems, which supply the information (this is a simplistic view). The information is processed and stored in the analytical processing system. In OLAP we design and derive reports that provide insights, and based on those insights we can make business decisions.
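The ingest, reformat, clean, and merge steps described above can be sketched in plain Python. Everything here is an assumption for illustration: the field names, the sample ATM and banker records, and the functions `from_atm`, `from_banker`, and `run_pipeline` are hypothetical.

```python
# Hypothetical raw records: each source uses its own field names and formats.
atm_records = [
    {"txn": "T1", "acct": "1001", "amt_cents": 2000},
    {"txn": "T2", "acct": "1002", "amt_cents": -500},  # invalid: negative amount
]
banker_records = [
    {"transaction_id": "T3", "account": "1001", "amount": "150.00"},
    {"transaction_id": "T1", "account": "1001", "amount": "20.00"},  # duplicate of T1
    {"transaction_id": "T4", "account": "1003", "amount": "n/a"},  # invalid: not a number
]


def from_atm(rec):
    """Reformat an ATM record into the common format (amount in currency units)."""
    return {"id": rec["txn"], "account": rec["acct"], "amount": rec["amt_cents"] / 100}


def from_banker(rec):
    """Reformat a banker record; an unparseable amount becomes None."""
    try:
        amount = float(rec["amount"])
    except ValueError:
        amount = None
    return {"id": rec["transaction_id"], "account": rec["account"], "amount": amount}


def run_pipeline(atm, banker):
    # Ingest both sources and transform them into one common format.
    unified = [from_atm(r) for r in atm] + [from_banker(r) for r in banker]
    # Clean: drop records whose amount is missing or not positive.
    valid = [r for r in unified if r["amount"] is not None and r["amount"] > 0]
    # Merge duplicates: keep the first record seen for each transaction id.
    merged = {}
    for r in valid:
        merged.setdefault(r["id"], r)
    # The returned list stands in for the data loaded into the OLAP system.
    return list(merged.values())


analytical_store = run_pipeline(atm_records, banker_records)
print([r["id"] for r in analytical_store])  # ['T1', 'T3']
```

Of the five raw records, two are dropped as invalid and one as a duplicate, so only two clean records reach the analytical store.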
Batch processing vs streaming processing
Processing means converting raw data into meaningful information that can provide insight.
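As a rough illustration of the two styles, batch processing handles a collected group of records in one go, while stream processing handles each record as it arrives. The functions `process_batch` and `process_stream` and the sample `deposits` below are hypothetical names used only for this sketch.

```python
def process_batch(events):
    """Batch: wait until all events are collected, then process them together."""
    return sum(events)


def process_stream(events):
    """Streaming: update the result as each event arrives."""
    total = 0
    for event in events:
        total += event
        yield total  # an up-to-date result is available after every event


deposits = [100, 50, 25]
print(process_batch(deposits))         # 175
print(list(process_stream(deposits)))  # [100, 150, 175]
```

Both approaches reach the same final total; the streaming version simply makes intermediate results available without waiting for the whole batch.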