Basic Core Data Concepts

Data

A collection of facts, numbers, descriptions, and observations; in short, any useful information.

Different types of data

  • Structured data is typically tabular data that is represented by rows and columns in a database

            Eg – Data warehouse, CRM, ERP

  • Semi-structured data has some organizational framework but does not have the complete structure required to fit in a relational database.

            Eg – CSV, XML, JSON

  • Unstructured data is data that is not organized in any predefined manner

            Eg – Audio and video files, binary data files
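
The difference between the three types can be seen in a minimal Python sketch. The table, field names, and values below are hypothetical and only serve as an illustration: a table with a fixed schema stands in for structured data, JSON records with differing fields for semi-structured data, and raw bytes for unstructured data.

```python
import json
import sqlite3

# Structured: a fixed schema, so every row has the same columns.
# (The "customers" table and its values are made up for illustration.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ben", "Leeds")],
)
structured_rows = conn.execute(
    "SELECT id, name, city FROM customers ORDER BY id"
).fetchall()

# Semi-structured: each record describes itself, and fields can differ
# from one record to the next (one has "phones", the other "email").
semi_structured = json.loads(
    '[{"id": 1, "name": "Asha", "phones": ["111", "222"]},'
    ' {"id": 2, "name": "Ben", "email": "ben@example.com"}]'
)
field_sets = [sorted(record) for record in semi_structured]

# Unstructured: just bytes, with no rows or fields to query directly.
unstructured = bytes([0x00, 0xFF, 0x10, 0x80])

print(structured_rows)
print(field_sets)
print(len(unstructured), "bytes")
```

Note that the two JSON records do not share the same set of fields, which is exactly what prevents them from fitting into a single relational table without extra work.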

Data storage in the cloud

  • Structured data – Azure SQL Database
  • Semi-structured data – Azure Cosmos DB
  • Unstructured data – Azure Blob Storage

Data Processing Solutions

OLTP (Online Transaction Processing) refers to a transactional system, that is, a system that records transactions. It handles the day-to-day transactions that result from enterprise operations.

  • Small, discrete units of work
  • Often high in volume
  • Data is processed quickly

OLAP (Online Analytical Processing) refers to an analytical system. Here, we analyze the information that was collected in the OLTP database.

  • Analysis of information in a database for the purpose of making management decisions
  • Big picture view of the information held in a database
  • Generate insights to make business decisions
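
As a rough illustration of the contrast, the sketch below uses Python's built-in sqlite3 module. The "transactions" table, the record_transaction helper, and the sample values are assumptions made for this example, not part of any Azure service. Each insert is a small, discrete unit of work (OLTP style), while the final query summarizes all recorded transactions (OLAP style).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, channel TEXT, amount REAL)"
)

def record_transaction(account, channel, amount):
    """OLTP style: one small unit of work, committed on its own."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO transactions (account, channel, amount) "
            "VALUES (?, ?, ?)",
            (account, channel, amount),
        )

# Hypothetical day-to-day transactions.
record_transaction("A-1", "atm", -50.0)
record_transaction("A-1", "mobile", 200.0)
record_transaction("B-7", "atm", -20.0)

# OLAP style: a big-picture aggregate over everything recorded so far.
summary = conn.execute(
    "SELECT channel, COUNT(*), SUM(amount) "
    "FROM transactions GROUP BY channel ORDER BY channel"
).fetchall()

print(summary)
```

In practice the analytical query would usually run against a separate store rather than the transactional database itself; a single in-memory database is used here only to keep the sketch short.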

How does data flow from a transactional system to an analytical system?

Consider the banking industry as an example. Data comes from various sources such as bankers, mobile apps, and ATMs. Data from these sources is gathered in one place: we ingest it into a single repository and then transform it.

Data from an ATM may arrive in one format and data from a banker in another, so we reformat the data to make sure the sources match each other. If some data is invalid, we clean it; if data is duplicated, we merge it. After this cleaning and processing, we load the data into our OLAP system.
                                        

The transactional systems are shown on the left-hand side (this is a simplified view). Information from them is processed and stored in the analytical processing system. In the OLAP system we design reports that provide insights, and based on those insights we can make business decisions.
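
The ingest, reformat, clean, merge, and load steps described above can be sketched in a few lines of Python. Everything here is hypothetical: the field names, the sample records, and the rule that a record without an account is invalid are assumptions chosen to keep the example small.

```python
from collections import defaultdict

# Ingest: raw records from two sources, each in its own format.
atm_records = [
    {"txn": "T1", "acct": "A-1", "amt_cents": -5000},
    {"txn": "T2", "acct": "B-7", "amt_cents": -2000},
    {"txn": "T2", "acct": "B-7", "amt_cents": -2000},  # duplicate
]
mobile_records = [
    {"id": "T3", "account": "A-1", "amount": "200.00"},
    {"id": "T4", "account": "", "amount": "15.00"},  # invalid: no account
]

# Reformat: map each source onto one common shape.
def from_atm(r):
    return {"id": r["txn"], "account": r["acct"],
            "amount": r["amt_cents"] / 100, "source": "atm"}

def from_mobile(r):
    return {"id": r["id"], "account": r["account"],
            "amount": float(r["amount"]), "source": "mobile"}

def is_valid(r):
    # Assumed rule: a usable record needs an id and an account.
    return bool(r["id"]) and bool(r["account"])

def run_pipeline():
    unified = ([from_atm(r) for r in atm_records]
               + [from_mobile(r) for r in mobile_records])
    cleaned = [r for r in unified if is_valid(r)]   # clean invalid data
    merged = {}
    for r in cleaned:                               # merge duplicates by id
        merged.setdefault(r["id"], r)
    return list(merged.values())                    # load into the store

analytical_store = run_pipeline()

# A simple report on the loaded data: net amount per account.
totals = defaultdict(float)
for r in analytical_store:
    totals[r["account"]] += r["amount"]

print(dict(totals))
```

A plain list stands in for the OLAP system here; in a real pipeline the load step would write to an analytical data store instead.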

Batch processing vs streaming processing

Processing means converting raw data into meaningful information that can provide insight.


     






