Data Pipeline Framework

Author: Emilia Colonese

Figure 1: Data Pipeline Framework (by author).

The figure above presents a data pipeline framework. It encompasses the generic data processing phases used by Business Intelligence (BI) systems, in either a traditional or a big data context.
  • Traditional systems: they use structured data, relational schemas, and relational databases.
  • Big data systems: they use structured, semi-structured, and/or unstructured data; relational and/or non-relational schemas; and SQL and/or NoSQL data stores and database servers.
These BI systems use data modeling processes and tools in both on-premises and cloud environments. The fundamentals of the data modeling approach are determined by the data storage being used.
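The exact phases appear in the figure rather than in the text, so the sketch below assumes the typical extract, transform, and load steps; all function names (extract, transform, load, run_pipeline) are illustrative, not part of the framework itself.

```python
# Minimal pipeline sketch assuming typical extract -> transform -> load phases.
# The phase names and functions here are hypothetical stand-ins for the
# phases shown in the figure.
from typing import Dict, Iterable, List

def extract() -> Iterable[Dict]:
    # Hypothetical source: in practice this could read files, APIs, or databases.
    return [{"id": 1, "amount": "10.50"}, {"id": 2, "amount": "7.25"}]

def transform(rows: Iterable[Dict]) -> List[Dict]:
    # Example transformation: cast the textual amount into a numeric type.
    return [{"id": r["id"], "amount": float(r["amount"])} for r in rows]

def load(rows: List[Dict]) -> None:
    # Placeholder sink: a real pipeline would write to a SQL or NoSQL store.
    for row in rows:
        print("loading", row)

def run_pipeline() -> None:
    load(transform(extract()))

if __name__ == "__main__":
    run_pipeline()
```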

A traditional system uses a relational data model (SQL model), whether for OLTP (transactional/operational) or OLAP (analytical) systems. The data schema contains the tables, and the relationships between them, that will store the data.
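As a concrete illustration, the sketch below builds a small relational schema in SQLite; the table and column names (customer, purchase) are hypothetical, chosen only to show tables, a foreign-key relationship, and OLTP-style inserts followed by an OLAP-style aggregate.

```python
# Minimal relational (SQL) schema sketch; names are illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Two related tables: purchase references customer through a foreign key.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      REAL NOT NULL
    )
""")

# OLTP-style writes ...
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO purchase (customer_id, amount) VALUES (1, 19.90)")

# ... and an OLAP-style aggregate over the join.
rows = conn.execute("""
    SELECT c.name, SUM(p.amount)
    FROM customer AS c
    JOIN purchase AS p ON p.customer_id = c.customer_id
    GROUP BY c.name
""").fetchall()
print(rows)  # [('Alice', 19.9)]
```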

On the other hand, a big data system uses a non-relational data model (NoSQL model) for analytical systems. The data schema and storage follow the data types being used and the access requirements.
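The sketch below contrasts this with a document-style (NoSQL) model; the document shape and field names are hypothetical, chosen to show how the schema is driven by the data types (nested, semi-structured fields) and by the access requirement of reading a customer together with their purchases in a single lookup.

```python
# Minimal document-model (NoSQL-style) sketch using plain JSON;
# the field names and document shape are illustrative only.
import json

customer_doc = {
    "_id": "customer:1",
    "name": "Alice",
    "contacts": {"email": "alice@example.com"},   # nested, semi-structured data
    "purchases": [                                # embedded instead of joined
        {"amount": 19.90, "items": ["book"]},
        {"amount": 5.00, "items": ["pen", "pad"]},
    ],
}

# In a document store the whole record is read or written as one unit;
# here a dict keyed by _id stands in for that store.
store = {customer_doc["_id"]: json.dumps(customer_doc)}

loaded = json.loads(store["customer:1"])
print(loaded["name"], sum(p["amount"] for p in loaded["purchases"]))
```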
