Data integration aims to provide an integrated and consistent view of data coming from internal and external data sources. This is achieved by using one of three data integration techniques, depending on the heterogeneity, complexity, and volume of the data sources involved.
Data Consolidation
As the name suggests, data consolidation is the process of combining data from different data sources to create a centralized data repository or data store. This unified data store is then used for various purposes, such as reporting and data analysis. It can also serve as a data source for downstream applications.
One of the key factors that differentiate data consolidation from other data integration techniques is data latency. Data latency is the amount of time it takes to retrieve data from the data sources and transfer it to the data store. The shorter the latency, the fresher the data available in the data store for BI and analysis.
There is usually some latency between the time updates occur in the source systems and the time those updates are reflected in the data warehouse or data store. Depending on the data integration technologies used and the specific needs of the business, this latency can range from a few seconds to hours or more. However, with advancements in data integration technologies, it is possible to consolidate data and transfer changes to the destination in real time or near real time.
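The idea of consolidation and its latency can be illustrated with a minimal sketch. It assumes two hypothetical sources (`crm` and `billing`), a hypothetical `customers` table, and an in-memory SQLite database standing in for the central data store; none of these names come from a specific product. Each row keeps the time it was last updated at the source and the time it was loaded, so the latency is simply the difference between the two.

```python
import sqlite3
from datetime import datetime, timedelta


def consolidate(sources, store, loaded_at):
    """Copy every row from each source into one central table."""
    store.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(source TEXT, id INTEGER, name TEXT, updated_at TEXT, loaded_at TEXT)"
    )
    for source_name, rows in sources.items():
        for row_id, name, updated_at in rows:
            store.execute(
                "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
                (source_name, row_id, name,
                 updated_at.isoformat(), loaded_at.isoformat()),
            )
    store.commit()


def max_latency(store):
    """Return the largest gap between a source update and its load time."""
    rows = store.execute("SELECT updated_at, loaded_at FROM customers").fetchall()
    return max(
        datetime.fromisoformat(loaded) - datetime.fromisoformat(updated)
        for updated, loaded in rows
    )


# Hypothetical source data: (id, name, time of last update at the source).
now = datetime(2024, 1, 1, 12, 0, 0)
sources = {
    "crm": [(1, "Ada", now - timedelta(minutes=5))],
    "billing": [(7, "Grace", now - timedelta(hours=2))],
}

store = sqlite3.connect(":memory:")  # stands in for the central data store
consolidate(sources, store, loaded_at=now)
print(max_latency(store))
```

Running the load more frequently shrinks the reported latency, which is the trade-off the paragraphs above describe.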
Data Federation
Data federation is a data integration technique that unifies data virtually and simplifies access for consuming users and front-end applications. In data federation, distributed data with different data models is integrated into a virtual database that features a unified data model.
There is no physical data integration happening behind a federated virtual database. Instead, data abstraction is used to create a uniform interface for data access and retrieval. As a result, whenever a user or an application queries the federated virtual database, the query is decomposed and sent to the relevant underlying data sources. In other words, data federation serves data on demand, unlike data consolidation, in which data is integrated to build a centralized data store.
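A small sketch may help make the query decomposition concrete. It assumes two hypothetical SQLite sources with different schemas (`clients` and `buyers`) and a hypothetical `FederatedCustomers` class that exposes a unified `(id, name)` model. The class holds no data of its own; each call translates the request into one query per source and merges the results on demand.

```python
import sqlite3

# Two independent sources with different data models (hypothetical schemas).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE clients (client_id INTEGER, full_name TEXT)")
crm.execute("INSERT INTO clients VALUES (1, 'Ada Lovelace')")

shop = sqlite3.connect(":memory:")
shop.execute("CREATE TABLE buyers (id INTEGER, first TEXT, last TEXT)")
shop.execute("INSERT INTO buyers VALUES (2, 'Grace', 'Hopper')")


class FederatedCustomers:
    """Virtual view with a unified (id, name) model; stores no data itself."""

    def __init__(self, sources):
        # Each entry pairs a connection with a source-specific query that
        # maps the local schema onto the unified (id, name) model.
        self.sources = sources

    def find_by_name(self, pattern):
        """Send the decomposed query to every source and merge the rows."""
        results = []
        for conn, sql in self.sources:
            results.extend(conn.execute(sql, (pattern,)).fetchall())
        return sorted(results)


customers = FederatedCustomers([
    (crm, "SELECT client_id, full_name FROM clients WHERE full_name LIKE ?"),
    (shop, "SELECT id, first || ' ' || last FROM buyers "
           "WHERE first || ' ' || last LIKE ?"),
])

print(customers.find_by_name("%"))       # all customers from both sources
print(customers.find_by_name("Grace%"))  # answered by the shop source only
```

Because the rows are fetched at query time, the result always reflects the current state of each source, at the cost of querying every source on each request.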
Data Propagation
Data propagation is another data integration technique, in which data from an enterprise data warehouse is transferred to different data marts after the required transformations. Since the data in the data warehouse continues to be updated, changes are propagated to the target data marts in a synchronous or asynchronous manner. The two common technologies used for data propagation are enterprise application integration (EAI) and enterprise data replication (EDR). These are discussed below.
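The difference between synchronous and asynchronous propagation can be sketched as follows. The `Warehouse` and `DataMart` classes and the sample record are hypothetical and purely illustrative; they do not represent EAI or EDR tooling. A synchronous mart receives each transformed change as part of the update itself, while an asynchronous mart receives it later, when the queued changes are flushed.

```python
from collections import deque


class DataMart:
    """Holds a transformed copy of the warehouse rows it receives."""

    def __init__(self, transform):
        self.transform = transform
        self.rows = {}

    def apply(self, key, record):
        self.rows[key] = self.transform(record)


class Warehouse:
    def __init__(self):
        self.rows = {}
        self.sync_marts = []    # updated as part of every write
        self.async_marts = []   # updated later, when flush() runs
        self.pending = deque()  # changes waiting for asynchronous delivery

    def update(self, key, record):
        self.rows[key] = record
        for mart in self.sync_marts:
            mart.apply(key, record)         # synchronous propagation
        self.pending.append((key, record))  # queued for asynchronous marts

    def flush(self):
        """Deliver queued changes to the asynchronous marts, in order."""
        while self.pending:
            key, record = self.pending.popleft()
            for mart in self.async_marts:
                mart.apply(key, record)


warehouse = Warehouse()
sales_mart = DataMart(lambda r: {"region": r["region"], "total": r["amount"]})
finance_mart = DataMart(lambda r: {"amount_cents": r["amount"] * 100})
warehouse.sync_marts.append(sales_mart)
warehouse.async_marts.append(finance_mart)

warehouse.update("order-1", {"region": "EU", "amount": 20})
print(sales_mart.rows)    # already contains order-1
print(finance_mart.rows)  # still empty until the queue is flushed
warehouse.flush()
print(finance_mart.rows)  # now contains order-1
```

In this sketch the synchronous path keeps the mart consistent with the warehouse at all times, while the asynchronous path lets the write return sooner and accepts a short window of stale data.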