Data integration

Data integration is the process of combining data residing in different sources and providing users with a unified view of that data.[1] It matters in both commercial settings (for example, when two similar companies need to merge their databases) and scientific ones (for example, when combining research results from different bioinformatics repositories). Data integration arises with increasing frequency as the volume and complexity of data (that is, big data) and the need to share existing data grow rapidly.[2] It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal and external users. The data being integrated must be received from heterogeneous database systems and transformed into a single coherent data store that provides clients with synchronous data across a network of files.[3] A common use of data integration is in data mining, where existing databases are analyzed to extract information that is useful for business intelligence.[4]
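
As a minimal illustration of the idea of a unified view, the following Python sketch merges records from two sources with different schemas into one common representation. Everything in it is an assumption made for the example: the source layouts, the field names, the sample records, the helper functions `from_source_a`, `from_source_b` and `integrate`, and the choice of a lower-cased e-mail address as the matching key. Real integration systems use far richer schema mappings and record-matching rules.

```python
import csv
import io

# Hypothetical source A: rows already held as dictionaries.
SOURCE_A = [
    {"cust_id": 1, "full_name": "Ada Lovelace", "email": "ada@example.com"},
    {"cust_id": 2, "full_name": "Alan Turing", "email": "alan@example.com"},
]

# Hypothetical source B: a CSV export with a different schema.
SOURCE_B_CSV = """id,first,last,mail
17,Alan,Turing,ALAN@example.com
18,Grace,Hopper,grace@example.com
"""


def from_source_a(row):
    """Map a source A row onto the (assumed) unified schema."""
    return {"name": row["full_name"], "email": row["email"].lower(), "sources": ["A"]}


def from_source_b(row):
    """Map a source B row onto the (assumed) unified schema."""
    return {
        "name": f"{row['first']} {row['last']}",
        "email": row["mail"].lower(),
        "sources": ["B"],
    }


def integrate(rows_a, csv_text_b):
    """Return one record per e-mail address, noting which sources held it."""
    records = [from_source_a(r) for r in rows_a]
    records += [from_source_b(r) for r in csv.DictReader(io.StringIO(csv_text_b))]

    unified = {}
    for rec in records:
        existing = unified.get(rec["email"])
        if existing is None:
            unified[rec["email"]] = rec
        else:
            # Same entity seen in another source: record its provenance.
            existing["sources"] += rec["sources"]
    return sorted(unified.values(), key=lambda r: r["email"])


unified_view = integrate(SOURCE_A, SOURCE_B_CSV)
for record in unified_view:
    print(record)
```

In this sketch the record that appears in both sources is stored once and lists both sources as its provenance, so a client querying `unified_view` does not need to know how either source is laid out.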

  1. Lenzerini, Maurizio (2002). "Data Integration: A Theoretical Perspective". PODS 2002. pp. 233–246.
  2. Lane, Frederick (2006). "IDC: World Created 161 Billion Gigs of Data in 2006". Archived from the original on 2015-07-15.
  3. mikben. "Data Coherency - Win32 apps". docs.microsoft.com. Archived from the original on 2020-06-12. Retrieved 2020-11-23.
  4. Chung, P.; Chung, S. H. (May 2013). "On data integration and data mining for developing business intelligence". 2013 IEEE Long Island Systems, Applications and Technology Conference (LISAT). pp. 1–6. doi:10.1109/LISAT.2013.6578235.