Data curation

Data curation is the organization and integration of data collected from various sources. It involves annotating, publishing, and presenting the data so that its value is maintained over time and it remains available for reuse and preservation. Data curation includes "all the processes needed for principled and controlled data creation, maintenance, and management, together with the capacity to add value to data".[1] In science, data curation may refer to the extraction of important information from scientific texts, such as research articles, by experts, and its conversion into an electronic format, such as an entry in a biological database.[2]

In the modern era of big data, data curation has become more prominent, particularly for software that processes high-volume and complex data systems.[3] The term is also used in the humanities,[4] where the growing volume of cultural and scholarly data from digital humanities projects requires the expertise and analytical practices of data curation.[5] In broad terms, curation covers the range of activities and processes performed to create, manage, maintain, and validate a component.[6] More specifically, data curation is the attempt to determine what information is worth saving and for how long.[7]

  1. ^ Miller, Renée J. (2014). "Big Data Curation". 20th International Conference on Management of Data (COMAD), Hyderabad, India, December 17–19, 2014.
  2. ^ BioCreative Glossary. Retrieved 3 October 2016.
  3. ^ Furht, Borko; Armando Escalante (2011). Handbook of Data Intensive Computing. Springer Science & Business Media. p. 32. ISBN 9781461414155. Retrieved 2 October 2016.
  4. ^ Sabharwal, Arjun (2015). Digital Curation in the Digital Humanities: Preserving and Promoting Archival and Special Collections. Chandos Publishing. p. 60. ISBN 9780081001783. Retrieved 2 October 2016.
  5. ^ Flanders, Julia; Muñoz, Trevor. "An Introduction to Humanities Data Curation". http://guide.dhcuration.org/intro/. No longer available at the original URL; archived at archive.org.
  6. ^ Pilin Glossary. No longer available at the original location; archived at archive.org.
  7. ^ Borgman, C. (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. Cambridge, Massachusetts: MIT Press. p. 13. ISBN 978-0-262-02856-1.