Apache Hadoop

Original author(s): Doug Cutting, Mike Cafarella
Developer(s): Apache Software Foundation
Initial release: April 1, 2006[1]
Stable release:
  2.10.x: 2.10.2 / May 31, 2022[2]
  3.4.x: 3.4.0 / March 17, 2024[2]
Repository: Hadoop Repository
Written in: Java
Operating system: Cross-platform
Type: Distributed file system
License: Apache License 2.0
Website: hadoop.apache.org

Apache Hadoop (/həˈduːp/) is a collection of open-source software utilities that lets a network of many computers work together on problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use.[3] It has since also found use on clusters of higher-end hardware.[4][5] All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be handled automatically by the framework.[6]
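
The MapReduce programming model mentioned above can be illustrated with a minimal word-count sketch. This is not Hadoop's API: it runs in a single Python process, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for illustration. It only shows the shape of the model, in which a map step emits key-value pairs, the pairs are grouped by key, and a reduce step combines each group.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data on clusters"]))
```

In a real cluster the map and reduce calls would run on different machines over blocks of a distributed file, and the framework would rerun any task whose machine failed. The sketch leaves all of that out.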

  1. "Hadoop Releases". apache.org. Apache Software Foundation. Retrieved 28 April 2019.
  2. "Apache Hadoop". Retrieved 27 September 2022.
  3. Judge, Peter (22 October 2012). "Doug Cutting: Big Data Is No Bubble". silicon.co.uk. Retrieved 11 March 2018.
  4. Woodie, Alex (12 May 2014). "Why Hadoop on IBM Power". datanami.com. Datanami. Retrieved 11 March 2018.
  5. Hemsoth, Nicole (15 October 2014). "Cray Launches Hadoop into HPC Airspace". hpcwire.com. Retrieved 11 March 2018.
  6. "Welcome to Apache Hadoop!". hadoop.apache.org. Retrieved 25 August 2016.