Apache Hadoop

Apache Hadoop ( /həˈduːp/) is a collection of open-source software utilities for reliable, scalable, distributed computing. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use.^[3] It has since also found use on clusters of higher-end hardware.^[4]^[5] All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.^[6]

^ "Hadoop Releases". apache.org. Apache Software Foundation. Retrieved 28 April 2019.
^ ^a ^b "Apache Hadoop". Retrieved 27 September 2022.
^ Judge, Peter (22 October 2012). "Doug Cutting: Big Data Is No Bubble". silicon.co.uk. Retrieved 11 March 2018.
^ Woodie, Alex (12 May 2014). "Why Hadoop on IBM Power". datanami.com. Datanami. Retrieved 11 March 2018.
^ Hemsoth, Nicole (15 October 2014). "Cray Launches Hadoop into HPC Airspace". hpcwire.com. Retrieved 11 March 2018.
^ "Welcome to Apache Hadoop!". hadoop.apache.org. Retrieved 25 August 2016.