Original author(s) | Matei Zaharia |
---|---|
Developer(s) | Apache Spark |
Initial release | May 26, 2014 |
Stable release | 3.5.2 (Scala 2.13) / August 10, 2024 |
Repository | Spark Repository |
Written in | Scala[1] |
Operating system | Microsoft Windows, macOS, Linux |
Available in | Scala, Java, SQL, Python, R, C#, F# |
Type | Data analytics, machine learning algorithms |
License | Apache License 2.0 |
Website | spark |
Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.
"MLlib in R: SparkR now offers MLlib APIs [..] Python: PySpark now offers many more MLlib algorithms"