Alluxio

Original author(s)	Haoyuan Li
Developer(s)	UC Berkeley AMPLab
Initial release	April 8, 2013
Stable release	v2.9.3 / March 24, 2023
Repository	https://github.com/Alluxio/alluxio
Written in	Java
Operating system	macOS, Linux
Available in	Java
License	Apache License 2.0
Website	www.alluxio.io

Alluxio is an open-source virtual distributed file system (VDFS). It originated as the research project "Tachyon" at the University of California, Berkeley's AMPLab, where it was developed as Haoyuan Li's Ph.D. thesis,^[2] advised by Professors Scott Shenker and Ion Stoica. Alluxio sits between computation and storage in the big data analytics stack. It provides a data abstraction layer for computation frameworks, enabling applications to connect to numerous storage systems through a common interface. The software is published under the Apache License 2.0.

Data-driven applications, such as data analytics, machine learning, and AI workloads, use APIs provided by Alluxio (such as the Hadoop HDFS API, the S3 API, and the FUSE API) to access data in various storage systems at high speed. Popular frameworks that run on top of Alluxio include Apache Spark, Presto, TensorFlow, Trino, Apache Hive, and PyTorch.
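The idea of a data abstraction layer that exposes several storage systems through one interface can be illustrated with a toy sketch. The Python code below is not Alluxio's API or implementation; the names `UnderStore`, `InMemoryStore`, and `VirtualFileSystem` are hypothetical, and in-memory dictionaries stand in for real storage backends. It only shows how paths in a single namespace might be routed to whichever backend is mounted at the matching prefix.

```python
from abc import ABC, abstractmethod


class UnderStore(ABC):
    """Hypothetical stand-in for one backing storage system."""

    @abstractmethod
    def read(self, key: str) -> bytes: ...

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...


class InMemoryStore(UnderStore):
    """Toy backend that keeps objects in a dictionary (assumption for illustration)."""

    def __init__(self, name: str):
        self.name = name
        self._objects = {}

    def read(self, key: str) -> bytes:
        return self._objects[key]

    def write(self, key: str, data: bytes) -> None:
        self._objects[key] = data


class VirtualFileSystem:
    """Routes paths in one namespace to the store mounted at the matching prefix."""

    def __init__(self):
        self._mounts = {}  # mount point -> store

    def mount(self, mount_point: str, store: UnderStore) -> None:
        self._mounts[mount_point.rstrip("/")] = store

    def _resolve(self, path: str):
        # Longest-prefix match, so a nested mount point takes precedence.
        for mp in sorted(self._mounts, key=len, reverse=True):
            if path == mp or path.startswith(mp + "/"):
                return self._mounts[mp], path[len(mp):].lstrip("/")
        raise FileNotFoundError(path)

    def read(self, path: str) -> bytes:
        store, key = self._resolve(path)
        return store.read(key)

    def write(self, path: str, data: bytes) -> None:
        store, key = self._resolve(path)
        store.write(key, data)


# The application uses one interface regardless of where the data lives.
vfs = VirtualFileSystem()
vfs.mount("/warehouse", InMemoryStore("object-store"))
vfs.mount("/logs", InMemoryStore("hdfs-like"))
vfs.write("/warehouse/sales.csv", b"id,amount\n1,10\n")
vfs.write("/logs/app.log", b"started\n")
print(vfs.read("/warehouse/sales.csv"))
```

In this sketch the caller never names a backend directly; swapping the store mounted at `/warehouse` would not change the application code, which is the property the common interface is meant to provide.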

Alluxio can be deployed on-premises, in the cloud (e.g., Microsoft Azure, AWS, Google Compute Engine), or in a hybrid cloud environment. It can run on bare metal or in containerized environments such as Kubernetes, Docker, and Apache Mesos.

[1] "Releases · Alluxio/alluxio". github.com. Retrieved 2023-03-04.
[2] Li, Haoyuan (7 May 2018). Alluxio: A Virtual Distributed File System (Technical report). EECS Department, University of California, Berkeley. UCB/EECS-2018-29.
