Apache Pinot

Original author(s):
  • Kishore Gopalakrishna
  • Xiang Fu
Developer(s): Apache Pinot
Stable release: 1.2.0 / 21 August 2024
Repository: Pinot repository
Written in: Java
Operating system: Cross-platform
Type:
License: Apache License 2.0
Website: pinot.apache.org

Apache Pinot is a column-oriented, open-source, distributed data store written in Java. Pinot is designed to execute OLAP queries with low latency.[1][2][3][4][5] It is suited to contexts where fast analytics, such as aggregations, are needed on immutable data, possibly with real-time data ingestion.[6][7][8] The name Pinot comes from the Pinot grape, which is pressed into juice that is used to produce a variety of different wines. The founders of the database chose the name as a metaphor for analyzing vast quantities of data from a variety of different file formats or streaming data sources.[9]

Pinot was first created at LinkedIn after the engineering staff determined that no off-the-shelf solution met the social networking site's requirements, such as predictable low latency, data freshness in seconds, fault tolerance, and scalability.[9][10] Pinot is used in production by technology companies such as Uber,[11] Microsoft,[8] and Factual.

  1. Cui, Tingting; Peng, Lijun; Pardoe, David; Liu, Kun; Agarwal, Deepak; Kumar, Deepak (14 August 2017). "Data-Driven Reserve Prices for Social Advertising Auctions at LinkedIn". Proceedings of the ADKDD'17. Association for Computing Machinery. pp. 1–7. doi:10.1145/3124749.3124759. ISBN 9781450351942. S2CID 12327343.
  2. La Rosa, Marcello (2021). Advanced Information Systems Engineering: 33rd International Conference. Springer Nature. ISBN 978-3-030-79382-1.
  3. Chin, Francis Y. L.; Chen, C. L. Philip; Khan, Latifur; Lee, Kisung; Zhang, Liang-Jie (20 June 2018). Big Data – BigData 2018: 7th International Congress, Held as Part of the Services Conference Federation, SCF 2018, Seattle, WA, USA, June 25–30, 2018, Proceedings. Springer. p. 153. ISBN 978-3-319-94301-5.
  4. Im, Jean-François; Gopalakrishna, Kishore; Subramaniam, Subbu; Shrivastava, Mayank; Tumbde, Adwait; Jiang, Xiaotian; Dai, Jennifer; Lee, Seunghyun; Pawar, Neha; Li, Jialiang; Aringunram, Ravi (27 May 2018). "Pinot: Realtime OLAP for 530 Million Users". Proceedings of the 2018 International Conference on Management of Data. SIGMOD '18. Association for Computing Machinery. pp. 583–594. doi:10.1145/3183713.3190661. ISBN 9781450347037. S2CID 44083085.
  5. "The Apache Software Foundation Announces Apache® Pinot™ as a Top-Level Project". blogs.apache.org. 2 August 2021.
  6. Rogers, Ryan; Subramaniam, Subbu; Peng, Sean; Durfee, David; Lee, Seunghyun; Kancha, Santosh Kumar; Sahay, Shraddha; Ahammad, Parvez (16 November 2020). "LinkedIn's Audience Engagements API: A Privacy Preserving Data Analytics System at Scale". arXiv:2002.05839 [cs.CR].
  7. Javadi, Seyyed Ahmad; Gupta, Harsh; Manhas, Robin; Sahu, Shweta; Gandhi, Anshul (July 2018). "EASY: Efficient Segment Assignment Strategy for Reducing Tail Latencies in Pinot". 2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS). pp. 1432–1437. doi:10.1109/ICDCS.2018.00144. ISBN 978-1-5386-6871-9. S2CID 21659844.
  8. Pawar, Neha (1 April 2019). "Pinot Joins Apache Incubator". LinkedIn Engineering. Archived from the original on 2 April 2019.
  9. Gopalakrishna, Kishore. "Open Sourcing Pinot: Scaling the Wall of Real-Time Analytics". engineering.linkedin.com. LinkedIn. Archived from the original on 10 September 2015. Retrieved 3 September 2020.
  10. Yegulalp, Serdar (11 June 2015). "LinkedIn fills another SQL-on-Hadoop niche". InfoWorld.
  11. Fu, Yupeng; Soman, Chinmay (9 June 2021). "Real-time Data Infrastructure at Uber". Proceedings of the 2021 International Conference on Management of Data. SIGMOD/PODS '21. Association for Computing Machinery. pp. 2503–2516. arXiv:2104.00087. doi:10.1145/3448016.3457552. ISBN 9781450383431. S2CID 232478317.