This article's tone or style may not reflect the encyclopedic tone used on Wikipedia. (July 2024) |
Streaming data is data that is continuously generated by different sources. Such data should be processed incrementally using stream processing techniques without having access to all of the data. In addition, it should be considered that concept drift may happen in the data which means that the properties of the stream may change over time.
It is usually used in the context of big data in which it is generated by many different sources at high speed.[1]
Data streaming can also be explained as a technology used to deliver content to devices over the internet, and it allows users to access the content immediately, rather than having to wait for it to be downloaded.[2] Big data is forcing many organizations to focus on storage costs, which brings interest to data lakes and data streams.[3] A data lake refers to the storage of a large amount of unstructured and semi data, and is useful due to the increase of big data as it can be stored in such a way that firms can dive into the data lake and pull out what they need at the moment they need it,[3] whereas a data stream can perform real-time analysis on streaming data, and it differs from data lakes in speed and continuous nature of analysis, without having to store the data first.[3]