Vector database

A vector database, vector store or vector search engine is a database that can store vectors (fixed-length lists of numbers) along with other data items. Vector databases typically implement one or more Approximate Nearest Neighbor algorithms,[1][2][3] so that one can search the database with a query vector to retrieve the closest matching database records.

Vectors are mathematical representations of data in a high-dimensional space. In this space, each dimension corresponds to a feature of the data, with the number of dimensions ranging from a few hundred to tens of thousands, depending on the complexity of the data being represented. A vector's position in this space represents its characteristics. Words, phrases, or entire documents, as well as images, audio, and other types of data, can all be vectorized.[4]

These feature vectors may be computed from the raw data using machine learning methods such as feature extraction algorithms, word embeddings[5] or deep learning networks. The goal is that semantically similar data items receive feature vectors close to each other.

Vector databases can be used for similarity search, semantic search, multi-modal search, recommendations engines, large language models (LLMs), object detection, etc.[6]

Vector databases are also often used to implement retrieval-augmented generation (RAG), a method to improve domain-specific responses of large language models. The retrieval component of a RAG can be any search system, but is most often implemented as a vector database. Text documents describing the domain of interest are collected, and for each document or document section, a feature vector (known as an "embedding") is computed, typically using a deep learning network, and stored in a vector database. Given a user prompt, the feature vector of the prompt is computed, and the database is queried to retrieve the most relevant documents. These are then automatically added into the context window of the large language model, and the large language model proceeds to create a response to the prompt given this context.[7]

  1. ^ Roie Schwaber-Cohen. "What is a Vector Database & How Does it Work". Pinecone. Retrieved 18 November 2023.
  2. ^ "What is a vector database". Elastic. Retrieved 18 November 2023.
  3. ^ "What is a Vector Database?". Retrieved 10 July 2023.
  4. ^ "Vector database". learn.microsoft.com. 2023-12-26. Retrieved 2024-01-11.
  5. ^ Evan Chaki (2023-07-31). "What is a vector database?". Microsoft. A vector database is a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes.
  6. ^ "Vector database". learn.microsoft.com. 2023-12-26. Retrieved 2024-01-11.
  7. ^ Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich (2020). "Retrieval-augmented generation for knowledge-intensive NLP tasks". Advances in Neural Information Processing Systems 33: 9459–9474. arXiv:2005.11401.