Query optimization

Query optimization is a feature of many relational database management systems and other databases such as NoSQL and graph databases. The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.[1]

Generally, the query optimizer cannot be accessed directly by users: once queries are submitted to the database server, and parsed by the parser, they are then passed to the query optimizer where optimization occurs.[2][3] However, some database engines allow guiding the query optimizer with hints.

A query is a request for information from a database. It can be as simple as "find the address of a person with Social Security number 123-45-6789," or more complex like "find the average salary of all the employed married men in California between the ages 30 to 39 who earn less than their spouses." The result of a query is generated by processing the rows in a database in a way that yields the requested information. Since database structures are complex, in most cases, and especially for not-very-simple queries, the needed data for a query can be collected from a database by accessing it in different ways, through different data-structures, and in different orders.[4] Each different way typically requires different processing time. Processing times of the same query may have large variance, from a fraction of a second to hours, depending on the chosen method. The purpose of query optimization, which is an automated process, is to find the way to process a given query in minimum time. The large possible variance in time justifies performing query optimization, though finding the exact optimal query plan, among all possibilities, is typically very complex, time-consuming by itself, may be too costly, and often practically impossible. Thus query optimization typically tries to approximate the optimum by comparing several common-sense alternatives to provide in a reasonable time a "good enough" plan which typically does not deviate much from the best possible result.

  1. ^ "IBM Knowledge Center". www.ibm.com.
  2. ^ Cite error: The named reference ACM1996 was invoked but never defined (see the help page).
  3. ^ Cite error: The named reference ACM1998 was invoked but never defined (see the help page).
  4. ^ Cite error: The named reference ACM1979 was invoked but never defined (see the help page).