Data orientation refers to how tabular data is represented in a linear memory model, such as on disk or in memory. The two most common representations are column-oriented (columnar format) and row-oriented (row format).[1][2]
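As a minimal sketch of the two representations, the following Python snippet flattens a small table into a linear sequence in both orders. The table contents and the helper names `to_row_oriented` and `to_column_oriented` are hypothetical and chosen only for illustration; real formats add encoding, metadata, and padding on top of this basic ordering.

```python
# Hypothetical three-row table with columns (id, name, price).
rows = [
    (1, "apple", 0.50),
    (2, "banana", 0.25),
    (3, "cherry", 3.00),
]

def to_row_oriented(table):
    """Flatten record by record: all fields of row 0, then row 1, and so on."""
    return [value for row in table for value in row]

def to_column_oriented(table):
    """Flatten column by column: all ids, then all names, then all prices."""
    return [value for column in zip(*table) for value in column]

row_layout = to_row_oriented(rows)
column_layout = to_column_oriented(rows)

print(row_layout)     # fields of each record sit next to each other
print(column_layout)  # values of each column sit next to each other
```

Both sequences hold the same nine values; only the order in which they are laid out differs.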
The choice of data orientation is a trade-off and an architectural decision in databases, query engines, and numerical simulations.[1] As a result of these trade-offs, row-oriented formats are more commonly used in online transaction processing (OLTP), and column-oriented formats are more commonly used in online analytical processing (OLAP).[2]
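One way to see the trade-off is to compare how each layout serves two access patterns: reading a single column across all rows (typical of analytical queries) and reading one complete record (typical of transactional lookups). The sketch below assumes a hypothetical 3×3 table already flattened into Python lists; the names `row_layout`, `column_layout`, and `PRICE` are illustrative, and list slicing stands in for memory or disk access.

```python
NUM_ROWS, NUM_COLS = 3, 3

# The same hypothetical table (id, name, price) in both orientations.
row_layout = [1, "apple", 0.5, 2, "banana", 0.25, 3, "cherry", 3.0]
column_layout = [1, 2, 3, "apple", "banana", "cherry", 0.5, 0.25, 3.0]

PRICE = 2  # index of the price column

# Analytical access: every value of one column.
# Column-oriented: one contiguous slice.
prices_from_columns = column_layout[PRICE * NUM_ROWS:(PRICE + 1) * NUM_ROWS]
# Row-oriented: a strided read that skips over the other fields.
prices_from_rows = row_layout[PRICE::NUM_COLS]

# Transactional access: every field of the record at row index 1.
# Row-oriented: one contiguous slice.
record_from_rows = row_layout[1 * NUM_COLS:2 * NUM_COLS]
# Column-oriented: a strided read that touches every column.
record_from_columns = column_layout[1::NUM_ROWS]

print(sum(prices_from_columns))
print(record_from_rows)
```

Both layouts return the same answers; the difference is whether the requested values are adjacent or scattered, which is generally what makes one orientation cheaper than the other for a given workload.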
Examples of column-oriented formats include Apache ORC,[3] Apache Parquet,[4] Apache Arrow,[5] and the formats used by BigQuery, Amazon Redshift, and Snowflake. Predominant examples of row-oriented formats include CSV, the formats used in most relational databases, the in-memory format of Apache Spark, and Apache Avro.[6]