Hardware acceleration unit for artificial intelligence tasks
An AI accelerator, deep learning processor, or neural processing unit (NPU) is a class of specialized hardware accelerator[1] or computer system[2][3] designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks.[4] They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures, or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.[5]
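As a concrete illustration of the low-precision arithmetic mentioned above, the sketch below quantizes a small matrix-vector product to 8-bit integers and compares the result with a full-precision reference. It assumes NumPy and a simple symmetric per-tensor quantization scheme chosen purely for illustration; real accelerators implement a variety of low-precision formats (such as int8 and bfloat16) directly in hardware, and their quantization schemes differ.

```python
# Minimal sketch of low-precision (int8) arithmetic of the kind many AI
# accelerators favor: weights and activations are quantized to 8-bit
# integers, multiplied and accumulated in integer units, then rescaled
# back to floating point. The symmetric per-tensor scheme here is an
# illustrative assumption, not the method of any specific chip.
import numpy as np

def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization of a float array to int8."""
    scale = np.max(np.abs(x)) / 127.0 if np.any(x) else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

# Toy weight matrix and activation vector in float32.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
a = rng.standard_normal(8).astype(np.float32)

w_q, w_scale = quantize_int8(w)
a_q, a_scale = quantize_int8(a)

# Integer multiply-accumulate, widened to int32 (accelerator MAC units
# typically accumulate in a wider register), then rescaled to float.
acc = w_q.astype(np.int32) @ a_q.astype(np.int32)
y_int8 = acc * (w_scale * a_scale)

y_fp32 = w @ a  # full-precision reference
print(np.max(np.abs(y_fp32 - y_int8)))  # small quantization error
```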
AI accelerators are used in mobile devices such as Apple iPhones and Huawei cellphones,[6] and in personal computers such as Intel laptops,[7] AMD laptops,[8] and Apple silicon Macs.[9] Accelerators are also used in cloud computing servers, including tensor processing units (TPUs) in Google Cloud Platform[10] and Trainium and Inferentia chips in Amazon Web Services.[11] A number of vendor-specific terms exist for devices in this category, and it is an emerging technology without a dominant design.
Graphics processing units designed by companies such as Nvidia and AMD often include AI-specific hardware and are commonly used as AI accelerators, for both training and inference.[12]
- ^ "Intel unveils Movidius Compute Stick USB AI Accelerator". July 21, 2017. Archived from the original on August 11, 2017. Retrieved August 11, 2017.
- ^ "Inspurs unveils GX4 AI Accelerator". June 21, 2017.
- ^ Wiggers, Kyle (November 6, 2019). "Neural Magic raises $15 million to boost AI inferencing speed on off-the-shelf processors". Archived from the original on March 6, 2020. Retrieved March 14, 2020.
- ^ "Google Designing AI Processors". May 18, 2016. Google using its own AI accelerators.
- ^ Moss, Sebastian (March 23, 2022). "Nvidia reveals new Hopper H100 GPU, with 80 billion transistors". Data Center Dynamics. Retrieved January 30, 2024.
- ^ "HUAWEI Reveals the Future of Mobile AI at IFA".
- ^ "Intel's Lunar Lake Processors Arriving Q3 2024". Intel.
- ^ "AMD XDNA Architecture".
- ^ "Deploying Transformers on the Apple Neural Engine". Apple Machine Learning Research. Retrieved August 24, 2023.
- ^ Jouppi, Norman P.; et al. (June 24, 2017). "In-Datacenter Performance Analysis of a Tensor Processing Unit". ACM SIGARCH Computer Architecture News. 45 (2): 1–12. arXiv:1704.04760. doi:10.1145/3140659.3080246.
- ^ "How silicon innovation became the 'secret sauce' behind AWS's success". Amazon Science. July 27, 2022. Retrieved July 19, 2024.
- ^ Patel, Dylan; Nishball, Daniel; Xie, Myron (November 9, 2023). "Nvidia's New China AI Chips Circumvent US Restrictions". SemiAnalysis. Retrieved February 7, 2024.