Pose (computer vision)

In computing and computer vision, the pose (or spatial pose) of an object is its position and orientation, usually in three dimensions.[1] Poses are often stored internally as transformation matrices.[2][3] The term “pose” is largely synonymous with “transform”, but a transform may also include scale, whereas a pose does not.[4][5]
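As an illustration of the matrix representation, the following minimal sketch stores a pose as a 4×4 homogeneous transformation matrix whose upper-left 3×3 block is a rotation and whose last column is a translation. The helper names (`make_pose`, `apply_pose`, `has_no_scale`) are hypothetical and chosen for this example, and the rotation is restricted to a single yaw angle about the z-axis for brevity. It is not the API of any particular library.

```python
import math

def make_pose(yaw, tx, ty, tz):
    """Build a 4x4 homogeneous matrix: rotate by `yaw` radians about z, then translate."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c,   -s,  0.0, tx],
        [s,    c,  0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply_pose(pose, point):
    """Map a 3D point from the object's frame into the reference frame."""
    h = (point[0], point[1], point[2], 1.0)  # homogeneous coordinates
    return tuple(sum(pose[r][k] * h[k] for k in range(4)) for r in range(3))

def has_no_scale(pose, tol=1e-9):
    """Check that the 3x3 block has orthonormal columns, i.e. carries no scale or shear.

    This does not check the determinant, so a reflection would also pass.
    """
    cols = [[pose[r][c] for r in range(3)] for c in range(3)]
    for i in range(3):
        for j in range(3):
            dot = sum(cols[i][k] * cols[j][k] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return True

# A pose: rotated 90 degrees about z and positioned at (1, 2, 0).
pose = make_pose(math.pi / 2, 1.0, 2.0, 0.0)
moved = apply_pose(pose, (1.0, 0.0, 0.0))

# A general transform with uniform scale 2 in the 3x3 block is not a pose.
scaled = [[2.0 * v if c < 3 else v for c, v in enumerate(row)] for row in pose[:3]] + [pose[3]]
```

Here `has_no_scale(pose)` holds while `has_no_scale(scaled)` does not, which mirrors the distinction drawn above between a pose and a general transform.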

In computer vision, the pose of an object is often estimated from camera input, a process called pose estimation. The estimate can then be used, for example, to let a robot manipulate the object or avoid colliding with it, based on the object's perceived position and orientation in the environment. Other applications include skeletal action recognition.
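The robotics use described above can be sketched as a composition of poses: an object pose estimated relative to the camera is chained with the camera's pose in the world frame to obtain the object's pose in the world frame. Everything in this example is an assumption made for illustration, including the frame names, the numeric values, the robot position, and the 0.5-unit safety radius. Real systems also differ in their camera axis conventions.

```python
import math

def make_pose(yaw, tx, ty, tz):
    """4x4 homogeneous pose: rotation by `yaw` radians about z, plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        [c,   -s,  0.0, tx],
        [s,    c,  0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def compose(a, b):
    """Matrix product a @ b: express pose b (given in a's frame) in a's parent frame."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def position(pose):
    """Translation part of a pose."""
    return (pose[0][3], pose[1][3], pose[2][3])

# Hypothetical values: camera 1 unit above the world origin, rotated 90 degrees about z.
world_T_camera = make_pose(math.pi / 2, 0.0, 0.0, 1.0)
# Hypothetical pose-estimation output: object 2 units along the camera's x-axis.
camera_T_object = make_pose(0.0, 2.0, 0.0, 0.0)

world_T_object = compose(world_T_camera, camera_T_object)
object_position = position(world_T_object)

# Simple clearance check against an assumed robot position and safety radius.
robot_position = (0.0, 0.0, 0.0)
safety_radius = 0.5
is_clear = math.dist(robot_position, object_position) > safety_radius
```

With these values the object ends up at approximately (0, 2, 1) in the world frame, so the clearance check passes.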

  1. Hoff, William A.; Nguyen, Khoi; Lyon, Torsten (1996-10-29). Casasent, David P. (ed.). "Computer-vision-based registration techniques for augmented reality". Intelligent Robots and Computer Vision XV: Algorithms, Techniques, Active Vision, and Materials Handling. 2904. SPIE: 538–548. Bibcode:1996SPIE.2904..538H. doi:10.1117/12.256311. S2CID 6587175.
  2. "Pose (Position and Orientation)".
  3. "Transformation matrices to geometry_msgs/Pose - ROS Answers: Open Source Q&A Forum".
  4. "Drake: Spatial Pose and Transform".
  5. "Apple Developer Documentation".