PixVOD: Pixel-Distributed Direct Visual Odometry and Depth Estimation
2026-06-02 • Computer Vision and Pattern Recognition
AI summary
The authors suggest a new way for camera sensors to do some of the calculations themselves, directly within each pixel, instead of sending all raw image data elsewhere. They use a method called Gaussian Belief Propagation to help pixels work together to figure out the camera's movement and the distance to objects. To keep these calculations stable, they add a system like keyframes to anchor the process. Their tests show that this approach can work for estimating motion and depth right on the sensor, potentially making vision tasks more efficient.
2D pixel arrays, Gaussian Belief Propagation, visual odometry, depth estimation, sensor-processors, keyframe anchoring, photometric observations, surface normal prior, distributed computation, camera motion
Authors
Shinjeong Kim, Ignacio Alzugaray, Callum Rhodes, Paul H. J. Kelly, Andrew J. Davison
Abstract
Images composed of 2D pixel arrays are the standard input to computer vision algorithms, yet many underlying computations can be distributed across pixels. Transmitting raw, redundant, and noisy pixel data off the sensor remains inefficient, motivating a shift toward focal-plane sensor-processors that perform a significant part of the computation directly within each pixel. We envision pixels synthesizing higher-level signals locally, reducing downstream load, and providing richer inputs for higher-level vision tasks. We propose a fully parallelizable form of visual odometry and depth estimation across pixels, where sensor-processors exchange information through Gaussian Belief Propagation (GBP) to achieve consensus about camera motion and infer depth from per-pixel photometric observations and a surface normal prior. To maintain geometric stability during optimization, we introduce a keyframe-like anchoring mechanism that regulates the effective baseline between frames, enabling consistent motion and depth updates. Our method is evaluated on realistic datasets, demonstrating the feasibility of GBP-based pixel-level distributed odometry and depth estimation with keyframe anchoring on-sensor. Project Page: https://www.shinjeongkim.com/pixvod/
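To illustrate the kind of local message passing that GBP relies on, the following is a minimal toy sketch, not the authors' formulation. It assumes a 1D chain of scalar variables (standing in for a row of pixels), each with one noisy observation, linked to its neighbours by a smoothness factor. The function name `gbp_chain` and the parameters `sigma_obs` and `sigma_smooth` are hypothetical and chosen only for this example. Each node combines its own observation with the messages it receives, and on a chain (a tree-structured graph) the resulting means and variances match the exact Gaussian marginals once messages have travelled end to end.

```python
def gbp_chain(z, sigma_obs=1.0, sigma_smooth=0.5, iters=None):
    """Synchronous scalar GBP on a 1D chain; returns (means, variances).

    Toy model (an assumption for illustration): each node i observes z[i]
    with noise sigma_obs, and neighbouring nodes are tied together by a
    smoothness factor with noise sigma_smooth.
    """
    n = len(z)
    iters = n if iters is None else iters  # n - 1 rounds span the whole chain
    w = 1.0 / sigma_smooth ** 2            # precision of each pairwise factor
    p_lam = [1.0 / sigma_obs ** 2] * n     # observation precision per node
    p_eta = [zi * l for zi, l in zip(z, p_lam)]

    # Messages in information form (eta, lam); index i = message arriving at node i.
    fl_eta, fl_lam = [0.0] * n, [0.0] * n  # from the left neighbour
    fr_eta, fr_lam = [0.0] * n, [0.0] * n  # from the right neighbour

    for _ in range(iters):
        nfl_eta, nfl_lam = [0.0] * n, [0.0] * n
        nfr_eta, nfr_lam = [0.0] * n, [0.0] * n
        for i in range(n - 1):
            # Node i -> node i+1: use i's belief without the message from i+1,
            # then marginalise node i out of the pairwise factor.
            e, l = p_eta[i] + fl_eta[i], p_lam[i] + fl_lam[i]
            nfl_eta[i + 1], nfl_lam[i + 1] = w * e / (w + l), w * l / (w + l)
            # Node i+1 -> node i, symmetrically.
            e, l = p_eta[i + 1] + fr_eta[i + 1], p_lam[i + 1] + fr_lam[i + 1]
            nfr_eta[i], nfr_lam[i] = w * e / (w + l), w * l / (w + l)
        fl_eta, fl_lam, fr_eta, fr_lam = nfl_eta, nfl_lam, nfr_eta, nfr_lam

    # Final belief at each node: own observation plus both incoming messages.
    lam = [p_lam[i] + fl_lam[i] + fr_lam[i] for i in range(n)]
    eta = [p_eta[i] + fl_eta[i] + fr_eta[i] for i in range(n)]
    return [e / l for e, l in zip(eta, lam)], [1.0 / l for l in lam]


if __name__ == "__main__":
    means, variances = gbp_chain([0.0, 1.0, 4.0, 2.0])
    print(means, variances)
```

Every update in the inner loop uses only a node's own state and its neighbours' messages, which is the property that makes this style of inference a candidate for per-pixel execution. A 2D pixel grid contains loops, so GBP there is iterative and approximate rather than exact as in this chain example.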