Same Weights, Different Robot: A Deployment Safety View of VLA Policies
2026-06-02 • Cryptography and Security
AI summary
The author explains that for vision-language-action (VLA) robot policies, using the same model checkpoint does not guarantee the same physical behavior. The model emits normalized outputs, and the action unnormalization and controller conventions applied afterward can change the executed action in ways that are not visible from the checkpoint alone. The paper calls this a 'deployment-safety gap' and proposes treating action normalization metadata as part of the robot policy definition. In replay experiments, substituting a plausible sibling metadata key sharply reduces task success (for example, from 28/28 to 2/28 on LIBERO-Goal), which indicates that this metadata should be checked before rollout.
vision-language-action (VLA) policies, checkpoint, action normalization, deployment-safety gap, executable policy, quantile normalization, metadata, robot controller, policy rollout, semantic drift
Authors
Jianwei Tai
Abstract
Vision-language-action (VLA) policies are often treated as checkpoint-defined objects: if the weights, prompt, and benchmark suite match, the deployment is assumed to be the same policy. Robot execution breaks this assumption because the same normalized model output can become a different physical action after action unnormalization and controller conventions are applied. This creates a deployment-safety gap: safety review can certify the checkpoint while missing the executable robot policy that reaches the controller. We formalize this gap as an executable policy specification problem: a VLA policy includes the learned model, action representation, metadata-selected unnormalizer, and controller-facing conventions. Under this view, identical checkpoints can be executable-inequivalent. For quantile-style action normalization, we derive a closed-form metadata mismatch transform and an ExecSpec certificate that measures action-space semantic drift without model inference or rollout. On LIBERO-Goal replay, substituting a plausible sibling metadata key yields mean drift 0.199 over six non-gripper action dimensions and reduces success from 28/28 to 2/28 under full substitution. On LIBERO-Spatial replay, the same substituted key reduces success from 26/26 to 0/26. The same full-substitution protocol gives 0/28 success for all four Object substitutions and 0/23 or 1/23 success on Long. Identity-key, replay-validity, no-op filtering, raw-vs-correct replay, mask/gripper, synthetic upper-bound, and OpenVLA-style unnormalizer interface checks rule out several simpler explanations. These results do not certify closed-loop or hardware safety. They support a narrower deployment-safety view: action-space metadata is part of the executable policy and should be checked before rollout.
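To make the metadata mismatch concrete, here is a minimal sketch under stated assumptions; it is not the paper's ExecSpec certificate. It assumes an OpenVLA-style quantile unnormalizer, a = 0.5 (x + 1)(q99 - q01) + q01, applied per action dimension. Under that assumption, swapping the metadata key turns the intended action into an affine function of itself, so a drift figure can be computed from the stored quantiles alone, without model inference or rollout. The names `QuantileStats`, `unnormalize`, `mismatch_affine`, and `mean_drift`, the evaluation grid, and the example quantiles are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class QuantileStats:
    """Per-dimension action quantiles stored under one metadata key (hypothetical layout)."""
    q01: Sequence[float]
    q99: Sequence[float]


def unnormalize(x: Sequence[float], stats: QuantileStats) -> List[float]:
    """Map normalized outputs in [-1, 1] to physical actions (assumed OpenVLA-style formula)."""
    return [0.5 * (xi + 1.0) * (hi - lo) + lo
            for xi, lo, hi in zip(x, stats.q01, stats.q99)]


def mismatch_affine(correct: QuantileStats, wrong: QuantileStats) -> List[Tuple[float, float]]:
    """Per-dimension (scale, offset) such that a_wrong = scale * a_correct + offset.

    Assumes q99 != q01 in every dimension of the correct statistics.
    """
    coeffs = []
    for lo, hi, lo_w, hi_w in zip(correct.q01, correct.q99, wrong.q01, wrong.q99):
        scale = (hi_w - lo_w) / (hi - lo)
        offset = lo_w - scale * lo
        coeffs.append((scale, offset))
    return coeffs


def mean_drift(correct: QuantileStats, wrong: QuantileStats,
               grid: Sequence[float] = (-1.0, -0.5, 0.0, 0.5, 1.0)) -> float:
    """Mean absolute action difference over a grid of normalized outputs.

    This is an illustrative drift measure, not the definition used in the paper.
    """
    total, count = 0.0, 0
    for x in grid:
        point = [x] * len(correct.q01)
        a_ok = unnormalize(point, correct)
        a_bad = unnormalize(point, wrong)
        for u, v in zip(a_ok, a_bad):
            total += abs(u - v)
            count += 1
    return total / count


# Made-up quantiles for two action dimensions; only the first differs between keys.
intended = QuantileStats(q01=[-1.0, -0.5], q99=[1.0, 0.5])
sibling = QuantileStats(q01=[-0.5, -0.5], q99=[0.5, 0.5])

model_output = [0.5, -1.0]  # same normalized output from the same checkpoint
action_intended = unnormalize(model_output, intended)
action_executed = unnormalize(model_output, sibling)
drift = mean_drift(intended, sibling)
```

With these made-up quantiles, the first action dimension is halved under the sibling key while the second is unchanged, so identical weights produce a different command at the controller. Comparing the key against itself gives zero drift, which mirrors the role of the identity-key check mentioned in the abstract.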