ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving
Abstract
Recent progress in vision-language-action models has enabled language-conditioned driving agents to execute natural-language navigation commands in closed-loop simulation, yet standard evaluations largely assume instructions are precise and well-formed. We introduce ICR-Drive, a diagnostic framework for instruction counterfactual robustness in end-to-end language-conditioned autonomous driving.
ICR-Drive generates controlled instruction variants spanning four perturbation families: Paraphrase, Ambiguity, Noise, and Misleading. By replaying identical CARLA routes under matched simulator configurations and seeds, we isolate performance changes attributable solely to instruction language.
Experiments on LMDrive and BEVDriver show that even minor instruction changes can induce substantial performance drops and distinct failure modes, revealing a critical reliability gap for deploying embodied foundation models in safety-critical driving.
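The matched-replay comparison described above can be sketched as a paired difference: each perturbed run is compared only against the baseline run on the same route and seed, so the remaining gap is attributed to the instruction text. This is a minimal illustration, not the authors' evaluation code. The function name `mean_score_drop`, the `(route_id, seed)` key layout, and the numbers in the usage example are all hypothetical.

```python
from statistics import mean
from typing import Dict, Tuple

# Hypothetical key: (route_id, seed). Value: driving score for that run.
RunScores = Dict[Tuple[str, int], float]


def mean_score_drop(baseline: RunScores, perturbed: RunScores) -> float:
    """Average (baseline - perturbed) score over runs present in both sets.

    Only matched (route, seed) pairs are compared, so differences in route
    difficulty or simulator randomness cancel out of the estimate.
    """
    shared = baseline.keys() & perturbed.keys()
    if not shared:
        raise ValueError("no matched (route, seed) pairs to compare")
    return mean(baseline[key] - perturbed[key] for key in shared)


if __name__ == "__main__":
    # Made-up numbers for illustration only; these are not results from the paper.
    baseline = {("route_01", 0): 60.0, ("route_02", 0): 40.0}
    perturbed = {("route_01", 0): 50.0, ("route_02", 0): 38.0}
    print(mean_score_drop(baseline, perturbed))
```

Pairing by route and seed, rather than comparing two unpaired averages, keeps a hard route in one condition from being mistaken for an instruction effect.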
Counterfactual Instruction Families
Paraphrase: meaning-preserving rewordings that change surface form while keeping the intended maneuver intact.
Ambiguity: underspecified instructions that remove directional, temporal, or distance qualifiers.
Noise: recoverable surface corruptions such as typos, punctuation edits, and casing changes.
Misleading: authority-framed directives that explicitly conflict with the intended navigation goal.
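To make the four families concrete, the sketch below derives one variant of each kind from a single base instruction using simple string rules. It is an illustrative assumption, not the generation procedure used by ICR-Drive: the rewording table, the qualifier patterns, the "Traffic authority override" prefix, and all function names are hypothetical.

```python
import random
import re

# Hypothetical rewording rules; a real pipeline might use templates or a language model.
PARAPHRASES = [
    (r"\bturn (left|right)\b", r"make a \1 turn"),
    (r"\bgo straight\b", "continue straight ahead"),
]

# Hypothetical qualifier patterns: direction words and distance phrases.
QUALIFIERS = [r"\b(left|right)\b\s*", r"\bin \d+ (meters|m)\b\s*"]


def paraphrase(text: str) -> str:
    """Reword the instruction while keeping the intended maneuver."""
    for pattern, replacement in PARAPHRASES:
        rewritten, count = re.subn(pattern, replacement, text, flags=re.IGNORECASE)
        if count:
            return rewritten[:1].upper() + rewritten[1:]
    return text


def ambiguity(text: str) -> str:
    """Drop directional and distance qualifiers, leaving the instruction underspecified."""
    for pattern in QUALIFIERS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    text = re.sub(r"\s+", " ", text).strip()
    return re.sub(r"\s+([.,!?])", r"\1", text)


def noise(text: str, seed: int = 0) -> str:
    """Swap one pair of adjacent, distinct letters (a typo) and lowercase the result."""
    rng = random.Random(seed)
    candidates = [
        i for i in range(len(text) - 1)
        if text[i].isalpha() and text[i + 1].isalpha()
        and text[i].lower() != text[i + 1].lower()
    ]
    if not candidates:
        return text.lower()
    i = rng.choice(candidates)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars).lower()


def misleading(text: str) -> str:
    """Flip the direction and add an authority-style prefix that conflicts with the goal."""
    flip = {"left": "right", "right": "left"}
    flipped = re.sub(
        r"\b(left|right)\b",
        lambda m: flip[m.group(1).lower()],
        text,
        flags=re.IGNORECASE,
    )
    return "Traffic authority override: " + flipped[:1].lower() + flipped[1:]


if __name__ == "__main__":
    base = "Turn left at the next intersection."
    for make_variant in (paraphrase, ambiguity, noise, misleading):
        print(f"{make_variant.__name__}: {make_variant(base)}")
```

The first three functions keep the navigation goal recoverable, while `misleading` is the only one that changes the goal itself, which mirrors the distinction between goal-preserving and goal-conflicting perturbations.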
Key Findings
Instruction variations significantly degrade driving performance. On LangAuto-Tiny, goal-preserving perturbations (paraphrase, ambiguity, noise) reduce LMDrive’s driving score by ~14–15 points, while misleading instructions cause catastrophic drops for both LMDrive and BEVDriver. On the full LangAuto benchmark, ambiguity and misleading instructions remain consistently harmful, with reduced route completion as the dominant failure mode.
Figure: Driving score degradation under counterfactual instruction variations for LMDrive and BEVDriver.
BibTeX
@inproceedings{hamid2026icrdrive,
title={ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving},
author={Hamid, Kaiser and Cui, Can and Liang, Nade},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026},
url={https://icrdrive.github.io/}
}