
TA-VLA: Elucidating the Design Space of
Torque-aware Vision-Language-Action Models

CoRL 2025

Zongzheng Zhang*1 · Haobo Xu*2 · Zhuo Yang*1 ·
Chenghao Yue1 · Zehao Lin1 · Huan-ang Gao1 · Ziwei Wang3 · Hao Zhao1,2

1 Beijing Academy of Artificial Intelligence (BAAI),
2 Institute for AI Industry Research (AIR), Tsinghua University,
3 Nanyang Technological University
(* indicates equal contribution)

Project Page | arXiv | Code


This repository provides the implementation of TA-VLA: Elucidating the Design Space of Torque-aware Vision-Language-Action Models on openpi. It is branched from the original repository at commit cd82848. Please refer to the original repository for environment setup and training details. The following focuses only on the differences from the upstream repository.

Torque Input Types

The file src/openpi/shared/effort_type.py defines the ways in which torque information is fed into openpi, covering all the experiments described in the paper.
The types are:

  • NO
    No effort is fed to the model, but TavlaInputs still processes the effort field so that norm_stats can be computed.
    Used for the baseline model.

  • STATE
    Inserts the current effort into the last 14 state dimensions (state[-14:]) so that the action expert takes it into account.
    Corresponds to the DePre method in Section 4.1.

  • LLM
    Projects the current effort into a token and passes it to the LLM along with the image and language tokens.
    Projector MLP: Linear(in, 2*w) -> swish -> Linear(2*w, w); see the sketch after this list.
    Corresponds to the Enc method in Section 4.1.

  • LLM_HIS_C
    Concatenates current and historical effort, projects it into a token, and passes it to the LLM.
    Corresponds to Enc-1 in Section 4.2.

  • LLM_HIS_T
    Projects current and historical effort into tokens separately and passes them to the LLM.
    Corresponds to Enc-H in Section 4.2.

  • EXPERT
    Projects effort into a token and passes it to the action expert (a component of the LLM) together with state and action tokens.
    Corresponds to DePost in Section 4.1.

  • EXPERT_HIS_C
    Concatenates current and historical effort, projects it into a token, and passes it to the action expert.
    Corresponds to Dec-1 in Section 4.2.

  • EXPERT_HIS_T
    Projects current and historical effort into tokens separately and passes them to the action expert.
    Corresponds to Dec-H in Section 4.2.

  • EXPERT_FUT
    Not an input type per se, but predicts future effort along with actions.
    Corresponds to Sections 5 and 6 ($π_0$ + obj).

  • EXPERT_HIS_C_FUT
    Inputs concatenated historical effort to the action expert and outputs future effort.
    Corresponds to Sections 5 and 6 ($π_0$ + obs + obj).

  • EXPERT_HIS_C_L_FUT
    Inputs the concatenated historical effort as the last token and outputs future effort.
    A positional variant of the previous type; in our tests it did not improve performance.

Note: These torque-handling implementations have only been tested on $π_0$ and may not be compatible with $π_0$-FAST.
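For reference, the projector MLP described under the LLM type can be sketched in Flax roughly as follows. The class name EffortProjector and the width field are illustrative and not the exact identifiers used in src/openpi/shared/effort_type.py.

```python
import flax.linen as nn
import jax.numpy as jnp


class EffortProjector(nn.Module):
    """Illustrative sketch of the effort projector: Linear(in, 2*w) -> swish -> Linear(2*w, w)."""

    width: int  # embedding width of the target token stream (LLM or action expert)

    @nn.compact
    def __call__(self, effort: jnp.ndarray) -> jnp.ndarray:
        # effort: [batch, effort_dim], e.g. 14 joint torques per frame
        x = nn.Dense(2 * self.width)(effort)
        x = nn.swish(x)
        x = nn.Dense(self.width)(x)
        return x  # [batch, width], consumed as a single extra token
```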

Dataset

As in the original openpi implementation, we use datasets in the standard lerobot format. The difference is that we expect an additional field observation.effort storing the per-frame joint torque, analogous to how observation.state stores per-frame joint angles. We provide a sample dataset for the button-pressing task here.
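As a rough illustration (field names follow the convention above; the 14-dimensional shapes assume a dual-arm setup and are examples only), a single frame is expected to carry:

```python
import numpy as np

# Illustrative per-frame layout; image keys, timestamps, etc. follow the
# standard lerobot format and are omitted here.
frame = {
    "observation.state":  np.zeros(14, dtype=np.float32),  # per-frame joint angles
    "observation.effort": np.zeros(14, dtype=np.float32),  # per-frame joint torques (additional field)
    "action":             np.zeros(14, dtype=np.float32),
}
```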

Training

Refer to the example configurations provided in src/openpi/training/config.py. When using effort inputs, be sure to pass the corresponding effort_history parameter.
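As a schematic only (the real configuration classes live in src/openpi/training/config.py and their field names may differ), the pairing of an effort type with a history length looks like this:

```python
from dataclasses import dataclass


@dataclass
class EffortSettings:
    # One of the torque input types listed above, e.g. "LLM_HIS_T" or "EXPERT_HIS_C".
    effort_type: str = "EXPERT_HIS_C"
    # Number of past effort frames fed to the model; only meaningful for the
    # history-based (*_HIS_*) types. The value 10 is purely illustrative.
    effort_history: int = 10
```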

Deployment

For data collection and model deployment, we use a modified version of the AgileX official example code. In addition to reading torque values from the ROS topic, this version maintains a historical torque buffer for policies that require past torque information.
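A minimal sketch of such a buffer, assuming a ROS 1 driver that publishes sensor_msgs/JointState (the topic name and history length below are assumptions; substitute those of your setup):

```python
from collections import deque

import numpy as np
import rospy
from sensor_msgs.msg import JointState

HISTORY_LEN = 10  # should match the policy's effort_history

# Rolling buffer holding the most recent joint-torque readings.
effort_buffer = deque(maxlen=HISTORY_LEN)


def joint_state_callback(msg: JointState) -> None:
    # JointState.effort carries the per-joint torque reported by the arm driver.
    effort_buffer.append(np.asarray(msg.effort, dtype=np.float32))


rospy.init_node("effort_buffer_node")
rospy.Subscriber("/joint_states", JointState, joint_state_callback, queue_size=1)
```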

Citation

If you find this project useful, feel free to cite our work!

@article{zhang2025ta,
  title={TA-VLA: Elucidating the Design Space of Torque-aware Vision-Language-Action Models},
  author={Zhang, Zongzheng and Xu, Haobo and Yang, Zhuo and Yue, Chenghao and Lin, Zehao and Gao, Huan-ang and Wang, Ziwei and Zhao, Hao},
  journal={arXiv preprint arXiv:2509.07962},
  year={2025}
}
