Abstract: Recently, with the development of Large Language Models (LLMs), Embodied AI represented by Vision-Language-Action Models (VLAs) has played a significant role in realizing the natural ...