Abstract:
Existing embodied learning approaches suffer from limited cross-scene generalization, low exploration efficiency, and policy planning that converges to local optima. To alleviate these issues, we propose a novel iterative embodied learning paradigm based on a large language model (LLM). The paradigm establishes a closed-loop feedback system with the LLM as the core decision-maker and decouples the complex embodied exploration problem into two cooperating parts: high-level planning and low-level path execution. The LLM generates two policies: an environment selection policy, which chooses the environments used for training, and an exploration policy, which guides the agent in exploring each environment based on the perception prompt. The perception model is then retrained on the data collected in each iteration, so its performance improves gradually over successive rounds. Experiments demonstrate that the proposed framework significantly enhances the model's environmental understanding ability.
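The closed loop described above can be illustrated with a minimal sketch. It is not the proposed implementation: `StubLLMPlanner`, `execute_path`, `iterative_embodied_learning`, the per-environment score dictionary, and the 10% toy update rule are all hypothetical stand-ins introduced here, and a real system would query an actual LLM and retrain a real perception model.

```python
from typing import Dict, List, Tuple


class StubLLMPlanner:
    """Hypothetical stand-in for the LLM decision-maker (no real LLM is called)."""

    def select_environment(self, scores: Dict[str, float]) -> str:
        # Environment selection policy (assumed heuristic):
        # pick the environment where perception is currently weakest.
        return min(scores, key=scores.get)

    def plan_exploration(self, env: str, perception_prompt: str) -> List[str]:
        # Exploration policy (high-level planning): an ordered list of waypoints.
        # A real LLM would condition on perception_prompt; the stub ignores it.
        return [f"{env}/waypoint_{i}" for i in range(3)]


def execute_path(waypoints: List[str]) -> List[str]:
    # Low-level path execution: one collected observation per waypoint.
    return [f"obs@{w}" for w in waypoints]


def iterative_embodied_learning(
    planner: StubLLMPlanner, scores: Dict[str, float], rounds: int
) -> Tuple[Dict[str, float], List[str]]:
    scores = dict(scores)  # do not mutate the caller's dictionary
    history: List[str] = []
    for _ in range(rounds):
        env = planner.select_environment(scores)
        prompt = f"Perception score in {env}: {scores[env]:.2f}"
        waypoints = planner.plan_exploration(env, prompt)
        observations = execute_path(waypoints)
        # Toy "retraining" (assumption): each observation closes 10% of the
        # remaining gap between the current score and a perfect score of 1.0.
        for _ in observations:
            scores[env] += 0.1 * (1.0 - scores[env])
        history.append(env)
    return scores, history


if __name__ == "__main__":
    final_scores, visited = iterative_embodied_learning(
        StubLLMPlanner(), {"kitchen": 0.2, "office": 0.6}, rounds=4
    )
    print(visited)
    print(final_scores)
```

In this sketch the feedback is closed through the score dictionary: the scores updated in one round determine which environment the planner selects in the next.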