Workflow for completing natural-language request with metric-semantic representation of environment

  • Nguyen Van Hung, Institute of Automation, Academy of Military Science and Technology
  • Truong Xuan Tung, Le Quy Don Technical University
  • Le Viet Hong, Institute of Automation, Academy of Military Science and Technology
  • Le Khanh Thanh, Institute of Automation, Academy of Military Science and Technology
Keywords: Natural-language request; Path planning; Task planning; Metric-semantic map; 3D scene graph.

Abstract

In mobile robotics and autonomous systems, a natural-language request can be completed by converting it into high-level and low-level tasks. Accomplishing such a request requires implementing both types of tasks, along with an efficient method to bridge them, and this remains an open problem. This work presents a two-phase workflow (Figure 1), consisting of Comprehension and Implementation, that is based on a metric-semantic map. In the Comprehension phase, also known as automated planning, the natural-language request is converted into actionable plans using semantic information from the map. These plans are then passed to the Implementation phase, where tasks such as navigation or manipulation are executed using geometric information from the map. We also conduct an experiment that illustrates how a natural-language request is carried out on a specific metric-semantic representation of the environment, namely a 3D Scene Graph, covering the complete sequence from creating the 3D Scene Graph to obtaining a feasible output path. Finally, this work highlights limitations that need to be addressed in the future to improve the proposed workflow.
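
The two-phase idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy scene graph, the node and function names (`Node`, `comprehend`, `implement`, `shortest_room_path`), the keyword matching used for Comprehension, and the breadth-first search used for Implementation are all assumptions chosen for brevity. The sketch only shows the division of labour described above: the first phase reads semantic labels from the map, and the second reads metric positions.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A scene-graph node: a semantic label plus a metric position (metres)."""
    name: str
    label: str       # semantic layer, e.g. "room" or "cup"
    position: tuple  # geometric layer, (x, y)


# Hypothetical toy scene graph: rooms, objects, and two kinds of edges.
ROOMS = {
    "hall": Node("hall", "room", (0.0, 0.0)),
    "corridor": Node("corridor", "room", (4.0, 0.0)),
    "kitchen": Node("kitchen", "room", (8.0, 0.0)),
}
OBJECTS = {"cup_1": Node("cup_1", "cup", (8.5, 1.0))}
CONTAINS = {"cup_1": "kitchen"}  # object -> room that contains it
ADJACENT = {                     # traversable room-to-room edges
    "hall": ["corridor"],
    "corridor": ["hall", "kitchen"],
    "kitchen": ["corridor"],
}


def comprehend(request):
    """Comprehension stand-in: map a request to a symbolic plan
    using only semantic labels (simple keyword matching)."""
    words = request.lower().replace(".", " ").split()
    for obj in OBJECTS.values():
        if obj.label in words:
            return [("navigate", CONTAINS[obj.name]), ("approach", obj.name)]
    return []  # no known object mentioned


def shortest_room_path(start, goal):
    """Breadth-first search over room adjacency edges."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in ADJACENT[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    raise ValueError(f"no route from {start} to {goal}")


def implement(plan, start_room):
    """Implementation stand-in: turn the symbolic plan into metric
    waypoints using only node positions."""
    waypoints = [ROOMS[start_room].position]
    current = start_room
    for action, target in plan:
        if action == "navigate":
            rooms = shortest_room_path(current, target)
            waypoints += [ROOMS[r].position for r in rooms[1:]]
            current = target
        elif action == "approach":
            waypoints.append(OBJECTS[target].position)
    return waypoints


plan = comprehend("Bring me the cup.")
path = implement(plan, start_room="hall")
```

Here `plan` holds symbolic steps that name a room and an object, while `path` holds metric waypoints. A real system would presumably replace the keyword matcher with a proper planner and the room-level search with a geometric path planner, but the interface between the two phases would keep the same shape.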

Published
2025-04-15
Section
Electronics & Automation