Date of Award

5-1-2024

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Jonathan Shihao Ji

Abstract

In this work, we present a novel hierarchical navigation policy for object navigation that leverages both object detection models and large language models (LLMs) to improve interpretation of, and interaction with, complex indoor environments. Our approach uses object detection to assess the surrounding space and a layout reconstruction strategy to model the environment's structure. By defining the navigation strategy hierarchically, we separate decision-making into long-term and short-term goals, building on the existing concept of "frontier-based goal selection." We refine this method by representing each frontier as a series of observations that object detection models convert into language. Each frontier is then scored by an LLM, allowing a reasoned selection of the most promising navigation targets. Our framework is simple yet effective: it suits dynamic and unknown environments and surpasses existing baselines in both efficiency and accuracy. Code can be found at https://github.com/weizhenFrank/ObjNav.
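
The frontier-scoring idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the `Frontier` type, the function names, and the `keyword_score` stand-in are hypothetical and chosen only for illustration. In particular, `keyword_score` replaces the LLM call with a tiny hand-written lookup table so that the example runs without any model or network access.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Frontier:
    """A candidate long-term goal on the edge of explored space (hypothetical type)."""
    position: Tuple[int, int]      # (x, y) map coordinates
    detected_objects: List[str]    # labels an object detector reported near the frontier


def describe_frontier(frontier: Frontier) -> str:
    """Turn detections into a short text description a language model could read."""
    if not frontier.detected_objects:
        return "Nothing recognizable is visible here."
    labels = sorted(set(frontier.detected_objects))
    return "Visible objects: " + ", ".join(labels) + "."


def select_frontier(
    frontiers: Sequence[Frontier],
    target: str,
    score_fn: Callable[[str, str], float],
) -> Frontier:
    """Return the frontier whose description receives the highest score for the target."""
    if not frontiers:
        raise ValueError("no frontiers to choose from")
    return max(frontiers, key=lambda f: score_fn(target, describe_frontier(f)))


def keyword_score(target: str, description: str) -> float:
    """Assumed stand-in for an LLM scorer: counts hint words from a hand-written table."""
    hints = {
        "bed": {"pillow", "nightstand", "lamp"},
        "toilet": {"sink", "towel", "bathtub"},
    }
    cleaned = description.lower()
    for ch in ",.:":
        cleaned = cleaned.replace(ch, " ")
    return float(len(hints.get(target, set()) & set(cleaned.split())))


frontiers = [
    Frontier(position=(1, 2), detected_objects=["sofa", "tv"]),
    Frontier(position=(5, 0), detected_objects=["pillow", "lamp"]),
]
best = select_frontier(frontiers, "bed", keyword_score)
```

Passing the scorer as `score_fn` keeps the selection logic independent of how scores are produced, so a real LLM-backed scorer could be substituted for the lookup table without changing `select_frontier`.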

DOI

https://doi.org/10.57709/36972831

Recommended Citation

Liu, Weizhen, "Towards Vision and Language Models Aided Object Navigation." Thesis, Georgia State University, 2024.
doi: https://doi.org/10.57709/36972831
