Date of Award

Fall 11-24-2021

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Shihao Ji

Abstract

Current stereo matching techniques are challenged by restricted search space, occluded regions, and sheer size. While monocular depth estimation is spared from these challenges and can achieve satisfactory results with monocular cues, the lack of a stereoscopic relationship renders the monocular prediction less reliable on its own, especially in highly dynamic or cluttered environments. To address these issues in both scenarios, an optic-chiasm-inspired self-supervised binocular depth estimation method is proposed in this thesis, wherein a vision transformer with a gated positional cross-attention layer is designed to enable feature-sensitive pattern retrieval between views while retaining the extensive context information aggregated through self-attention. This crossover design is biologically analogous to the optic chiasm structure in the human visual system, hence the name ChiTransformer. It leverages the strengths of both monocular and binocular approaches. Our experiments show that this architecture improves on self-supervised stereo approaches by 15% and can be used on both rectilinear and fisheye images.

DOI

https://doi.org/10.57709/26632434

Recommended Citation

Su, Qing, "ChiTransformer: Towards Reliable Stereo from Cues." Thesis, Georgia State University, 2021.
doi: https://doi.org/10.57709/26632434
