Visual Odometry on Nasal-Endoscopic Videos from Head and Neck Oncology

MSc assignment

Background

Head and neck cancer (HNC) is the sixth most common type of cancer worldwide, according to the GLOBOCAN 2022 estimates [1]. This cancer type includes all malignancies that occur in the oral cavity, the larynx and the pharynx [2,3].

The treatment of HNC frequently leads to permanent changes in patients' physiology and anatomy, often paired with permanent loss of swallowing and speech functions. This burden severely affects patients' quality of life [4].

Flexible naso-endoscopy is the front-line visual examination for suspected tumours of the upper aerodigestive tract and for surveillance after treatment, providing direct visualisation and biopsy access to the nasal cavity, pharynx and larynx.

Aim

Nasal-endoscopic navigation covers a short physical distance (≈25 cm from pharynx into the oesophagus), which suggests that loop closure and full simultaneous localisation and mapping (SLAM) may be unnecessary if drift remains negligible over this range. However, the head-and-neck region is visually hostile: specular highlights, deforming soft tissue (epiglottis, vocal folds), varying illumination and locally texture-poor mucosa all make feature tracking difficult.

This project tests whether lightweight visual odometry (VO) can track robustly and reconstruct locally consistent geometry over short segments and, if not, whether adding global optimisation (pose-graph optimisation and bundle adjustment, BA) materially improves outcomes.

Research questions

  1. RQ1 (Feasibility): Can monocular (or stereo, if available) VO maintain stable tracking over short nasal-to-oesophageal traversals without loop closure?
    • H1: Over ≲25 cm, VO drift stays below clinically tolerable thresholds for guidance (to be defined), making full SLAM unnecessary.
  2. RQ2 (Failure Modes): Which scene factors (specularities, tissue deformation near vocal folds/epiglottis, low texture) most degrade VO?
  3. RQ3 (Value of Global Optimisation): If VO performance is inadequate, does adding pose-graph optimisation and global bundle adjustment significantly reduce drift and mapping error?
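
To make "drift" in H1 and RQ3 concrete, one common choice is the end-point translation error expressed as a percentage of the path length. The sketch below is only an illustration under stated assumptions: a reference trajectory is available (for example from a phantom or an external tracker, neither of which is specified in this assignment), the estimated trajectory has already been brought to metric scale (monocular VO is only defined up to scale), and both trajectories are sampled at the same timestamps. The function names are hypothetical.

```python
import numpy as np


def path_length(positions):
    """Sum of Euclidean distances between consecutive positions (N x 3)."""
    steps = np.diff(positions, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())


def translational_drift(estimated, reference):
    """Return (end-point error, error as % of reference path length).

    Assumptions (not from the assignment text): both inputs are N x 3
    arrays of camera positions in the same metric units, time-aligned,
    and expressed in frames with the same orientation.
    """
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(reference, dtype=float)
    if est.shape != ref.shape or est.ndim != 2 or est.shape[1] != 3:
        raise ValueError("trajectories must both have shape (N, 3)")

    # Shift both trajectories so that they start at the origin.
    est = est - est[0]
    ref = ref - ref[0]

    end_error = float(np.linalg.norm(est[-1] - ref[-1]))
    length = path_length(ref)
    if length == 0.0:
        raise ValueError("reference trajectory has zero length")
    return end_error, 100.0 * end_error / length


# Synthetic example: a straight 250 mm insertion along z, with the
# estimate drifting sideways by 5 mm at the end of the traversal.
reference = np.zeros((26, 3))
reference[:, 2] = np.linspace(0.0, 250.0, 26)
estimated = reference.copy()
estimated[:, 0] = np.linspace(0.0, 5.0, 26)

end_error_mm, drift_percent = translational_drift(estimated, reference)
```

The clinically tolerable threshold for H1 is still to be defined; a metric of this kind only provides the number that would be compared against it.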

Expected outcomes

  1. A clear “go/no-go” on VO-only for short-range endoscopic navigation, with quantified drift.
  2. A ranked list of degraders (e.g., specularities > deformation > low texture) with supporting evidence.
  3. If warranted, a demonstrated improvement from global optimisation, or a principled argument why it still fails (e.g., non-rigid loops, perceptual aliasing).
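
Ranking the degraders requires per-frame measurements that can be related to tracking failures. As a minimal sketch, the code below computes two simple covariates from an RGB frame: the fraction of pixels that look specular (bright and nearly colourless) and a crude texture score (mean gradient magnitude). The thresholds, function names and the choice of these two proxies are assumptions made for illustration and would need tuning on real endoscopic data. Tissue deformation is not captured by this sketch.

```python
import numpy as np


def specular_fraction(rgb, value_thresh=0.9, sat_thresh=0.2):
    """Fraction of pixels that are bright and weakly saturated.

    rgb: H x W x 3 array with values in 0..255.
    The thresholds are illustrative assumptions, not tuned values.
    """
    img = np.asarray(rgb, dtype=float) / 255.0
    vmax = img.max(axis=2)  # HSV value channel
    vmin = img.min(axis=2)
    # HSV saturation, defined as 0 where the pixel is black.
    sat = np.where(vmax > 0, (vmax - vmin) / np.maximum(vmax, 1e-12), 0.0)
    mask = (vmax >= value_thresh) & (sat <= sat_thresh)
    return float(mask.mean())


def texture_score(rgb):
    """Mean gradient magnitude of the grey-level image (higher = more texture)."""
    gray = np.asarray(rgb, dtype=float).mean(axis=2)
    gy, gx = np.gradient(gray)
    return float(np.hypot(gx, gy).mean())


# Synthetic example: uniform reddish "mucosa" with one white highlight.
flat_frame = np.empty((10, 10, 3), dtype=np.uint8)
flat_frame[:] = (150, 60, 60)
highlight_frame = flat_frame.copy()
highlight_frame[0:2, 0:5] = 255  # 10 of 100 pixels are saturated white

flat_specular = specular_fraction(flat_frame)
highlight_specular = specular_fraction(highlight_frame)
flat_texture = texture_score(flat_frame)
highlight_texture = texture_score(highlight_frame)
```

Covariates like these could then be compared between frames where tracking is lost and frames where it is maintained.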

Deliverables

  1. Code repository (readme, environment, scripts to reproduce figures/tables).
  2. Short demo video showing tracking overlays in each stratum (A/B/C).
  3. Report covering methods, metrics, results, and recommendations for clinical feasibility.
  4. Appendix: Calibration notes, parameter settings, failure case gallery.

References

[1] Bray, Freddie, et al. "Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries." CA: a cancer journal for clinicians 74.3 (2024): 229-263.
[2] Rettig, Eleni M., and Gypsyamber D’Souza. "Epidemiology of head and neck cancer." Surgical oncology clinics 24.3 (2015): 379-396.
[3] Gormley, Mark, et al. "Reviewing the epidemiology of head and neck cancer: definitions, trends and risk factors." British Dental Journal 233.9 (2022): 780-786.
[4] Howren, M. Bryant, et al. "Psychological factors associated with head and neck cancer treatment and survivorship: evidence and opportunities for behavioral medicine." Journal of consulting and clinical psychology 81.2 (2013): 299.