Enhanced Natural Language Annotation and Query for Semantic Mapping in Visual SLAM Using Large Language Models
Abstract
Traditional visual Simultaneous Localization and Mapping (SLAM) systems generate geometric representations that lack the semantic understanding essential for intuitive human-robot interaction. This research presents a framework that enhances visual SLAM through large language model integration, enabling natural language annotation generation and spatial query processing. The methodology incorporates multimodal feature extraction and fusion mechanisms that combine visual geometric information with semantic understanding to create contextually rich environmental representations. The system uses attention-weighted concatenation to integrate RGB-D sensor data with transformer-based language processing, generating hierarchical natural language descriptions of spatial environments. User queries are processed by natural language understanding modules that extract spatial intent and enable conversational interaction with robotic mapping systems. Experimental evaluation on a dataset of 15,000 RGB-D sequences shows substantial improvements over traditional approaches, with 84.7% semantic annotation accuracy and an 89.2% query processing success rate. The system maintains competitive geometric accuracy, with an average trajectory error of 0.029 m, while providing enhanced semantic capabilities. An average response time of 23.6 ms satisfies real-time processing requirements, enabling practical deployment in interactive robotic applications. Ablation studies confirm the necessity of each major component, with large language model integration providing the largest improvements in semantic quality and query handling. This research establishes foundations for next-generation language-enabled robotic navigation systems that facilitate intuitive spatial communication between humans and autonomous systems.
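The abstract names attention-weighted concatenation as the fusion step but does not give its formulation. The following is a minimal sketch of one common reading, not the authors' implementation: each modality's feature vector receives a scalar relevance score (here a scaled dot product with a query vector), the scores are normalized with a softmax, and the reweighted vectors are concatenated. The function names, the modality names `rgb`, `depth`, and `text`, the query vector, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scalar scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weighted_concat(
    features: Dict[str, Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], Dict[str, float]]:
    """Fuse per-modality feature vectors (assumed formulation).

    Each modality is scored by a scaled dot product with `query`,
    the scores are turned into weights with a softmax, and each
    vector is multiplied by its weight before concatenation.
    """
    names = list(features)
    scale = math.sqrt(len(query))
    scores = []
    for name in names:
        vec = features[name]
        if len(vec) != len(query):
            raise ValueError(f"{name}: expected {len(query)} dims, got {len(vec)}")
        scores.append(sum(a * b for a, b in zip(vec, query)) / scale)

    weights = softmax(scores)

    fused: List[float] = []
    for name, weight in zip(names, weights):
        fused.extend(weight * x for x in features[name])
    return fused, dict(zip(names, weights))


# Hypothetical 4-dimensional embeddings for three modalities.
features = {
    "rgb": [0.2, 0.9, 0.1, 0.4],
    "depth": [0.7, 0.1, 0.3, 0.2],
    "text": [0.5, 0.5, 0.8, 0.1],
}
query = [1.0, 0.0, 1.0, 0.0]  # hypothetical context vector

fused, weights = attention_weighted_concat(features, query)
print(weights)     # per-modality attention weights, summing to 1
print(len(fused))  # concatenated length: 3 modalities x 4 dims
```

In this sketch the fused vector keeps every modality's dimensions, so downstream layers can still tell the sources apart, while the softmax weights control how strongly each source contributes. A learned scoring network would normally replace the fixed query vector.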