Implementing Retrieval-Augmented Generation for Academic Libraries: A Technical Case Study using Azure AI

Wei Xuan

doi:10.23974/ijol.2026.vol11.1.601

Vol. 11 No. 1 (2026), Reports From the Field

Published 2026-03-31

Wei Xuan

University of Manitoba

https://orcid.org/0000-0002-3795-9640

How to Cite

Xuan, W. (2026). Implementing Retrieval-Augmented Generation for Academic Libraries: A Technical Case Study using Azure AI. International Journal of Librarianship, 11(1), 151–166. https://doi.org/10.23974/ijol.2026.vol11.1.601

Abstract

This article details the technical development of a Retrieval-Augmented Generation (RAG) system designed to enhance discovery within an academic library's institutional repository. Conducted during a six-month research leave in 2025, this project explores the practical application of emerging cloud-based AI tools in a library context. We developed a prototype that integrates the University of Manitoba’s MSpace repository with Microsoft Azure AI services. The system uses an OAI-PMH harvester to retrieve metadata, generates semantic vector embeddings via the text-embedding-ada-002 model, and indexes these vectors in Azure AI Search. A custom front-end application supports both traditional keyword search and generative, context-aware chat interactions. This paper documents the development environment, the script logic, and the specific technical challenges overcome, such as OAI-PMH pagination errors and API versioning conflicts, and thereby provides a reproducible roadmap for libraries seeking to explore semantic search technologies.
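The harvesting step and the pagination issue mentioned above can be illustrated with a minimal sketch. It is not the article's script: the function names `harvest_titles` and `http_get`, the choice of the `oai_dc` metadata prefix, and the decision to extract only identifiers and titles are assumptions made for illustration. What the sketch relies on from the OAI-PMH 2.0 protocol itself is that a `ListRecords` response may carry a `resumptionToken`, that a follow-up request sends the token as its only argument besides `verb`, and that an empty token marks the final page.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, Tuple

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def http_get(url: str) -> bytes:
    """Default fetcher (hypothetical helper); replace with a stub when offline."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def harvest_titles(
    base_url: str,
    fetch: Callable[[str], bytes] = http_get,
    metadata_prefix: str = "oai_dc",  # assumption: plain Dublin Core records
) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, title) pairs, following resumptionTokens page by page."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        root = ET.fromstring(fetch(url))

        # Protocol-level errors arrive as an <error> element, not an HTTP status.
        error = root.find(OAI + "error")
        if error is not None:
            raise RuntimeError(f"OAI-PMH error {error.get('code')}: {error.text}")

        for record in root.iter(OAI + "record"):
            header = record.find(OAI + "header")
            if header is None or header.get("status") == "deleted":
                continue  # deleted records carry no metadata
            identifier = header.findtext(OAI + "identifier", default="")
            title = record.findtext(".//" + DC + "title", default="")
            yield identifier, title

        # A missing or empty token means this was the last page.
        token = root.findtext(".//" + OAI + "resumptionToken", default="")
        if not token.strip():
            break
        # The token is an exclusive argument: do not resend metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}
```

Under these assumptions, `list(harvest_titles(endpoint))` would walk every page of a repository's OAI-PMH endpoint, where `endpoint` is a placeholder for the repository's actual base URL. Passing the fetcher as a parameter is a design choice that lets the pagination logic be exercised against canned XML without network access.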

This work is licensed under a Creative Commons Attribution 4.0 International License.
