Client · Social Services · 2025

Union Document AI Assistant

An AI-powered retrieval-augmented generation (RAG) assistant built for an Ontario social workers union, allowing staff to ask natural-language questions over union agreements, policies, and procedural documents instead of manually searching through lengthy materials.

Python · LangChain · FAISS · OpenAI · RAG · Vector Search

The problem

The organization relied on large collections of union agreements, policy documents, procedural references, and internal materials that were difficult to navigate efficiently during day-to-day work. Traditional keyword search was often ineffective because users did not always know the exact terminology or document location associated with their question. The goal was to make important information faster and easier to access for frontline social workers and staff.

Key decisions

Semantic retrieval instead of keyword matching

The system used embeddings and vector similarity search so users could ask questions naturally rather than relying on exact document terminology. This significantly improved retrieval quality for conversational, incomplete, or ambiguous queries and made the system more accessible to non-technical users.
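As a rough illustration of the idea, the sketch below ranks passages by cosine similarity between a query vector and passage vectors. It is not the project's code: the three-dimensional vectors are hand-made stand-ins for real embedding model output, the passage titles are invented for the example, and `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, passages, k=2):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# ASSUMPTION: toy vectors and invented titles. A real system would obtain
# high-dimensional vectors from an embedding model instead.
PASSAGES = [
    ("Article 12: Vacation entitlement and scheduling", [0.9, 0.1, 0.0]),
    ("Article 20: Grievance procedure timelines", [0.0, 0.2, 0.9]),
    ("Policy 4: Mileage and travel reimbursement", [0.1, 0.9, 0.1]),
]

# Stand-in vector for a question such as "how much time off do I get?"
query_vec = [0.8, 0.2, 0.1]
results = top_k(query_vec, PASSAGES, k=2)

for score, text in results:
    print(f"{score:.3f}  {text}")
```

The question shares no keywords with the vacation article, yet that passage ranks first because its vector points in nearly the same direction as the query vector.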

Lightweight vector infrastructure with FAISS

FAISS was selected as the vector retrieval layer because it provided fast semantic search without requiring additional external infrastructure. For the scale and deployment requirements of the project, it offered an effective balance between simplicity, performance, and maintainability.

Retrieval quality and chunking strategy

A major focus of the project was improving retrieval quality through chunking strategy, document preprocessing, and context assembly rather than relying solely on the language model. In practice, the usefulness of a RAG system is often determined more by retrieval architecture than by model selection alone.
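One common chunking approach is a fixed-size window with overlap, so that a clause split across a boundary still appears whole in at least one chunk. The sketch below shows that technique on whitespace-separated words. It is an assumption for illustration only: the source does not describe the project's actual chunking rules, and `chunk_words` and its default sizes are hypothetical.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to chunk_size words, each sharing
    `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks

# Placeholder text: twelve tokens w0..w11 instead of a real document.
sample = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(sample, chunk_size=5, overlap=2)

for chunk in chunks:
    print(chunk)
```

For structured documents such as agreements, splitting on article or section boundaries before applying a size limit is another option worth comparing against a plain sliding window.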

Grounded responses with source attribution

The assistant was designed to provide responses grounded in retrieved source material rather than generating unsupported answers. Retrieved passages and references were surfaced alongside responses to improve transparency and help users verify important policy or agreement details when needed.
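A hedged sketch of how retrieved passages might be assembled into a grounded prompt with citable source labels. The `Passage` class, the `build_grounded_prompt` function, the prompt wording, and the sample passages are all hypothetical; the source does not show the project's actual prompt or data model.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    source: str  # human-readable reference, e.g. document title and section
    text: str    # retrieved chunk content

def build_grounded_prompt(question, passages):
    """Return (prompt, citations). The prompt restricts the model to the
    numbered sources; citations can be displayed next to the answer."""
    if not passages:
        raise ValueError("at least one retrieved passage is required")

    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(passages, start=1)
    )
    prompt = (
        "Answer using only the numbered sources below and cite them like [1]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )
    citations = [f"[{i}] {p.source}" for i, p in enumerate(passages, start=1)]
    return prompt, citations

# ASSUMPTION: invented passages for illustration only.
retrieved = [
    Passage("Collective Agreement, Article 12", "Vacation requests are submitted in writing."),
    Passage("HR Policy 4", "Travel claims require receipts."),
]
prompt, citations = build_grounded_prompt("How do I request vacation?", retrieved)

print(prompt)
print(citations)
```

Returning the citation list separately from the prompt lets the interface show the references alongside the answer, so a reader can check the original wording.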

Outcome

The system allowed staff to ask natural-language questions over complex union and policy documents and receive grounded responses tied to the relevant source material. It reduced the time required to locate important information and made dense procedural content more accessible to workers operating under real-world time constraints.

What I learned

Building this system reinforced that effective RAG applications are information retrieval and organization problems as much as they are AI problems. The quality of chunking, retrieval logic, and context assembly had a larger impact on usability than simply choosing a larger model.