We are currently hitting the 20-year milestone for DITA. For many of us who have lived through the migrations and the governance battles, the current obsession with AI feels like both a threat and a massive opportunity. But the reality I’m seeing on the ground is that most AI initiatives in technical writing will fail because the underlying content architecture isn’t ready for Retrieval-Augmented Generation (RAG).
The Metadata Prerequisite for AI
RAG is how we connect LLMs to our specific “single source of truth.” However, an AI agent is only as good as the metadata and taxonomy you feed it. If your process orchestration doesn’t enforce consistent tagging at the point of authoring, you are effectively feeding your AI a giant bucket of unorganized text.
At SiteFusion ProConsult, we see this as a practitioner’s challenge. We use Fonto to make the authoring experience intuitive, but the heavy lifting is done in the background by MarkLogic and Camunda. Our goal is to ensure that metadata isn’t a chore that writers skip, but a required byproduct of the publishing workflow.
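To make "required byproduct" concrete, here is a minimal sketch of the kind of check a workflow step could run before a topic moves on to publishing. It is an illustration only, not a description of any Fonto, MarkLogic, or Camunda configuration. It assumes metadata lives in `<othermeta>` elements in the DITA prolog, and the required field names (`product`, `content-type`) and their allowed values are hypothetical stand-ins for a real taxonomy.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary; a real taxonomy would live in a shared store.
REQUIRED_META = {
    "product": {"widget-pro", "widget-lite"},
    "content-type": {"task", "concept", "reference"},
}


def read_othermeta(topic_xml: str) -> dict:
    """Collect <othermeta name=... content=...> pairs from a DITA topic prolog."""
    root = ET.fromstring(topic_xml)
    return {
        el.get("name"): el.get("content")
        for el in root.iterfind("./prolog/metadata/othermeta")
    }


def validate_topic(topic_xml: str) -> list:
    """Return a list of problems; an empty list means the topic may proceed."""
    meta = read_othermeta(topic_xml)
    problems = []
    for name, allowed in REQUIRED_META.items():
        value = meta.get(name)
        if value is None:
            problems.append(f"missing required metadata: {name}")
        elif value not in allowed:
            problems.append(f"{name}={value!r} is not in the taxonomy")
    return problems


# Sample topics (invented for this sketch).
TAGGED = """<topic id="install">
  <title>Install the widget</title>
  <prolog><metadata>
    <othermeta name="product" content="widget-pro"/>
    <othermeta name="content-type" content="task"/>
  </metadata></prolog>
  <body><p>Run the installer.</p></body>
</topic>"""

UNTAGGED = """<topic id="notes">
  <title>Notes</title>
  <prolog><metadata>
    <othermeta name="product" content="Widget Pro"/>
  </metadata></prolog>
  <body><p>Misc.</p></body>
</topic>"""

if __name__ == "__main__":
    print(validate_topic(TAGGED))    # no problems reported
    print(validate_topic(UNTAGGED))  # free-text product value, missing content-type
```

The second sample shows the two failure modes that tend to hurt retrieval: a free-text value ("Widget Pro") that does not match the controlled term, and a required field left out entirely. Rejecting the topic at this step, rather than cleaning it up after indexing, is the design choice the sketch is meant to illustrate.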
If your taxonomy project has stalled or your metadata is inconsistent, your AI isn’t going to fix it for you. It’s going to hallucinate based on the gaps you left behind. I’ll be at the Pittsburgh Marriott City Center next week for ConVEx. Let’s talk about how to get your DITA sources RAG-ready by fixing the workflow before you turn on the AI.
