n8n RAG Pipelines Send Plain-Text Internal Docs to OpenAI, Exposing PII
Retrieval-Augmented Generation (RAG) is widely promoted as a secure way to connect corporate data to large language models, but a critical weakness exists in how n8n workflows handle retrieved content. Once document chunks are pulled from a vector database such as Pinecone or Qdrant, they are appended to the prompt and sent as readable, unredacted text to third-party APIs such as OpenAI or Anthropic. Sensitive data, including customer names, tax IDs, financial projections, and HR records, can therefore leave an organization's infrastructure with no masking applied. Compounding the risk, n8n stores full execution history by default, so the raw retrieved context is readable by anyone with access to the instance. A proposed mitigation is to tokenize sensitive context before it reaches the LLM node and to reverse that tokenization before the response is shown to the user.
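The tokenize-then-reverse idea can be illustrated with a minimal Python sketch. It is not taken from the original article: the function names `tokenize` and `detokenize`, the placeholder format, and the two regular expressions are assumptions chosen for illustration. Regex matching of this kind only catches well-structured values such as email addresses or US-style tax IDs; names and free-form HR details would need a dedicated detection step.

```python
import re

# Hypothetical patterns for illustration only; real detection needs far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "TAXID": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tokenize(text):
    """Replace sensitive values with placeholders; return the masked text and the mapping."""
    mapping = {}  # placeholder -> original value (must stay inside your infrastructure)
    reverse = {}  # original value -> placeholder, so repeated values reuse one token

    def make_sub(label):
        def sub(match):
            value = match.group(0)
            if value not in reverse:
                token = f"[{label}_{len(reverse) + 1}]"
                reverse[value] = token
                mapping[token] = value
            return reverse[value]
        return sub

    for label, pattern in PATTERNS.items():
        text = pattern.sub(make_sub(label), text)
    return text, mapping


def detokenize(text, mapping):
    """Restore original values in the model's response before showing it to the user."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


# Usage: mask a retrieved chunk before the LLM call, then unmask the answer.
chunk = "Contact jane.doe@example.com, tax ID 123-45-6789."
safe_chunk, mapping = tokenize(chunk)
restored = detokenize(safe_chunk, mapping)
```

In a workflow, the masking step would sit between the vector-store retrieval and the LLM node, and the unmasking step after the LLM node. Note that the mapping itself is sensitive, so storing it in execution history would reintroduce the exposure described above.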
This is an AI-generated summary. ShortSingh links to the original source for the complete article.