Object Storage with MinIO
Original files are stored in object storage before indexing. The local implementation is RAG.Core/Services/S3ObjectStorage.cs.
Object keys are prefixed with the document ID:
var objectKey = $"{documentId:N}/{fileName}";
await storage.UploadAsync(objectKey, stream, contentType, cancellationToken);
MinIO is used because it is S3-compatible and easy to run locally in Docker. The same abstraction could later support:
- AWS S3;
- Azure Blob Storage;
- Google Cloud Storage;
- local filesystem storage.
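The abstraction that makes these backends interchangeable can be sketched as a small interface plus one alternative implementation. This is a minimal sketch, not the project's actual code: the interface name IObjectStorage, the read method OpenReadAsync, and the InMemoryObjectStorage class are assumptions made for illustration. Only UploadAsync (with the argument order used above) and DeleteAsync are named in this document.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical storage contract; the real interface in RAG.Core may differ.
public interface IObjectStorage
{
    Task UploadAsync(string objectKey, Stream content, string contentType, CancellationToken cancellationToken);

    // Assumed name for the read operation.
    Task<Stream> OpenReadAsync(string objectKey, CancellationToken cancellationToken);

    Task DeleteAsync(string objectKey, CancellationToken cancellationToken);
}

// Illustrative in-memory backend (for example, for unit tests); not part of the original project.
public sealed class InMemoryObjectStorage : IObjectStorage
{
    private readonly ConcurrentDictionary<string, (byte[] Data, string ContentType)> _objects = new();

    public async Task UploadAsync(string objectKey, Stream content, string contentType, CancellationToken cancellationToken)
    {
        // Copy the incoming stream so the caller can dispose it after the upload.
        using var buffer = new MemoryStream();
        await content.CopyToAsync(buffer, cancellationToken);
        _objects[objectKey] = (buffer.ToArray(), contentType);
    }

    public Task<Stream> OpenReadAsync(string objectKey, CancellationToken cancellationToken)
    {
        if (!_objects.TryGetValue(objectKey, out var entry))
            throw new FileNotFoundException($"No object stored under key '{objectKey}'.");
        return Task.FromResult<Stream>(new MemoryStream(entry.Data, writable: false));
    }

    public Task DeleteAsync(string objectKey, CancellationToken cancellationToken)
    {
        // Deleting a missing key is treated as a no-op, mirroring S3 delete semantics.
        _objects.TryRemove(objectKey, out _);
        return Task.CompletedTask;
    }
}
```

Callers that depend only on the interface do not need to change when the backend is swapped, which is what would let an S3, Azure Blob, or filesystem implementation replace MinIO later.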
The object key uses the document ID:
{documentId}/{originalFileName}
Because every document gets its own ID prefix, two uploads with the same filename cannot collide, and a stored object can be traced back to its metadata row from the key alone.
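The key scheme can be sketched as a pair of helpers: one that builds the key and one that recovers the document ID from it. The ObjectKeys class and both method names are hypothetical. The sketch follows the `{documentId:N}` format from the upload snippet (32 hex digits, no hyphens) and assumes the document ID is a Guid. Stripping directory segments from the filename is an added precaution, not something this document states the project does.

```csharp
using System;
using System.IO;

// Hypothetical helper; names are illustrative only.
public static class ObjectKeys
{
    // Builds "{documentId}/{fileName}" using the hyphen-free "N" Guid format.
    public static string Build(Guid documentId, string originalFileName)
    {
        // Keep only the last path segment so a client-supplied name cannot add extra key levels.
        var fileName = Path.GetFileName(originalFileName.Replace('\\', '/'));
        if (string.IsNullOrWhiteSpace(fileName))
            throw new ArgumentException("File name is empty.", nameof(originalFileName));
        return $"{documentId:N}/{fileName}";
    }

    // Recovers the document ID from the key prefix; returns false if the prefix is not an "N"-format Guid.
    public static bool TryGetDocumentId(string objectKey, out Guid documentId)
    {
        documentId = Guid.Empty;
        var slash = objectKey.IndexOf('/');
        return slash > 0 && Guid.TryParseExact(objectKey.Substring(0, slash), "N", out documentId);
    }
}
```

With helpers like these, the same function that writes a key is the one place that defines its layout, so the metadata lookup never has to guess how a key was formed.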
Keeping original files matters because vector chunks are derived data. If chunking, extraction, embedding, or analysis changes later, the worker can reprocess the original document.
The storage abstraction now includes DeleteAsync as well as upload and read operations. That lets document deletion clean up the original object together with metadata and vector points, which is the minimum lifecycle support a real document RAG system needs.
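A deletion flow along these lines might look like the sketch below. Everything here is an assumption for illustration: the DocumentDeleter class is hypothetical, and the three delegates stand in for the project's real object storage, vector store, and metadata repository, whose actual APIs are not shown in this document. The ordering (metadata last) is one reasonable design choice, not necessarily the one the project uses.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical coordinator for removing a document and everything derived from it.
public sealed class DocumentDeleter
{
    private readonly Func<string, CancellationToken, Task> _deleteObject;
    private readonly Func<Guid, CancellationToken, Task> _deleteVectorPoints;
    private readonly Func<Guid, CancellationToken, Task> _deleteMetadata;

    public DocumentDeleter(
        Func<string, CancellationToken, Task> deleteObject,
        Func<Guid, CancellationToken, Task> deleteVectorPoints,
        Func<Guid, CancellationToken, Task> deleteMetadata)
    {
        _deleteObject = deleteObject;
        _deleteVectorPoints = deleteVectorPoints;
        _deleteMetadata = deleteMetadata;
    }

    public async Task DeleteAsync(Guid documentId, string fileName, CancellationToken cancellationToken)
    {
        // Same key layout as the upload path.
        var objectKey = $"{documentId:N}/{fileName}";

        // Remove the vector points and the original object first, and the metadata row last,
        // so a failure partway through leaves a record that can be found and retried.
        await _deleteVectorPoints(documentId, cancellationToken);
        await _deleteObject(objectKey, cancellationToken);
        await _deleteMetadata(documentId, cancellationToken);
    }
}
```

Deleting the metadata row last means an interrupted deletion never leaves an orphaned object or orphaned vector points with no row pointing at them.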