Chapter 13

Prompting and Citations

How chat providers receive evidence and return answers with inspectable citation records.

The API returns an answer plus citation records

public sealed record AskResponse(
    string Answer,
    IReadOnlyList<CitationDto> Citations,
    RetrievalDiagnostics? Diagnostics = null);

public sealed record CitationDto(
    Guid DocumentId,
    string FileName,
    int ChunkIndex,
    int? PageNumber,
    double Score,
    string Snippet,
    string ChunkType = "source",
    string? Title = null,
    bool IsGeneratedArtifact = false,
    string? ArtifactKind = null);

Full source: RAG.Core/Services/GeminiProviders.cs
Full source: RAG.Core/Services/OllamaProviders.cs
Full source: RAG.Core/Models/RagDtos.cs

The chat prompt shape is shared by the Gemini and Ollama providers. GeminiChatCompletionProvider and OllamaChatCompletionProvider both send a system message plus a user message that contains numbered evidence excerpts.

The system prompt tells the model to:

answer directly;
use only provided excerpts;
prefer literary artifacts for broad literary questions;
avoid dumping long passages;
distinguish evidence from interpretation;
cite excerpts inline with bracket numbers;
say what is missing when evidence is insufficient.
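The rules above could be carried in a single system message, roughly as sketched below. This is an illustration only: the ChatMessage record, the GroundedChat class, the role strings, and the exact prompt wording are all assumptions, not the text or types used by GeminiChatCompletionProvider or OllamaChatCompletionProvider.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical provider-neutral message type (not from RAG.Core).
public sealed record ChatMessage(string Role, string Content);

public static class GroundedChat
{
    // Illustrative wording for the seven rules listed in the chapter.
    private static readonly string[] Rules =
    {
        "Answer the question directly.",
        "Use only the provided excerpts.",
        "Prefer literary artifacts for broad literary questions.",
        "Do not dump long passages.",
        "Distinguish evidence from interpretation.",
        "Cite excerpts inline with bracket numbers such as [1].",
        "If the evidence is insufficient, say what is missing."
    };

    // One rule per line, each prefixed with "- ".
    public static string SystemPrompt =>
        "You answer questions using retrieved excerpts.\n- " + string.Join("\n- ", Rules);

    // Pairs the fixed system message with the per-request user message.
    public static IReadOnlyList<ChatMessage> BuildMessages(string userPrompt)
    {
        if (string.IsNullOrWhiteSpace(userPrompt))
            throw new ArgumentException("User prompt must not be empty.", nameof(userPrompt));

        return new[]
        {
            new ChatMessage("system", SystemPrompt),
            new ChatMessage("user", userPrompt)
        };
    }
}
```

Keeping the rules in one array makes it easy for two providers to share the same prompt shape, since each provider only has to translate the two messages into its own request format.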

The user prompt lists numbered excerpts:

[1] File name, page, chunk
chunk text...

[2] File name, page, chunk
chunk text...
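A minimal sketch of how a user prompt in this shape might be assembled is shown below. The EvidenceExcerpt record, the EvidencePromptBuilder class, the "Question:" and "Excerpts:" headings, and the "page n/a" fallback are assumptions made for illustration; the chapter only specifies the numbered header followed by the chunk text.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical input type: the retrieval result shape is not shown in the chapter.
public sealed record EvidenceExcerpt(string FileName, int? PageNumber, int ChunkIndex, string Text);

public static class EvidencePromptBuilder
{
    // Renders each excerpt as a "[n] file, page, chunk" header followed by its text.
    // Numbering starts at 1 so that inline markers like [1] match list positions.
    public static string BuildUserPrompt(string question, IReadOnlyList<EvidenceExcerpt> excerpts)
    {
        var sb = new StringBuilder();
        sb.Append("Question: ").Append(question).Append("\n\n");
        sb.Append("Excerpts:\n\n");

        for (var i = 0; i < excerpts.Count; i++)
        {
            var e = excerpts[i];
            // Page numbers are optional, mirroring CitationDto.PageNumber.
            var page = e.PageNumber is int p ? $"page {p}" : "page n/a";
            sb.Append($"[{i + 1}] {e.FileName}, {page}, chunk {e.ChunkIndex}\n");
            sb.Append(e.Text.Trim()).Append("\n\n");
        }

        return sb.ToString().TrimEnd();
    }
}
```

Because the bracket number is simply the excerpt's position plus one, the same ordering can be reused when building the citation list, so that citation n in the response corresponds to excerpt [n] in the prompt.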

The API returns CitationDto records with:

document ID;
file name;
chunk index;
page number;
score;
snippet;
chunk type;
title;
whether the citation is a generated artifact;
artifact kind when applicable.

The UI displays these under the answer so the user can inspect where the response came from.

The UI labels citations as Source text, Generated book-club profile, Generated name profile, or a generic generated retrieval aid.
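The labelling described above might be derived from IsGeneratedArtifact and ArtifactKind along the following lines. The CitationLabels class and the two kind strings are placeholders invented for this sketch, since the chapter does not list the actual ArtifactKind values, and the exact wording of the generic label is likewise a guess.

```csharp
using System;

public static class CitationLabels
{
    // ASSUMPTION: placeholder kind values, not the app's real ArtifactKind strings.
    public const string BookClubProfileKind = "book_club_profile";
    public const string NameProfileKind = "name_profile";

    // Maps the two CitationDto fields to a human-readable label.
    public static string Describe(bool isGeneratedArtifact, string? artifactKind)
    {
        // Anything not flagged as generated is treated as direct source text.
        if (!isGeneratedArtifact)
            return "Source text";

        if (string.Equals(artifactKind, BookClubProfileKind, StringComparison.OrdinalIgnoreCase))
            return "Generated book-club profile";

        if (string.Equals(artifactKind, NameProfileKind, StringComparison.OrdinalIgnoreCase))
            return "Generated name profile";

        // Unknown or missing kinds fall back to a generic generated label.
        return "Generated retrieval aid";
    }
}
```

Checking IsGeneratedArtifact first means a source chunk can never be mislabelled as generated, even if ArtifactKind happens to be populated.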

Citations Are Not Proof

A citation in this app means, "this chunk was selected and sent as context." It does not prove the model used the chunk correctly, and it does not prove that a generated artifact is direct source evidence. That distinction is why citation DTOs expose ChunkType, Title, IsGeneratedArtifact, and ArtifactKind.

In a stricter production system, citation faithfulness would be evaluated separately: does each answer claim have support in cited source text, did the model cite the right excerpt, and did it avoid treating generated summaries as ground truth?
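A first, purely structural step toward such an evaluation could be to compare the bracket markers in the answer with the excerpts that were sent. The sketch below is an assumption about how that might look, with hypothetical CitationAudit and CitationMarkerAudit types. It does not judge whether a claim is actually supported; it only reports which numbers were cited, which point at no excerpt, which point at generated artifacts, and which excerpts were never cited.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical result type for the marker audit.
public sealed record CitationAudit(
    IReadOnlyList<int> CitedNumbers,
    IReadOnlyList<int> OutOfRange,
    IReadOnlyList<int> GeneratedArtifactsCited,
    IReadOnlyList<int> NeverCited);

public static class CitationMarkerAudit
{
    // Matches inline markers such as [1] or [12]; capped at four digits.
    private static readonly Regex Marker = new(@"\[(\d{1,4})\]", RegexOptions.Compiled);

    // isGeneratedArtifact[i] describes excerpt number i + 1, in prompt order.
    public static CitationAudit Inspect(string answer, IReadOnlyList<bool> isGeneratedArtifact)
    {
        var count = isGeneratedArtifact.Count;

        // Distinct marker numbers used in the answer, in ascending order.
        var cited = Marker.Matches(answer)
            .Cast<Match>()
            .Select(m => int.Parse(m.Groups[1].Value))
            .Distinct()
            .OrderBy(n => n)
            .ToList();

        // Markers that refer to no excerpt that was sent.
        var outOfRange = cited.Where(n => n < 1 || n > count).ToList();

        // Valid markers that point at generated artifacts rather than source text.
        var generated = cited
            .Where(n => n >= 1 && n <= count && isGeneratedArtifact[n - 1])
            .ToList();

        // Excerpts that were sent as context but never referenced.
        var neverCited = Enumerable.Range(1, count).Except(cited).ToList();

        return new CitationAudit(cited, outOfRange, generated, neverCited);
    }
}
```

An out-of-range marker is a clear defect, while a cited generated artifact is only a prompt for closer review, since the claim may still be backed by source text elsewhere.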
