Prompting and Citations
How chat providers receive evidence and return answers with inspectable citation records.
The API returns an answer plus citation records:

    public sealed record AskResponse(
        string Answer,
        IReadOnlyList<CitationDto> Citations,
        RetrievalDiagnostics? Diagnostics = null);

    public sealed record CitationDto(
        Guid DocumentId,
        string FileName,
        int ChunkIndex,
        int? PageNumber,
        double Score,
        string Snippet,
        string ChunkType = "source",
        string? Title = null,
        bool IsGeneratedArtifact = false,
        string? ArtifactKind = null);

The chat prompt shape is shared by the Gemini and Ollama providers. GeminiChatCompletionProvider and OllamaChatCompletionProvider both send a system message plus a user message that contains numbered evidence excerpts.
The system prompt tells the model to:
- answer directly;
- use only provided excerpts;
- prefer literary artifacts for broad literary questions;
- avoid dumping long passages;
- distinguish evidence from interpretation;
- cite excerpts inline with bracket numbers;
- say what is missing when evidence is insufficient.
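The rules above could be carried in a system message roughly like the sketch below. This is an illustration only: the ChatMessage record, the ChatPromptSketch class, and the prompt wording are assumptions that paraphrase the list, not the app's actual prompt text or provider request types. Each provider would still have to translate the two messages into its own request format.

```csharp
using System.Collections.Generic;

// Hypothetical message type; the real provider request types are not shown here.
public sealed record ChatMessage(string Role, string Content);

public static class ChatPromptSketch
{
    // Paraphrases the listed rules; NOT the app's actual system prompt wording.
    public const string SystemPrompt =
        "Answer the question directly.\n" +
        "Use only the provided excerpts.\n" +
        "Prefer literary artifacts for broad literary questions.\n" +
        "Do not dump long passages.\n" +
        "Distinguish evidence from interpretation.\n" +
        "Cite excerpts inline with bracket numbers such as [1].\n" +
        "If the evidence is insufficient, say what is missing.";

    // Pairs the fixed system message with a user message that already
    // contains the numbered evidence excerpts.
    public static IReadOnlyList<ChatMessage> BuildMessages(string userPrompt) =>
        new[]
        {
            new ChatMessage("system", SystemPrompt),
            new ChatMessage("user", userPrompt),
        };
}
```

Keeping the rules in one constant means both providers would receive the same instructions, which matches the shared prompt shape described earlier.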
The user prompt lists numbered excerpts:

    [1] File name, page, chunk
    chunk text...
    [2] File name, page, chunk
    chunk text...

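A minimal sketch of how such a user prompt might be assembled follows. The Excerpt record, the UserPromptSketch class, the "Question:" and "Excerpts:" headers, and the "page n/a" fallback are all assumptions made for illustration; the document only specifies that each excerpt has a bracket number, a file name, a page, a chunk, and its text.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical excerpt shape; the app's real retrieved-chunk type is not shown here.
public sealed record Excerpt(string FileName, int? PageNumber, int ChunkIndex, string Text);

public static class UserPromptSketch
{
    public static string Build(string question, IReadOnlyList<Excerpt> excerpts)
    {
        var sb = new StringBuilder();
        sb.Append("Question: ").Append(question).Append("\n\n");
        sb.Append("Excerpts:\n");

        for (var i = 0; i < excerpts.Count; i++)
        {
            var e = excerpts[i];
            // Assumed fallback text for chunks that carry no page number.
            var page = e.PageNumber is int p ? $"page {p}" : "page n/a";

            // Bracket numbers are 1-based so they line up with the inline
            // citations the model is asked to emit.
            sb.Append($"[{i + 1}] {e.FileName}, {page}, chunk {e.ChunkIndex}\n");
            sb.Append(e.Text.Trim()).Append("\n\n");
        }

        return sb.ToString().TrimEnd();
    }
}
```

Numbering the excerpts in list order gives a later step a simple way to relate a bracket number in the answer back to a retrieved chunk.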
The API returns CitationDto records with:
- document ID;
- file name;
- chunk index;
- page number;
- score;
- snippet;
- chunk type;
- title;
- whether the citation is a generated artifact;
- artifact kind when applicable.
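To make the fields concrete, the sketch below builds one citation for source text and one for a generated artifact. The CitationDto record is repeated from the API contract so the example compiles on its own; the file name, scores, snippets, and the "generated" and "book_club_profile" string values are illustrative assumptions, since the document does not list the actual ChunkType or ArtifactKind values.

```csharp
#nullable enable
using System;

// Repeated from the API contract so this example is self-contained.
public sealed record CitationDto(
    Guid DocumentId,
    string FileName,
    int ChunkIndex,
    int? PageNumber,
    double Score,
    string Snippet,
    string ChunkType = "source",
    string? Title = null,
    bool IsGeneratedArtifact = false,
    string? ArtifactKind = null);

public static class CitationExamples
{
    // A source-text citation relies on the record defaults:
    // ChunkType "source", no title, not a generated artifact.
    public static CitationDto SourceCitation(Guid documentId) =>
        new(documentId, "moby-dick.pdf", ChunkIndex: 3, PageNumber: 12,
            Score: 0.82, Snippet: "Call me Ishmael.");

    // A generated-artifact citation sets the optional fields explicitly.
    public static CitationDto GeneratedCitation(Guid documentId) =>
        new(documentId, "moby-dick.pdf", ChunkIndex: 0, PageNumber: null,
            Score: 0.77, Snippet: "Themes: obsession and fate.",
            ChunkType: "generated",              // hypothetical value
            Title: "Book-club profile",          // illustrative title
            IsGeneratedArtifact: true,
            ArtifactKind: "book_club_profile");  // hypothetical value
}
```

Because the optional parameters default to the source-text case, a caller has to opt in explicitly before a record is marked as a generated artifact.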
The UI displays these records under the answer so the user can inspect where the response came from.
It labels each citation as Source text, Generated book-club profile, Generated name profile, or a generic generated retrieval aid.
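One way the label choice could be derived from a citation's IsGeneratedArtifact and ArtifactKind fields is sketched below. The CitationLabels class, the kind strings "book_club_profile" and "name_profile", and the exact wording of the generic label are assumptions; the UI's real mapping is not shown in this document.

```csharp
#nullable enable

public static class CitationLabels
{
    public static string LabelFor(bool isGeneratedArtifact, string? artifactKind)
    {
        // Anything not flagged as generated is treated as original source text.
        if (!isGeneratedArtifact)
        {
            return "Source text";
        }

        return artifactKind switch
        {
            "book_club_profile" => "Generated book-club profile", // hypothetical kind value
            "name_profile" => "Generated name profile",           // hypothetical kind value
            _ => "Generated retrieval aid",                       // assumed generic fallback, also covers null
        };
    }
}
```

Checking the boolean first means an unrecognized or missing kind can never cause a generated artifact to be shown as source text.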
Citations Are Not Proof
A citation in this app means, "this chunk was selected and sent as context." It does not prove the model used the chunk correctly, and it does not prove that a generated artifact is direct source evidence. That distinction is why citation DTOs expose ChunkType, Title, IsGeneratedArtifact, and ArtifactKind.
In a stricter production system, citation faithfulness would be evaluated separately: does each answer claim have support in cited source text, did the model cite the right excerpt, and did it avoid treating generated summaries as ground truth?
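A first, purely structural step toward such an evaluation might look like the sketch below. It assumes that bracket number n in the answer refers to the n-th citation in the response, which this document does not guarantee, and the CitationReport and CitationAudit names are hypothetical. It only finds bracket numbers that point at nothing and numbers that point at generated artifacts; it does not judge whether a cited excerpt actually supports the claim.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical report type for this sketch.
public sealed record CitationReport(
    IReadOnlyList<int> Cited,
    IReadOnlyList<int> OutOfRange,
    IReadOnlyList<int> GeneratedArtifacts);

public static class CitationAudit
{
    // Matches inline citations such as [1] or [12]; capped at four digits
    // so int.Parse cannot overflow.
    private static readonly Regex BracketNumber = new(@"\[(\d{1,4})\]");

    // isGeneratedArtifact[i] describes the citation with bracket number i + 1
    // (assumed ordering).
    public static CitationReport Inspect(string answer, IReadOnlyList<bool> isGeneratedArtifact)
    {
        var cited = BracketNumber.Matches(answer)
            .Select(m => int.Parse(m.Groups[1].Value))
            .Distinct()
            .OrderBy(n => n)
            .ToList();

        // Numbers with no matching citation record.
        var outOfRange = cited
            .Where(n => n < 1 || n > isGeneratedArtifact.Count)
            .ToList();

        // Valid numbers that point at generated artifacts rather than source text.
        var generated = cited
            .Where(n => n >= 1 && n <= isGeneratedArtifact.Count && isGeneratedArtifact[n - 1])
            .ToList();

        return new CitationReport(cited, outOfRange, generated);
    }
}
```

For the answer "Ahab hunts the whale [1]. The profile calls it obsession [2][5]." with two citations where only the second is generated, Cited is 1, 2, 5, OutOfRange is 5, and GeneratedArtifacts is 2. Checking that each claim is really supported by its cited source text would need a separate step, such as human review or a dedicated evaluator.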