Privacy note: Documents are stored on the API host. When configured with hosted providers (e.g. OpenAI), document chunks and tribunal prompts are sent to third-party APIs. Use Ollama or local embeddings for fully private inference.

Architecture

System Overview

User → Next.js Frontend → FastAPI Backend
                              ├── Document Ingestion (extract, chunk, embed)
                              ├── ChromaDB Vector Store
                              ├── Hybrid Retrieval (vector + BM25)
                              └── Tribunal Pipeline
                                    ├── Witness (grounded answer)
                                    ├── Claim Extraction
                                    ├── Prosecutor (objections)
                                    ├── Judge (verdicts)
                                    └── Final Ruling

Retrieval Pipeline

Questions are embedded and matched against document chunks using semantic vector search. Hybrid mode combines vector similarity with BM25 keyword matching via reciprocal rank fusion for improved recall.
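
As a rough illustration of the fusion step, here is a minimal sketch of reciprocal rank fusion over two ranked lists of chunk IDs. The function name, the chunk IDs, and the constant k=60 (a commonly used default) are assumptions for illustration, not taken from this project's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs (each ordered best first).

    Each chunk scores 1 / (k + rank) in every list it appears in, and the
    scores are summed. k=60 is an assumed default, not a project setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from the two retrievers.
vector_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Chunks that rank well in both lists (c1, c3) rise to the top, while chunks found by only one retriever are kept but ranked lower. Keeping those single-retriever hits is where the recall gain comes from.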

Agent Workflow

  1. Witness generates an answer from retrieved evidence only.
  2. Claims are extracted from the Witness answer.
  3. Prosecutor challenges each claim against evidence.
  4. Judge assigns verdicts and confidence scores per claim.
  5. Final Ruling revises the answer to remove unsupported claims.
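
The five steps above can be sketched as a single orchestration function. This is an illustrative outline only: the names (run_tribunal, ClaimVerdict, Ruling), the verdict labels, and the toy stand-ins that replace LLM calls are all hypothetical and do not reflect the actual backend code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ClaimVerdict:
    claim: str
    objection: str
    verdict: str        # assumed labels: "supported" / "unsupported"
    confidence: float


@dataclass
class Ruling:
    draft_answer: str
    verdicts: List[ClaimVerdict]
    final_answer: str


def run_tribunal(
    question: str,
    evidence: List[str],
    witness: Callable[[str, List[str]], str],
    extract_claims: Callable[[str], List[str]],
    prosecutor: Callable[[str, List[str]], str],
    judge: Callable[[str, str, List[str]], Tuple[str, float]],
    final_ruling: Callable[[str, List[ClaimVerdict]], str],
) -> Ruling:
    draft = witness(question, evidence)              # 1. grounded answer
    claims = extract_claims(draft)                   # 2. claim extraction
    verdicts = []
    for claim in claims:
        objection = prosecutor(claim, evidence)      # 3. objection per claim
        verdict, confidence = judge(claim, objection, evidence)  # 4. verdict
        verdicts.append(ClaimVerdict(claim, objection, verdict, confidence))
    final = final_ruling(draft, verdicts)            # 5. revised answer
    return Ruling(draft, verdicts, final)


# Toy stand-ins so the sketch runs without any LLM (all hypothetical).
evidence = ["The lease started in 2021."]

def witness(question, ev):
    return "The lease started in 2021. The rent was $900."

def extract_claims(answer):
    return [s.strip() + "." for s in answer.split(".") if s.strip()]

def prosecutor(claim, ev):
    return "" if claim in ev else "Not found in evidence."

def judge(claim, objection, ev):
    return ("unsupported", 0.0) if objection else ("supported", 1.0)

def final_ruling(draft, verdicts):
    return " ".join(v.claim for v in verdicts if v.verdict == "supported")

result = run_tribunal("When did the lease start?", evidence, witness,
                      extract_claims, prosecutor, judge, final_ruling)
print(result.final_answer)  # The lease started in 2021.
```

Passing each role in as a callable keeps the control flow separate from the model provider, so the same outline would apply whether the roles are backed by OpenAI or Ollama.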

Data Privacy

Documents and chunk metadata are stored on the API host in ChromaDB and SQLite. By default, embeddings and tribunal LLM calls use OpenAI. For fully local inference, configure Ollama or local embeddings in the API environment; see docs/privacy-and-security.md.

Limitations

  • Verdict quality depends on LLM capability and prompt adherence.
  • Complex PDF layouts may lose structure during extraction.
  • Evaluation metrics are heuristic, not ground-truth legal review.
  • The MVP has no reranking model (stretch goal).