DRHP · IPO Prospectus · Annual Report Intelligence

Capital Event Intelligence

CapScribe parses dense regulatory filings and extracts structured, source-backed capital events — allotments, bonus issues, rights issues, authorised capital changes — into clean JSON. Built for analysts, quant researchers and fintech pipelines that need reliable signal from unstructured documents.

Not a chatbot on PDFs. CapScribe reads real filings — tables, footnotes, contradictions — and turns them into structured, verifiable, source-backed capital history that holds up when the work is real.

00 Live Index

02 Ask the Filing

Investor-grade answers, grounded in cited events.

Extractive mode is free and instant. LLM ✦ mode synthesises prose via Claude Haiku and uses API credits. In both modes, every answer cites its source events.

03 Pipeline

From raw PDF to machine-readable capital history.

Extraction
PDF parser produces structured JSON events validated against schema.py.
Retrieval
ChromaDB vector store · all-MiniLM-L6-v2 embeddings for semantic recall.
Agent
LangGraph ReAct loop over search_events / get_event_detail tools.
API
FastAPI service exposing /ingest · /search · /ask · /verify · /report.
Evaluation
Gold-set harness scoring precision / recall / F1 on every extraction run.
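
To make the Retrieval and Agent stages concrete, here is a minimal sketch of the two tools the agent loop is described as calling, search_events and get_event_detail. Everything inside it is an assumption for illustration: the function signatures, the event IDs, the sample texts and page numbers are made up, and a plain bag-of-words cosine similarity stands in for the ChromaDB store and all-MiniLM-L6-v2 embeddings, so it matches on shared words only and is not semantic.

```python
import math
from collections import Counter

# Hypothetical in-memory index; the real service is described as using ChromaDB.
EVENTS = {
    "evt-001": {"type": "allotment",
                "text": "allotment of equity shares for cash to promoters",
                "page": 72},
    "evt-002": {"type": "bonus_issue",
                "text": "bonus issue of equity shares in the ratio 1:2",
                "page": 74},
    "evt-003": {"type": "authorised_capital_change",
                "text": "increase in authorised share capital by ordinary resolution",
                "page": 70},
}


def embed(text):
    """Stand-in embedding: token counts instead of a sentence-transformer vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search_events(query, k=2):
    """Return up to k event IDs ranked by similarity, dropping zero-score hits."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(event["text"])), event_id)
              for event_id, event in EVENTS.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [event_id for score, event_id in scored[:k] if score > 0]


def get_event_detail(event_id):
    """Return the full stored record so an answer can cite its page."""
    return EVENTS[event_id]


top_hits = search_events("bonus issue ratio")
detail = get_event_detail(top_hits[0])
```

An agent loop would call search_events first to shortlist candidates, then get_event_detail on each hit to pull the fields and page reference it cites in the answer.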

04 What It Extracts

Every event, every field, schema-validated.

Allotments
Date · shares · face value · issue price · consideration · allottee category
Bonus Issues
Date · ratio · pre/post share count
Rights Issues
Date · ratio · price · record date
Authorised Capital Changes
Date · from/to amount · resolution type
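
As an illustration of what schema validation over these fields could look like, the sketch below models two of the four event types as Python dataclasses. It is not the contents of schema.py, which is not shown here: the class names, JSON keys, the source_page field and the convention that a 1:2 bonus means one new share per two held are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from fractions import Fraction


@dataclass(frozen=True)
class Allotment:
    event_date: date
    shares: int
    face_value: float
    issue_price: float
    consideration: str        # e.g. "cash" or "other than cash"
    allottee_category: str
    source_page: int          # assumed provenance field

    def __post_init__(self):
        if self.shares <= 0:
            raise ValueError("shares must be a positive integer")
        if self.face_value <= 0 or self.issue_price < 0:
            raise ValueError("face value must be positive, issue price non-negative")


@dataclass(frozen=True)
class BonusIssue:
    event_date: date
    ratio: Fraction           # assumed: bonus shares per existing share
    pre_shares: int
    post_shares: int
    source_page: int          # assumed provenance field

    def __post_init__(self):
        if self.ratio <= 0:
            raise ValueError("ratio must be positive")
        if self.post_shares <= self.pre_shares:
            raise ValueError("a bonus issue must increase the share count")


def parse_allotment(raw: dict) -> Allotment:
    """Build a validated Allotment from one JSON object (hypothetical keys)."""
    return Allotment(
        event_date=date.fromisoformat(raw["date"]),
        shares=int(raw["shares"]),
        face_value=float(raw["face_value"]),
        issue_price=float(raw["issue_price"]),
        consideration=raw["consideration"],
        allottee_category=raw["allottee_category"],
        source_page=int(raw["source_page"]),
    )


# Made-up example record, not taken from any filing.
sample = parse_allotment({
    "date": "2021-03-15", "shares": 10000, "face_value": 10,
    "issue_price": 250, "consideration": "cash",
    "allottee_category": "promoter", "source_page": 72,
})
```

Rejecting a record at construction time keeps malformed events out of the index, so later stages can rely on every field being present and typed.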

05 Evaluation

Not a demo. Measured against a gold set.

Precision: 0.000
Recall: 0.000
F1: 0.000

Measured on the real Ola Electric DRHP (primary allotment tables, hand-verified gold) by running python evaluate.py fixtures/ola_drhp_extracted.json fixtures/ola_drhp_gold.json. The run produced 0 false positives, with secondary transfers correctly excluded. The single miss is an allotment whose share count appears in prose rather than in a table column.
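
For readers unfamiliar with the three metrics, the sketch below shows how precision, recall and F1 can be computed by comparing extracted events against a gold set. It is an assumed, simplified scorer, not the code in evaluate.py: events are reduced to hashable tuples and matched exactly, and the toy data is invented to mirror the "no false positives, one miss" pattern rather than to reproduce the Ola Electric figures.

```python
def score(extracted, gold):
    """Return (precision, recall, f1) using exact-match set comparison."""
    extracted_set, gold_set = set(extracted), set(gold)
    true_pos = len(extracted_set & gold_set)
    false_pos = len(extracted_set - gold_set)   # extracted but not in gold
    false_neg = len(gold_set - extracted_set)   # in gold but missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy events as (type, ISO date, share count); all values are made up.
toy_gold = [
    ("allotment", "2019-01-10", 1000),
    ("allotment", "2020-05-02", 2500),
    ("allotment", "2021-03-15", 10000),
    ("bonus_issue", "2022-01-01", 6750),
]
toy_extracted = toy_gold[:3]  # one gold event missed, nothing spurious

precision, recall, f1 = score(toy_extracted, toy_gold)
```

With no spurious events, precision is 1.0 regardless of how many gold events are missed; the miss shows up only in recall and therefore in F1.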

06 Ingest

Upload a filing. Extract events in seconds.

Supports DRHP, IPO prospectus, annual reports · PDF only · OCR runs automatically on scanned documents

Drop a PDF here or click to browse
Max file size: 50 MB

07 Verify

Contradiction detection. Arithmetic & timeline checks.

Flags mismatched share counts, overlapping date ranges, and arithmetic errors across indexed events. Free — no API call.
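
Two of these checks are simple enough to sketch without any model call. The code below is a hedged illustration, not the /verify implementation: the function names and input shapes are assumptions, the bonus ratio is taken to mean bonus shares per existing share, and the timeline check is simplified to flagging events whose date precedes the previous one.

```python
from datetime import date
from fractions import Fraction


def check_bonus_arithmetic(pre_shares, post_shares, ratio):
    """Return None if post = pre + pre * ratio, otherwise a message."""
    expected = pre_shares + pre_shares * ratio
    if expected != post_shares:
        return f"bonus arithmetic mismatch: expected {expected}, found {post_shares}"
    return None


def check_running_total(events):
    """events: (event_date, shares_added, reported_cumulative) in filing order.

    Returns a list of (row_index, message) flags.
    """
    flags = []
    running = 0
    previous_date = None
    for index, (event_date, added, reported) in enumerate(events):
        running += added
        if running != reported:
            flags.append((index, f"cumulative shares: computed {running}, "
                                 f"reported {reported}"))
            running = reported  # resync so one error is not repeated on later rows
        if previous_date is not None and event_date < previous_date:
            flags.append((index, f"date {event_date} precedes "
                                 f"previous event {previous_date}"))
        previous_date = event_date
    return flags


# Made-up capital history with one bad row (index 2).
history = [
    (date(2020, 1, 1), 1000, 1000),
    (date(2021, 1, 1), 500, 1500),
    (date(2020, 6, 1), 200, 1800),   # wrong total and out of order
]
flags = check_running_total(history)
bonus_ok = check_bonus_arithmetic(1000, 1500, Fraction(1, 2))
bonus_bad = check_bonus_arithmetic(1000, 1600, Fraction(1, 2))
```

Using Fraction for the ratio keeps the arithmetic exact, so a 1:3 bonus is not flagged because of floating-point rounding.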

08 Report

Capital history brief. Source-backed. Audit-ready.

Generates a structured brief citing the exact page and event for every claim. Free — uses indexed events only.