Transform scattered documents, emails, and files into an intelligent knowledge system. AI-generated answers with verifiable citations, powered by your data, on your infrastructure.
Studies estimate knowledge workers spend roughly 20% of their time searching for information. Traditional tools can't connect the dots.
Searching "Q4 decisions" doesn't find the email that says "we agreed to prioritize latency"
AI chatbots hallucinate confidently. No way to verify where information came from
Sensitive documents leak into AI responses. HR data mixed with public content
Information trapped in emails, Slack, Drive, and wikis. No unified view
Hybrid vector + keyword search finds answers by meaning, not just matching words
Click any claim to see the original source. Frozen snapshots preserve provenance
Permissions checked before content reaches the AI. 13 red-team tests verify isolation
One system ingests all sources and builds an auto-generated wiki organized by topic
A complete system, not a library. Ingest, search, synthesize, and govern your organization's knowledge.
Vector embeddings + full-text search fused via Reciprocal Rank Fusion. Graph-enhanced retrieval connects entities across documents.
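The fusion step is simple enough to sketch in a few lines. This is a minimal illustration of Reciprocal Rank Fusion, not the actual implementation; the function name and the conventional `k=60` constant are assumptions:

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked result lists with Reciprocal Rank Fusion.

    Each ranking is an ordered list of document IDs; a document's
    fused score is the sum of 1 / (k + rank) over every list it
    appears in, so agreement between rankers is rewarded.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Vector search and keyword search each return their own ordering;
# documents that rank well in both float to the top.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_a", "doc_b", "doc_d"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

Because RRF only uses ranks, it needs no score normalization between the vector and keyword retrievers.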
Fine-grained ACLs, sensitivity labels, pre-ranking permission filtering. GDPR-ready with tombstone propagation for right-to-forget.
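Tombstone propagation can be pictured with a toy store: erasing a source document leaves a tombstone, and every derived artifact is purged rather than re-served. A hedged sketch with made-up names, not the real data model:

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeStore:
    """Toy store illustrating tombstone propagation for right-to-forget."""
    chunks: dict = field(default_factory=dict)    # chunk_id -> (doc_id, text)
    tombstones: set = field(default_factory=set)  # erased doc_ids

    def forget(self, doc_id: str) -> None:
        # Record the erasure, then push it through derived data
        self.tombstones.add(doc_id)
        self.propagate()

    def propagate(self) -> None:
        # Delete every derived chunk whose source is tombstoned
        dead = [cid for cid, (doc, _) in self.chunks.items()
                if doc in self.tombstones]
        for cid in dead:
            del self.chunks[cid]

    def search(self, term: str):
        # Belt and suspenders: tombstoned sources never surface
        return [cid for cid, (doc, text) in self.chunks.items()
                if term in text and doc not in self.tombstones]
```

The tombstone itself persists after the content is gone, so late-arriving copies of the same document can be suppressed on re-ingestion.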
Email (Gmail, Outlook), cloud drives, PDFs, web pages, Markdown. Thread-aware email processing with quote detection and delta extraction.
Every AI-generated answer links to source chunks with frozen snapshots. Click to verify. Provenance tracking from ingestion to answer.
AI synthesizes clean, topic-organized wiki pages from raw documents. Taxonomy with 5 categories, bidirectional linking, and staleness detection.
Automatic entity extraction (11 types), relationship mapping, and interactive graph visualization. Multi-hop reasoning across documents.
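Multi-hop reasoning over the graph amounts to finding a chain of relationships that links two entities mentioned in different documents. A minimal breadth-first sketch (the entities and edges below are invented examples):

```python
from collections import deque

def multi_hop(graph, start, target, max_hops=3):
    """Find a chain of relationships linking two entities.

    `graph` maps each entity to its related entities; BFS returns
    the shortest path within `max_hops`, or None.
    """
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) > max_hops:
            continue
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None

# Entities extracted from different documents, linked by relations
graph = {
    "Alice": ["Project Apollo"],
    "Project Apollo": ["Q4 Launch", "Alice"],
    "Q4 Launch": ["Acme Corp"],
}
path = multi_hop(graph, "Alice", "Acme Corp")
```

No single document connects Alice to Acme Corp; the chain only emerges once entities from all three documents land in one graph.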
A production-grade pipeline that processes, enriches, and indexes your documents for instant retrieval.
Connect your sources. Upload documents, link email accounts, paste URLs. Connectors handle authentication and incremental sync.
Documents are parsed with format-aware extractors, then split into semantic chunks that preserve context and structure.
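The chunking idea can be sketched for Markdown-like text: paragraphs are the atomic unit, and the most recent heading is carried into each chunk so retrieval keeps context. A simplified illustration, far leaner than a real format-aware extractor:

```python
def chunk_document(text: str, max_chars: int = 800):
    """Split text on paragraph boundaries into bounded chunks,
    prepending the current heading to preserve context."""
    chunks, current, heading = [], "", ""

    def flush():
        nonlocal current
        if current:
            chunks.append(f"{heading}\n{current}".strip())
            current = ""

    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):       # new section: close the old chunk
            flush()
            heading = block
            continue
        candidate = f"{current}\n\n{block}".strip() if current else block
        if current and len(candidate) > max_chars:
            flush()                     # chunk full: start a new one
            current = block
        else:
            current = candidate
    flush()
    return chunks
```

Splitting on paragraph boundaries rather than fixed character offsets avoids cutting a sentence in half, which would poison both the embedding and the citation snippet.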
LLMs extract entities, assign topic tags, generate summaries, and classify content. All enrichments run in parallel for speed.
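The parallelism is plain `asyncio.gather`: independent enrichments fire concurrently, so total latency is the slowest call rather than the sum. A sketch with stub enrichers standing in for real LLM calls:

```python
import asyncio

# Stub enrichers standing in for real LLM calls (illustrative only)
async def extract_entities(text):
    return [w for w in text.split() if w.istitle()]

async def tag_topics(text):
    return ["general"]

async def summarize(text):
    return text[:50]

async def enrich(chunk: str) -> dict:
    # All three enrichments run concurrently
    entities, topics, summary = await asyncio.gather(
        extract_entities(chunk), tag_topics(chunk), summarize(chunk)
    )
    return {"entities": entities, "topics": topics, "summary": summary}

result = asyncio.run(enrich("Acme signed the Q4 contract"))
```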
Vector embeddings stored in PostgreSQL with pgvector HNSW index. Full-text search via GIN index. No separate vector database needed.
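The single-database layout boils down to two indexes on one table. The DDL below is illustrative (table and column names are assumptions, not the actual schema), using standard pgvector and PostgreSQL full-text syntax:

```python
# Illustrative DDL for the single-database layout
CHUNKS_TABLE = """
CREATE TABLE chunks (
    id        bigserial PRIMARY KEY,
    doc_id    bigint NOT NULL,
    body      text   NOT NULL,
    embedding vector(512)          -- pgvector column, 512-dim
);
"""

# Approximate nearest-neighbor search over embeddings
HNSW_INDEX = """
CREATE INDEX chunks_embedding_idx ON chunks
    USING hnsw (embedding vector_cosine_ops);
"""

# Full-text search over the same rows -- no second database to sync
FTS_INDEX = """
CREATE INDEX chunks_fts_idx ON chunks
    USING gin (to_tsvector('english', body));
"""
```

Keeping both indexes on one table means a chunk and its embedding can never drift out of sync the way a separate vector store can.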
Hybrid search finds relevant chunks, ACL filtering enforces permissions, reranking prioritizes quality, and an LLM generates cited answers.
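The query path reads naturally as a four-stage function. A hedged sketch with injected stand-ins; every name here is illustrative, not the real API — the important property is that the ACL gate sits before anything reaches the LLM:

```python
def answer_query(query, user, *, search, allowed, rerank, generate):
    """Query path: retrieve -> permission-filter -> rerank -> synthesize."""
    candidates = search(query)                             # hybrid retrieval
    visible = [c for c in candidates if allowed(user, c)]  # ACL gate first
    top = rerank(query, visible)[:5]                       # quality ordering
    return generate(query, top), top                       # answer + sources

# Toy stand-ins for the real components
chunks = [
    {"text": "latency is the Q4 priority", "acl": {"alice", "bob"}},
    {"text": "confidential salary bands",  "acl": {"hr"}},
]
answer, sources = answer_query(
    "Q4 decisions", "alice",
    search=lambda q: chunks,
    allowed=lambda u, c: u in c["acl"],
    rerank=lambda q, cs: cs,
    generate=lambda q, cs: " / ".join(c["text"] for c in cs),
)
```

Filtering before reranking and generation means restricted content can neither influence the answer nor leak into it, regardless of how the LLM behaves.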
Proven technologies that your team already knows. Everything runs in Docker Compose on a single machine.
Async Python backend
Vectors + relations + graph
Cache + rate limiting
Durable orchestration
React 19 frontend
Full stack in 5 containers
Embeddings (512-dim)
LLM gateway + fallbacks
Not a library, not a vector database. A complete, self-hosted knowledge system.
| Capability | Hippocortex | RAG Libraries | Commercial RAG | Vector DBs |
|---|---|---|---|---|
| Self-hosted / data control | ✓ | ✓ | ✗ | ✓ |
| Production-ready application | ✓ | ✗ | ✓ | ✗ |
| ACL / governance | ✓ | ✗ | ✓ | ✗ |
| Knowledge graph | ✓ | ✗ | ✗ | ✗ |
| Citation provenance | ✓ | ✗ | Partial | ✗ |
| Auto wiki synthesis | ✓ | ✗ | ✗ | ✗ |
| No vendor lock-in | ✓ | ✓ | ✗ | Partial |
| Single-DB architecture | ✓ | ✗ | ✗ | ✗ |
Deploy on your infrastructure with Docker Compose. No vendor lock-in, no usage-based pricing, full data control.