v0.1 · text conversations · files & images coming soon

Your AI is forgetting you.
Here's the math behind why.

Every token you add makes your AI slower, more expensive, and more forgetful. ContextCrunch measures your token waste, shows you the math, and compresses your context so you get more from every conversation.

Free · No login · No data stored · Open source · Deployed on Google Cloud Run

95% of AI pilots show no measurable ROI
O(n²) attention cost — double tokens = 4× slower
$85k average monthly enterprise AI spend
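The quadratic claim above falls out of self-attention itself: every token attends to every other token, so compute grows with the square of context length. A minimal sketch of the scaling (the cost model is illustrative, not a real latency measurement):

```python
def attention_cost(n_tokens: int) -> int:
    # Self-attention compares every token against every other token,
    # so compute scales as n^2 in context length.
    return n_tokens ** 2

base = attention_cost(4_000)
doubled = attention_cost(8_000)

print(doubled / base)  # doubling the tokens quadruples the cost: 4.0
```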

Three steps. Zero server cost for steps 1 and 2.

Most of ContextCrunch runs instantly in your browser using pure math. The backend only activates when you request compression.

01 — instant, free

Paste or upload

Paste any conversation from Claude, ChatGPT, or Gemini. Or upload a PDF, PPTX, DOCX, XLSX, image, or code file. Token count appears immediately. No server call.

02 — instant, free

See your waste

Live breakdown by speaker, redundancy score, Shannon entropy, context-window fill, and O(n²) latency impact. All computed in real time, in the browser.
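The entropy score is standard Shannon entropy over the token distribution: repetitive conversations carry fewer bits per token and are therefore more compressible. A minimal sketch (`shannon_entropy` is an illustrative helper, not ContextCrunch's exact implementation):

```python
from collections import Counter
from math import log2

def shannon_entropy(tokens: list[str]) -> float:
    """Entropy in bits per token. Lower values signal repetitive,
    highly compressible text."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# A repetitive conversation scores lower than a varied one.
repetitive = ["the", "same", "thing", "the", "same", "thing"]
varied = ["six", "distinct", "tokens", "with", "no", "repeats"]
print(shannon_entropy(repetitive) < shannon_entropy(varied))  # True
```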

03 — on request

Get compressed output

Click compress. The Python backend runs TurboQuant on your embeddings, finds redundant chunks, and rewrites your conversation into a shorter version ready to paste back.
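Finding redundant chunks comes down to comparing chunk embeddings: if a later chunk's embedding is nearly identical to an earlier one's, it adds little new information. A simplified pure-Python sketch of the idea (cosine similarity with a fixed threshold — the real pipeline uses TurboQuant-quantized embeddings, which this does not reproduce):

```python
from math import sqrt

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def find_redundant(embeddings: list[list[float]],
                   threshold: float = 0.9) -> set[int]:
    """Flag chunks whose embedding nearly duplicates an earlier chunk."""
    redundant = set()
    for i in range(1, len(embeddings)):
        if max(cosine(embeddings[j], embeddings[i]) for j in range(i)) >= threshold:
            redundant.add(i)
    return redundant

# Toy example: chunk 2 nearly repeats chunk 0, so it gets flagged.
chunks = [[1.0, 0.0], [0.0, 1.0], [0.999, 0.045]]
print(find_redundant(chunks))  # {2}
```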

04 — optional

Learn the math

Every score links to an interactive explanation. Simple analogy, technical formula, or full academic derivation. Toggle between levels. See the Python code that implements it.

Everything else is reactive. This isn't.

LangSmith, Braintrust, Arize — they all monitor after deployment. ContextCrunch works before and during deployment, for anyone, free.

Not just a token counter

Tiktoken counts tokens. ContextCrunch tells you which ones are waste, why they're waste, and hands you a compressed version ready to paste.
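The difference in one sketch: a raw count is a single number, while a waste report points at specific redundant content. Here, verbatim repeated lines stand in for waste (an illustrative heuristic using word counts as a token proxy, not ContextCrunch's actual analysis):

```python
from collections import Counter

def waste_report(conversation: str) -> dict:
    """Beyond a raw count: flag lines that repeat verbatim, a common
    source of context waste in long chat transcripts."""
    lines = [ln.strip() for ln in conversation.splitlines() if ln.strip()]
    counts = Counter(lines)
    repeated = {ln: n for ln, n in counts.items() if n > 1}
    # Every copy after the first is pure waste (words as a token proxy).
    wasted = sum((n - 1) * len(ln.split()) for ln, n in repeated.items())
    return {"lines": len(lines), "repeated": repeated, "wasted_words": wasted}

convo = "User: hi\nAI: hello\nUser: hi\nAI: hello"
print(waste_report(convo)["wasted_words"])  # 4
```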

Not developer-only

A grad student on Claude free tier and a startup's CTO have the same problem. ContextCrunch explains it at both levels.

Real math, not vibes

Shannon entropy. Product quantization. TurboQuant — Google's ICLR 2026 paper. Mathematically grounded compression.
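Product quantization is what makes embedding comparison cheap: split each vector into subvectors and replace each subvector with the index of its nearest codebook centroid. A minimal sketch of the encoding step (tiny hand-picked codebooks for illustration; TurboQuant's actual quantization scheme is not reproduced here):

```python
def pq_encode(vec: list[float], codebooks: list[list[list[float]]]) -> list[int]:
    """Product quantization: split the vector into subvectors and
    encode each as the index of its nearest codebook centroid."""
    m = len(codebooks)       # number of subspaces
    d = len(vec) // m        # dimensions per subspace
    codes = []
    for i, book in enumerate(codebooks):
        sub = vec[i * d:(i + 1) * d]
        dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in book]
        codes.append(dists.index(min(dists)))
    return codes

# Two subspaces with two centroids each: a 4-dim float vector
# compresses to two small integer codes.
books = [[[0.0, 0.0], [1.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
print(pq_encode([0.9, 1.1, 0.1, 0.9], books))  # [1, 0]
```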

Claude stops. ChatGPT forgets. You shouldn't.

Neither model tells you in real time what's happening. ContextCrunch shows you exactly where you stand before it becomes a problem.

Start crunching for free

No account. No credit card. No install. Paste your conversation and see what it's really costing you.

Try the tool → View on GitHub