SHA-256 hashes every request. Identical calls are served instantly from cache with zero API calls.
🧠
Semantic match
Embed prompts and find similar cached responses using cosine similarity or HNSW. "What is 2+2?" and "What does 2 plus 2 equal?" share the same cache entry.
🗄️
Five storage backends
Memory, file, Redis, SQLite, DynamoDB — pick what fits your infrastructure. All are optional peer dependencies.
🔌
Framework integrations
Drop-in middleware for Express and Hono. NestJS module with dependency injection. Works out of the box.
📦
Transparent proxy
Wrap your existing client with one line of code. No interface changes, full TypeScript types preserved.
🌊
Streaming support
Accumulates stream chunks, stores them, and replays as AsyncGenerator. Your streaming code doesn't change.