
cuesheet.
cuesheet is a Python library for testing LLM-calling code without burning API credits. You wrap your test in `@cuesheet.cassette(...)`. The first run hits the real provider and saves the request and response to a YAML file you commit to your repo. Every run after that replays from the file, byte for byte. No network calls, no rate-limit flakes, no per-PR token cost.

It works with any SDK that sits on top of httpx, so the same library covers Anthropic, OpenAI, Gemini, Mistral, Cohere, Groq, DeepSeek, Together, and anything else built on the standard httpx transport. The pytest plugin auto-discovers cassettes in `tests/cassettes/`. Streaming responses are recorded as raw SSE chunks and replayed in order. API keys, JWTs, and emails are scrubbed before anything is written to disk. Cassettes are diff-friendly YAML, so code review actually shows what changed.

The library ships with a local web UI for browsing every recorded cassette. The dashboard watches the filesystem and pushes change events over SSE, so the page updates live as your tests run in another terminal.

Open source, MIT, on GitHub.
pip install cuesheet
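
To make the record-then-replay idea concrete, here is a minimal sketch of the pattern in plain Python. It is not cuesheet's implementation or API: the names `cassette`, `scrub`, and `fake_provider` are hypothetical, it wraps a single function instead of intercepting the httpx transport, and it writes JSON rather than YAML so it runs on the standard library alone.

```python
import functools
import json
import re
import tempfile
from pathlib import Path

# Hypothetical scrub rules for illustration; cuesheet's real patterns may differ.
SECRET_PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9_-]{8,}"), "<API_KEY>"),
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "<EMAIL>"),
]


def scrub(text: str) -> str:
    """Replace anything that looks like a secret with a placeholder."""
    for pattern, placeholder in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def cassette(path):
    """Hypothetical decorator: record on the first call, replay afterwards."""
    path = Path(path)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(prompt: str) -> str:
            tape = json.loads(path.read_text()) if path.exists() else {}
            key = scrub(prompt)
            if key not in tape:
                # Record: call the real function once, scrub, then persist.
                tape[key] = scrub(func(prompt))
                path.write_text(json.dumps(tape, indent=2, sort_keys=True))
            # Always return the stored value so record and replay runs match.
            return tape[key]

        return wrapper

    return decorator


# Stand-in for a real provider call; counts how often it is actually hit.
calls = []


def fake_provider(prompt: str) -> str:
    calls.append(prompt)
    return f"echo: {prompt} (billed to ops@example.com)"


tape_path = Path(tempfile.mkdtemp()) / "demo.json"


@cassette(tape_path)
def ask(prompt: str) -> str:
    return fake_provider(prompt)


first = ask("hello")   # recorded: fake_provider is called
second = ask("hello")  # replayed: served from the file on disk
```

In this sketch the second call never reaches `fake_provider`, and the email address is replaced before the file is written, which is the same shape of guarantee described above, just at function level.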

Want something like this?
I'm available for custom backend, web, and the weird integration work nobody else wants to touch.
