Tutorial, use cases and tips to get the most out of Claude Code.
Claude Code eating your tokens and missing your architecture? RTK and Repowise tackle both problems from complementary angles. A practical guide built on several months of production use.
Claude Code has become the default tool for many devs working in agent mode. But after a few weeks of intensive use, two problems always surface: the context fills up too fast (with noise that brings nothing), and the agent doesn't really understand the architecture of the project it's working on.
RTK and Repowise tackle these two problems from completely different angles. RTK compresses shell command output before it reaches the context. Repowise indexes your code into four intelligence layers (dependency graph, git history, auto-generated docs, architectural decisions) and exposes all of it via MCP.
This guide comes from real-world use of both tools on a mid-sized SaaS that's been in production for several months. No synthetic benchmarks, no copy-paste from READMEs. What works, what breaks, and how to make them work together.
Before talking about the tools, you need to understand where the tokens go. A typical Claude Code session burns 80,000 to 200,000 tokens in a few hours of serious work. The context window is capped at 200K, which sounds huge until you realize three things.
First, every shell command output enters the context in full. A cargo test or pnpm test that passes is 5,000 tokens of repeated "test passed" lines that the agent has zero need to read. An unfiltered git log averages 3,500 tokens based on RTK's measurements across 2,900+ real commands. Multiplied by the dozens of commands an agent runs autonomously, it adds up fast.
Second, every time Claude Code explores an unknown file, it reads the file in full — even if it only needs to understand one function. On a medium-sized project, understanding "how does authentication work" can require reading 30 to 50 files, easily 40,000 tokens just to answer an architecture question.
Third, the quota isn't unlimited. Anthropic's Max plans come with weekly caps (240 to 480 hours depending on the tier) on top of 5-hour session windows. When you hit the wall mid-refactor on Friday night, your evening is gone. The pressure on token consumption isn't theoretical, it's concrete.
That's the context where RTK and Repowise make sense. One compresses, the other structures. Both are free, open source, and installable in minutes.
RTK (Rust Token Killer) is a Rust-based CLI proxy that intercepts shell commands run by Claude Code, pipes them through a per-command filter, and returns a compressed version to the agent — typically 60 to 90% fewer tokens.
The binary weighs less than 5 MB, starts in under 10ms, and has zero runtime dependencies.
The clever bit about RTK is that it doesn't ask Claude Code to change behavior. It installs a hook (rtk-rewrite.sh) that intercepts Bash tool calls and rewrites the command on the fly. When Claude Code runs git status, what actually executes is rtk git status. Claude never sees the rewrite — it just receives already-filtered output.
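The rewrite logic is easy to picture. Here is a minimal Python sketch of the idea (allowlist-based prefixing), not RTK's actual implementation; the allowlist below is invented for illustration:

```python
# Illustrative sketch of hook-style command rewriting, NOT RTK's real code.
# Commands whose first word is in the allowlist get prefixed with "rtk";
# everything else passes through untouched.
REWRITABLE = {"git", "grep", "find", "cargo", "pnpm", "npm", "docker"}

def rewrite(command: str) -> str:
    first = command.split(maxsplit=1)[0] if command.strip() else ""
    if first in REWRITABLE:
        return f"rtk {command}"
    return command

print(rewrite("git status"))   # -> rtk git status
print(rewrite("echo hello"))   # -> echo hello
```

The agent-side transparency falls out of this design: the agent emits plain commands, and the rewrite happens one layer below it.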
The pipeline applied to each output is documented in the repo and breaks down into four steps:
```text
Raw output (5,000 tokens)
  ↓ Smart filtering  — strips ANSI codes, spinners, progress bars
  ↓ Grouping         — consolidates similar lines, collapses repeated patterns
  ↓ Deduplication    — one "test passed" instead of 200
  ↓ Truncation       — keeps errors and warnings, trims verbose successes
Filtered output (500-2,000 tokens)
```

The principle is elegant in its simplicity: Claude doesn't need to read every "test passed" line; it just needs to know how many tests passed and which ones failed. That information fits in two lines instead of two pages.
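The deduplication step can be sketched in a few lines. This is a toy Python illustration of the principle, not RTK's actual filter:

```python
# Toy illustration of the deduplication idea, NOT RTK's real filter:
# repeated "passed" lines collapse into a single counter, while
# failures are kept verbatim so the agent can act on them.
def summarize(lines: list[str]) -> list[str]:
    passed = [l for l in lines if "passed" in l.lower()]
    failed = [l for l in lines if "failed" in l.lower() or "error" in l.lower()]
    out = []
    if passed:
        out.append(f"{len(passed)} tests passed")
    out.extend(failed)  # failures keep their full line
    return out

raw = ["test a passed"] * 200 + ["test b FAILED: assertion x == y"]
print(summarize(raw))
# -> ['200 tests passed', 'test b FAILED: assertion x == y']
```

Two hundred lines in, two lines out, and nothing the agent actually needs is lost.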
The short path (Linux/macOS/WSL):
```bash
# Install via the official script
curl -sSL https://raw.githubusercontent.com/rtk-ai/rtk/master/install.sh | bash

# Verify
rtk --version   # should show rtk 0.28.2 or higher

# Hook for Claude Code (default)
rtk init -g

# Restart Claude Code, then test
git status      # automatically rewritten as rtk git status
```

On native Windows, the auto-rewrite hook doesn't work (it needs a Unix shell). RTK falls back to CLAUDE.md injection mode: the agent receives RTK instructions but has to call commands manually. For a complete experience on Windows, WSL is recommended — the Windows + Git Bash + WSL combo holds up best for long Claude Code sessions.
To verify everything is in place:
```bash
rtk init --show   # shows install state
rtk gain          # shows token savings stats
```

The rtk gain command is valuable because it gives you concrete numbers on your own project, not marketing averages.
From experience, here are the commands where the gain is most visible:
git log and git diff on active branches: up to 95% compression thanks to dedup of verbose commit messages and filtering of binary diffs
Test runners (pytest, cargo test, gradle test): everything that passes is summarized as a counter, only failures keep their full stack trace
Docker logs: redundant timestamps stripped, repeated messages grouped
pnpm install, npm install: the list of 800 packages becomes a counter + the actual warnings
Recursive find, grep: on a monorepo, the gain can exceed 90%
Conversely, RTK adds nothing to short or targeted commands (cat file.kt, head -20, wc -l). And that's normal: if the output is already 50 tokens, there's nothing to compress.
RTK only acts on Bash commands. Claude Code's built-in tools (Read, Grep, Glob) bypass the hook and reach the context unfiltered. To maximize savings, you should prefer explicit shell commands when it makes sense, and use the built-ins when usage warrants it (a Read on a specific file is still cleaner than a cat).
This limitation isn't a bug, it's a deliberate architecture decision. But it's worth knowing before expecting "magical" savings across all use cases.
Beyond the default install, here's the Claude Code settings.json excerpt that's been running alongside RTK for several weeks:

```json
{
  "model": "sonnet",
  "env": {
    "MAX_THINKING_TOKENS": "0",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "DISABLE_NON_ESSENTIAL_MODEL_CALLS": "1"
  }
}
```

With CLAUDE_AUTOCOMPACT_PCT_OVERRIDE set to 50, auto-compact triggers earlier, which combined with RTK lets me sustain 3-4 hour sessions without hitting the context wall. On a medium-sized project, auto-compact roughly shifts from "every 45 minutes" to "every 2 hours" — a net win for the consistency of refactor sessions.
Repowise is an open-source codebase intelligence engine that indexes your repo into four layers (dependency graph, git history, auto-generated docs, architectural decisions) and exposes them to Claude Code via MCP through 7 to 8 precisely designed tools.
The goal isn't compression, it's structuring. Instead of letting Claude Code read 40 source files to understand a feature, the agent calls get_overview() or get_context() and receives a structured summary pulled from the actual graph.
1. The dependency graph. Repowise parses your code (Python, TypeScript, and other languages depending on the version) and builds a NetworkX graph of file/symbol relationships, calls with confidence scores, and communities detected via clustering algorithms. This graph is what lets the agent answer "what will this file impact" without reading everything.
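The "what will this impact" query boils down to a reverse reachability walk on that graph. Here is a dependency-free Python sketch with made-up file names (Repowise's real graph also carries symbols, confidence scores, and communities on top of this):

```python
# Hypothetical module graph for illustration: edges[a] lists the
# files that a imports. File names are invented.
edges = {
    "api/routes.py": ["payments/processor.py"],
    "payments/processor.py": ["payments/eventbus.py"],
    "billing/invoices.py": ["payments/eventbus.py"],
}

def impacted_by(target: str) -> set[str]:
    """Everything that transitively imports `target`."""
    # Invert the edges, then walk upward from the target.
    importers: dict[str, list[str]] = {}
    for src, deps in edges.items():
        for d in deps:
            importers.setdefault(d, []).append(src)
    seen: set[str] = set()
    stack = [target]
    while stack:
        for parent in importers.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(impacted_by("payments/eventbus.py")))
# -> ['api/routes.py', 'billing/invoices.py', 'payments/processor.py']
```

The same traversal on a few hundred files costs the agent one tool call instead of dozens of file reads.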
2. The git history. Repowise mines commits to extract hotspots (files that change the most), ownership maps (who touches what), and most importantly hidden co-changes — file pairs that change together without an explicit dependency, signaling implicit risky coupling.
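Co-change mining itself is conceptually simple: count file pairs that recur across commits. A toy Python sketch with invented commit data (the threshold of 3 is arbitrary, not Repowise's default):

```python
from collections import Counter
from itertools import combinations

# Toy co-change mining: each commit is the set of files it touched.
# File pairs that recur across commits signal implicit coupling.
# Commit data below is invented for illustration.
commits = [
    {"payments/processor.py", "payments/schema.sql"},
    {"payments/processor.py", "payments/schema.sql", "api/routes.py"},
    {"payments/processor.py", "payments/schema.sql"},
    {"api/routes.py", "docs/readme.md"},
]

pairs: Counter = Counter()
for files in commits:
    for pair in combinations(sorted(files), 2):
        pairs[pair] += 1

# Pairs seen in 3+ commits are co-change candidates.
hidden = [p for p, n in pairs.items() if n >= 3]
print(hidden)
# -> [('payments/processor.py', 'payments/schema.sql')]
```

Here the processor and its SQL schema move together with no import between them — exactly the kind of implicit coupling the agent can't see by reading code.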
3. Auto-generated documentation. Based on the graph and git, Repowise generates a complete wiki: architecture overview, module map, entry points, tech stack. This wiki is then indexed as embeddings (LanceDB or pgvector) for semantic search.
4. Architectural decisions. This is the most interesting layer to me. You can record decisions ("we picked JWT over sessions because..."), link them to nodes in the graph, and Repowise tracks their freshness as commits roll in. When 8 of 14 files governed by a decision have been modified, the decision is flagged as "potentially stale" and surfaced to Claude Code via get_why().
This last layer solves a very concrete problem: when a senior dev leaves, the "why" of the code leaves with them. Keeping these decisions in the codebase rather than in dead Notion pages is a pragmatic bet that holds up.
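The freshness check behind that flag can be sketched as a simple ratio. The 50% threshold below is an assumption for illustration, not Repowise's documented default:

```python
# Toy freshness check for an architectural decision: if enough of the
# files it governs have changed since it was recorded, flag it stale.
# The stale_ratio threshold is illustrative, not Repowise's actual value.
def decision_status(governed: list[str], changed_since: set[str],
                    stale_ratio: float = 0.5) -> str:
    touched = sum(1 for f in governed if f in changed_since)
    return "potentially stale" if touched / len(governed) >= stale_ratio else "fresh"

governed = [f"payments/file{i}.py" for i in range(14)]
changed = {f"payments/file{i}.py" for i in range(8)}  # 8 of 14 modified
print(decision_status(governed, changed))
# -> potentially stale
```

With 8 of 14 governed files modified (a ratio of ~0.57), the decision crosses the threshold and gets surfaced for review — the scenario described above.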
Repowise installs via pip (Python 3.10+):
```bash
pip install repowise

# Initialize in the repo
cd /path/to/my-project
repowise init

# "Graph + git only" mode, no LLM calls
repowise init --index-only

# With exclusions
repowise init -x vendor/ -x node_modules/ -x .next/
```

At init, Repowise offers to set up the post-commit hook. Accepting ensures the intelligence layers stay in sync with the code on every commit. A typical incremental update touches 3 to 10 pages and finishes in under 30 seconds.
The config lives in .repowise/config.yaml:
```yaml
provider: anthropic        # or openai, ollama, litellm
model: claude-sonnet-4-5
embedding_model: voyage-3
git:
  co_change_commit_limit: 500
  blame_enabled: true
dead_code:
  enabled: true
  safe_to_delete_threshold: 0.7
maintenance:
  cascade_budget: 30       # max pages fully regenerated per commit
  background_regen_schedule: "0 2 * * *"
```

The fully offline mode deserves a callout: with provider: ollama and a local embedding model, your code never leaves your machine. For a project under NDA or an internal codebase, that's a strong argument against cloud solutions like Google CodeWiki.
After repowise init, two files are generated or updated:
CLAUDE.md at the project root — contains the architecture summary, module map, hotspot warnings, ownership map, hidden coupling pairs, active decisions, and dead code candidates. This file is regenerated on every repowise update from real data, not from an LLM call. Generation in under 5 seconds.
The MCP config for Claude Code, exposing Repowise's 8 tools: get_overview, get_context, get_hotspots, get_why, search, get_decision_health, and a few others depending on the version.
The typical usage pattern becomes:
```text
Dev: "Why does the payments module use an in-process EventBus?"

Claude Code (internal): calls get_why("payments/")

Repowise: returns the "EventBus in-process only" decision
          + the list of governed files

Claude Code (response): "It's tied to a decision made 8 months ago to avoid
                         the Kafka dependency early on. The decision is flagged
                         for review because 8 of 14 files governed by this
                         decision have been modified since."
```

Compared to a Claude Code that would read payments/processor.ts + payments/eventbus.ts + 5 other files to guess intent, the gain in tokens and relevance is massive.
The Repowise team publishes a benchmark on 48 SWE-QA tasks pulled from the pallets/flask repo, with claude-sonnet-4-6 at the end of the chain. The advertised result: 27× fewer tokens per query, 36% cheaper, equivalent answer quality to the Claude Code baseline.
On external benchmarks, I'd be cautious — a 27× factor is very project-dependent. My empirical measurements (across about twenty architecture questions asked before/after) come closer to 8× to 12× fewer tokens on "how does it work" and "what depends on what" questions. Still huge, but a long way from 27×. The official benchmark is a directional indicator, not a number to repeat as-is.
This is the essential point to understand before anything else. Asking "RTK or Repowise" is the wrong question. The right framing is: "what's eating your tokens right now?"
| Criterion | RTK | Repowise |
|---|---|---|
| Layer attacked | Shell command output | Codebase reading / exploration |
| Mechanism | Compression / filtering | Indexing + structuring |
| Installation | Rust binary, <5 MB | pip install (Python 3.10+) |
| Ongoing cost | None, runs offline | Optional (LLM for wiki generation) |
| 100% offline mode | Yes, by default | Yes, with Ollama + local embeddings |
| Typical gain | 60-90% on Bash commands | 8-27× on architecture questions |
| Surface of effect | Anything via the Bash tool | Anything via MCP tools |
| Sweet spot | Test/build/git intensive sessions | New dev onboarding, cross-module refactors |
RTK alone is enough if you mostly do "in-the-small" dev: you work on a few files you already know, the agent compiles/tests/commits a lot, and you don't need it to understand the global architecture. Typically: iterating on an isolated module, debugging, well-scoped feature additions.
Repowise alone makes sense if you mostly do "in-the-large" dev: exploring large codebases, onboarding to a new repo, cross-module refactors, technical debt audits. And if you'd rather not install yet another Rust binary.
The two together is the combo that makes the difference on a project that combines both use cases — which is the case as soon as you go beyond a solo prototype. RTK compresses output while Claude Code tests/commits/deploys, Repowise structures exploration while the agent reasons about architecture. The two tools don't collide — they operate at different layers of the pipeline.
RTK has no hidden cost. The binary is small, the hook is transparent, and maintenance amounts to a cargo install --git every now and then to upgrade.
Repowise has an upfront install cost — indexing a medium-sized repo takes 30 seconds to 2 minutes depending on how many commits to mine. And more importantly, you have to maintain the architectural decisions for layer 4 to keep its value. If you record three decisions early on and then nothing for six months, the tool loses its main appeal. That's a human cost, not a technical one, and it needs to be anticipated.
The kind of project where these two tools pay off best isn't the 50-dev startup with a 2,000-file Go monorepo. It's more like the solo dev or small team profile, on a mid-sized SaaS — a few hundred code files, a backend, a frontend, a database, and a homemade CI. Big enough that architecture questions start costing real tokens, small enough that there's no dedicated DevX team to optimize the situation.
Typical Claude Code use in that context means several sessions running in parallel across different IDEs, with separate sessions per feature or bugfix — a recommended practice to avoid context cross-contamination. That's the pattern where RTK + Repowise optimizations have the most measurable impact.
The clearest gain is on CI debugging sessions. When a CI pipeline fails, you often paste the log to Claude Code for diagnosis. Without RTK, a full pipeline log would routinely run 8,000 to 12,000 tokens. With RTK active (piping the log via cat file.log | rtk passthrough or rerunning the command locally through the rtk wrapper), the same log drops to 1,200-2,500 tokens. A 5× to 8× factor on a recurring usage.
On test sessions, the gain is more modest but consistent: around 70% on average, because most of the output is repeated "test passed" lines that RTK collapses efficiently.
The side effect I didn't anticipate: Claude's response quality also improves on long sessions. With less noise in the context, the agent stays focused. Subjective, but reproducible — noticeably fewer "Claude re-reads a file it already read 10 minutes ago" moments since RTK is installed.
The most striking use case shows up on refactors that touch a central module and its dependency network. Typically: a table shared across several endpoints, or a business layer called from heterogeneous entry points. Without an introspection tool, the first step of this kind of refactor is always the same — ask the agent to "list everything that depends on X", watch it read 30+ files, get a partial list back, and realize two days later that some indirect calls were missing.
With Repowise, a call to get_context("target_module") returns in seconds: the list of files touching this module, the hotspot score (high = unstable file, low = stable file), the hidden co-changes (modules that often shift together, signaling implicit coupling), and the ownership. It's exactly the mental scan you do when you know a project well, except here it's pulled from the actual graph and git history.
The auto-generated wiki also serves a purpose beyond feeding the agent: it's a solid base for documenting your architecture externally (technical articles, onboarding an occasional contractor, release READMEs). Instead of rewriting the "architecture" section from scratch, you start from the module map and layer human context on top.
On the architectural decisions side, starting light is the way — three structuring decisions early on, things like "this ORM choice over another", "this migration strategy", "this pipeline split". The payoff comes when the agent comes back three months later with "why not X here" and the answer is right there without you retyping it. The discipline cost is low if you record as you go, brutal if you try to backfill six months in one shot.
In the interest of honesty, two notable points of friction.
First, Repowise on Windows + Git Bash required some tinkering to get post-commit hooks working. repowise watch works fine, but the git hook needs a complete Unix-like environment — I ended up running it only from WSL, which is enough for my workflow but stays a friction point.
Second, RTK doesn't act on MCP tools. When Claude Code calls Repowise via MCP, the output of get_context() lands in the context without going through RTK. Thankfully, Repowise already returns structured and compact content, so the issue is marginal — but it's an open optimization area. RTK only covers Bash tool calls; this is an acknowledged and documented limitation.
If you're starting from scratch, here's the order I'd recommend after several months of use. It's the sequence that worked for me, not absolute truth.
Step 1 — Measure first. Install ccusage or equivalent and spend a week of normal use to get a consumption baseline. Without a baseline, the gains advertised by RTK or Repowise are unverifiable. This step feels bureaucratic but it stops you from convincing yourself it's working when you just had a quieter week.
Step 2 — Install RTK. It's the tool with the best effort/gain ratio out of the gate. 5 minutes to install, measurable gains within days, zero maintenance. If you're hesitating between the two, start here.
Step 3 — Install Repowise on your two or three main repos. No need to install it everywhere at once. Pick the repo where you spend the most time in "exploration" mode rather than "quick edit" mode. The ROI will be more visible.
Step 4 — Record your first architectural decisions. Three to five structuring decisions are enough early on. It's the layer that takes the most discipline to maintain, but pays off the most long-term. Don't leave it empty.
Step 5 — Measure again. Compare consumption after 2-3 weeks of using both tools. On a typical project, expect a 40 to 60% reduction in total consumption, with a more pronounced effect on long sessions than on short ones.
RTK and Repowise are the two tools that have had the most measurable impact on my Claude Code productivity, ahead of every other "Claude Code optimizer" I've tested in recent months. Not because they're magic, but because they attack the problem at different and complementary levels: output compression for one, context structuring for the other.
The classic pitfall with optimization tools is stacking solutions that overlap. RTK and Repowise have zero overlap surface — they're literally at different layers of the pipeline. Which makes them, in my view, the baseline combo to install as soon as you go beyond occasional solo Claude Code use.
For those still on the fence: both tools are free, open source, and installable in under 30 minutes total. The trial cost is laughable compared to the potential gain. The only real reason not to try them is not having a token consumption problem — which, with intensive Claude Code use, always ends up happening.
Does RTK work with agents other than Claude Code?
Yes. RTK officially supports Claude Code, GitHub Copilot, Cursor, Windsurf, Cline, Roo Code, Kilo Code, Gemini CLI, Codex (OpenAI), Antigravity, and OpenCode. Install flags differ (rtk init -g --gemini, rtk init --agent cursor, etc.), but the core behavior is identical.
Does Repowise send my code to an external service?
Not necessarily. With provider: ollama and a local embedding model, Repowise runs 100% offline. If you use provider: anthropic or openai, then yes, the content used to generate the wiki transits through the configured provider's API. The dependency graph and git history themselves are always computed locally.
How many tokens does RTK actually save?
Based on numbers published by the project (measured across 2,900+ real commands), the average is 89% compression. On my project, I measure 70 to 85% depending on the command. The rtk gain command will give you precise numbers on your own usage after a few days — that's the only measurement that really matters.
Does Repowise replace real documentation?
No. Repowise generates structural docs (architecture, modules, dependencies) from the code, but it doesn't replace user-facing documentation, detailed ADRs, or usage guides. It removes the "doc that describes the code" — the kind that goes stale as soon as a dev forgets to update it. It doesn't replace the "doc that explains intent and usage."
Is there a risk of conflict between RTK and Repowise?
None that I know of. The two tools operate on different channels (Bash tool for RTK, MCP for Repowise) and share no dependencies. You can install them in any order without worrying about interference.
Which is better between the two if I have to pick?
If your Claude Code use is dominated by shell commands (test, build, git, docker), start with RTK. If it's dominated by code exploration and understanding (refactor, onboarding, audit), start with Repowise. For mixed use (the most common case), running both is well worth more than the sum of the parts.
Do both tools work on Windows?
RTK runs in degraded mode on native Windows (no auto-rewrite hook, but CLAUDE.md injection still works). Repowise runs on Python 3.10+ so supports Windows natively, though some git hooks need a Unix-like environment. For both tools, WSL is recommended for a complete Windows experience.