Documentation Index
Fetch the complete documentation index at: https://reasonblocks.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
ReasonBlocksConfig is a single dataclass that controls every ReasonBlocks capability. Instead of wiring individual middleware classes by hand, you fill in a config object and pass it to build_middleware(), which assembles the correct middleware stack in the right order automatically.
Complete example
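The page does not reproduce the example code itself, so here is a minimal sketch of the assembly contract that build_middleware() is described as following. The middleware class names come from this page; the stub bodies and the `enable_general_monitor` field name are assumptions, not the real SDK implementation.

```python
from dataclasses import dataclass

# Stubs standing in for the real SDK middleware classes named on this page.
class ReasonBlocksMiddleware: ...
class GeneralMonitorMiddleware: ...
class TokenSavingMiddleware: ...

@dataclass
class ReasonBlocksConfig:
    # Assumed field name; see the tier toggles below for the real options.
    enable_general_monitor: bool = False

def build_middleware(config: ReasonBlocksConfig) -> list:
    """Assemble the stack in the documented order."""
    stack = [ReasonBlocksMiddleware()]            # always first
    if config.enable_general_monitor:
        stack.append(GeneralMonitorMiddleware())  # optional, in the middle
    stack.append(TokenSavingMiddleware())         # always last, so it compresses
    return stack                                  # whatever earlier middleware injected

stack = build_middleware(ReasonBlocksConfig(enable_general_monitor=True))
print([type(m).__name__ for m in stack])
# ['ReasonBlocksMiddleware', 'GeneralMonitorMiddleware', 'TokenSavingMiddleware']
```

In the real SDK you would import ReasonBlocksConfig and build_middleware rather than define them; only the ordering shown here is taken from this page.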
build_middleware() assembles the list in the correct order: ReasonBlocksMiddleware first, then GeneralMonitorMiddleware (if enabled), then TokenSavingMiddleware last so it compresses what earlier middleware injected before the LLM call goes out.

Tier toggles
These fields turn each major capability on or off.

Enable customer-scoped instance-level pattern injection. E1 retrieves the top-1 pattern from your customer’s pattern scope. Disable to stop E1 injections entirely without affecting E2, E3, or monitors.
Enable pattern-level injection keyed on the monitor state description. E2 retrieves patterns from the shared pattern store that match the current failure family detected by monitors.
Enable universal standing-rule injection. E3 fires on the first agent call of every run (scroll-all) to establish baseline guidance. Subsequent calls skip E3 unless the state changes.
Enable server-side trajectory monitor evaluation. When fired, monitor results steer which E-trace tier retrieves and what gets injected into the system prompt.
Enable the v1 rule-firing general monitor pack. This opt-in pack runs five rule-based detectors (repeated attempts, error without diagnosis, exploration sprawl, idle response, low-novelty tail) with a per-rule cooldown. When enabled, GeneralMonitorMiddleware is inserted between ReasonBlocksMiddleware and TokenSavingMiddleware.

Enable tool-output compression and the early-exit nudge. TokenSavingMiddleware is appended last in the stack so it compresses whatever earlier middleware has injected before the LLM call goes out.

Enable trace distillation. When True, the SDK submits completed session traces to the ReasonBlocks API at session end so new E1 patterns can be mined for future runs. Set to False to prevent trace data from this customer’s runs from being ingested, which is useful for staging environments or data-residency requirements.

Token-saving levers
These fields control how TokenSavingMiddleware compresses the message history and when it injects the early-exit nudge.
Character length above which a ToolMessage body is eligible for head+tail truncation. Tool messages shorter than this threshold are left untouched.

Number of characters to keep from the beginning of a compressed tool message.

Number of characters to keep from the end of a compressed tool message. Combined with ts_head_keep_chars, the total kept content is at most 900 + 700 = 1600 chars plus an omission marker.

Number of the most recent ToolMessage instances to leave uncompressed. The agent keeps full visibility into the step it is actively reasoning about.

Master toggle for tool-output head+tail compression. Set to False to disable compression while keeping the early-exit nudge active.

Enable the early-exit nudge. Once the agent has made at least ts_early_exit_min_call_index model calls, the middleware checks for loop-like signals (streak, hedge, diversity) and injects a HumanMessage telling the agent to stop and submit its current best answer.

Minimum number of model calls before the early-exit nudge can fire. Prevents premature exits on short runs.
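The head+tail truncation described above can be sketched as follows. The 900/700 character defaults come from this page; the threshold value and the omission-marker format are assumptions, not the SDK's actual output.

```python
def truncate_tool_message(body: str,
                          threshold: int = 1600,
                          head_keep: int = 900,
                          tail_keep: int = 700) -> str:
    """Head+tail truncation of a long tool-message body.

    head_keep/tail_keep mirror the 900 + 700 = 1600 figure quoted above;
    the threshold default and marker text are illustrative guesses.
    """
    if len(body) <= threshold:
        return body  # short messages are left untouched
    omitted = len(body) - head_keep - tail_keep
    return body[:head_keep] + f"\n…[{omitted} chars omitted]…\n" + body[-tail_keep:]

compressed = truncate_tool_message("x" * 2000)
print(len(compressed) < 2000)
# True
```

Note that only messages outside the recent-message window (and above the threshold) are candidates; the most recent ToolMessage instances pass through unchanged.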
Perplexity compression
Perplexity compression applies word-level keep/drop decisions to stale messages, going further than head+tail truncation. It is off by default and requires you to supply a WordClassifier callable.
Enable LLMLingua-2-style word-level compression on stale messages. Requires ts_perplexity_classifier to be set. When the classifier is None, perplexity compression is silently skipped even if this flag is True.

A WordClassifier callable with signature (list[str]) -> list[bool]. Each True in the output means the corresponding word is kept. Use reasonblocks.token_saving.make_anthropic_classifier to create a production classifier backed by Claude Haiku.

Messages within the last N model calls keep full fidelity — perplexity compression does not touch them. Calls 0 through N-1 back are in the “recent” tier.
Messages between ts_perplexity_recent_cutoff and this cutoff use the mid keep ratio. Messages older than this cutoff use the old keep ratio.

Target word-keep ratio for messages in the mid tier. A value of 0.55 means roughly 55% of words in mid-tier messages are retained.

Target word-keep ratio for old-tier messages (older than ts_perplexity_mid_cutoff calls). More aggressive than the mid ratio.

Number of words per classifier window. Smaller windows produce finer-grained decisions at the cost of more classifier API calls. Larger windows are coarser but cheaper.
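To make the classifier contract and the tiering concrete, here is a toy sketch. The (list[str]) -> list[bool] signature and the recent/mid/old tier names come from this page; the stopword heuristic and the cutoff values are illustrative stand-ins (a production classifier would come from make_anthropic_classifier, and the real cutoff defaults are not stated here).

```python
STOPWORDS = {"the", "a", "an", "of", "to", "and"}

def toy_classifier(words: list[str]) -> list[bool]:
    """WordClassifier signature from this page: True means keep the word.
    This stopword heuristic is purely illustrative, not the real classifier."""
    return [w.lower() not in STOPWORDS for w in words]

def compress(text: str, classifier) -> str:
    """Drop every word the classifier marks False."""
    words = text.split()
    return " ".join(w for w, keep in zip(words, classifier(words)) if keep)

def tier_for_age(calls_back: int, recent_cutoff: int = 4, mid_cutoff: int = 12) -> str:
    """Map a message's age in model calls to its fidelity tier.
    Cutoff defaults here are assumed values for illustration only."""
    if calls_back < recent_cutoff:
        return "recent"  # full fidelity, never compressed
    if calls_back < mid_cutoff:
        return "mid"     # compressed toward the mid keep ratio
    return "old"         # compressed toward the more aggressive old keep ratio

print(compress("the agent read the body of the tool output", toy_classifier))
# agent read body tool output
```

In the real middleware, the classifier is invoked over fixed-size word windows and the per-tier keep ratios steer how aggressively each window is pruned.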
General monitor levers
These fields apply only when enable_general_monitor=True.
Total tool-call budget for the general monitor. Several detectors (exploration sprawl, idle response, low-novelty tail) use this as the reference limit for their proportional thresholds.
Minimum number of model calls between firings of the same rule. Prevents repeated injections from the same detector on consecutive steps.
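A per-rule cooldown of this kind can be sketched as a small gate keyed by rule name. This is an illustration of the described behavior, not the actual GeneralMonitorMiddleware internals.

```python
class RuleCooldown:
    """Gate that lets a rule fire only if at least `cooldown` model calls
    have passed since that same rule last fired (illustrative sketch)."""

    def __init__(self, cooldown: int):
        self.cooldown = cooldown
        self.last_fired: dict[str, int] = {}  # rule name -> call index of last firing

    def try_fire(self, rule: str, call_index: int) -> bool:
        last = self.last_fired.get(rule)
        if last is not None and call_index - last < self.cooldown:
            return False  # still cooling down: suppress the injection
        self.last_fired[rule] = call_index
        return True

gate = RuleCooldown(cooldown=3)
print(gate.try_fire("repeated_attempts", 0))  # True  (first firing)
print(gate.try_fire("repeated_attempts", 2))  # False (within cooldown)
print(gate.try_fire("repeated_attempts", 3))  # True  (cooldown elapsed)
```

Each of the five detectors tracks its own cooldown, so one rule firing does not suppress the others.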
Routing and scoping
Customer identifier used by E1 retrieval to narrow pattern search to this customer’s scope. When None, E1 retrieves from the global pattern store. Set this to the same customer identifier you use in your user database — for example, your tenant ID or organization slug.

Server-side monitor weight profile. Built-in values are "coding" (default), "pr_review", and "qa". Each profile shifts which monitors are weighted most heavily. See monitor weight profiles for the exact weight differences between profiles.

Optional per-monitor weight override applied on top of the profile. Partial dicts are accepted — monitors not listed fall through to the profile, then to server defaults. Unknown monitor names are dropped server-side, and negative weights are clamped to 0.