Context Compaction

Context compaction automatically manages a conversation's context window so that sessions do not fail as they approach the model's token limit. When a conversation grows large, older messages are summarised by an LLM, while recent messages and critical artefacts (code blocks, file paths, error messages) are preserved verbatim.

How it works

1. Monitoring: The platform tracks token usage as messages accumulate in a thread.
2. Triggering: When usage exceeds the threshold (default 80% of the model's context limit), compaction begins.
3. Summarisation: Older messages are summarised by an LLM, keeping:
   - The most recent messages verbatim (default: last 10)
   - Code blocks and file paths
   - Error messages and stack traces
4. Fallback: If summarisation fails, the system falls back to truncation (removing the oldest messages).
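
The steps above can be sketched in Python. This is an illustrative sketch only, not the platform's implementation: the `Message` type, the `compact` function, the `trigger_tokens` parameter, and the `summarise` callable are all hypothetical names, and token counts are assumed to be pre-computed. Preserving code blocks, file paths, and error messages inside the summary is left to the summariser and is not modelled here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    content: str
    tokens: int  # assumed to be pre-computed by some tokenizer


def compact(
    messages: List[Message],
    trigger_tokens: int,
    summarise: Callable[[List[Message]], Message],  # hypothetical LLM call
    verbatim_window: int = 10,
) -> List[Message]:
    """Return a compacted copy of the thread, or the thread unchanged."""
    # Monitoring: total token usage of the thread so far.
    used = sum(m.tokens for m in messages)

    # Triggering: do nothing until usage exceeds the trigger point.
    if used <= trigger_tokens:
        return list(messages)

    # Split the thread into older messages and the verbatim window.
    split = max(len(messages) - verbatim_window, 0)
    older, recent = messages[:split], messages[split:]
    if not older:
        return list(messages)  # nothing old enough to summarise

    # Summarisation, with truncation as the fallback.
    try:
        return [summarise(older)] + recent
    except Exception:
        return recent  # truncation: the oldest messages are dropped
```

With 15 messages and the default window of 10, a successful run returns one summary message followed by the last 10 messages unchanged; if `summarise` raises, only the last 10 messages are returned.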

What you see

When compaction is running, a Context Compaction progress indicator appears in the chat UI, similar to a tool-call display. Compaction is otherwise transparent: conversations continue without interruption.

Default settings

Setting	Default	Description
Trigger threshold	80%	Compaction starts when the thread reaches 80% of the model's context limit
Verbatim window	10 messages	The most recent 10 messages are always kept in full
Target after compaction	65%	After compaction, the thread is reduced to approximately 65% of the context limit

Understanding the threshold

The threshold is applied after a response buffer has been set aside. The platform reserves approximately 8% of the context for the model's response, so the effective trigger point, with the threshold expressed as a fraction (0.80 for 80%), is:

effective_trigger = context_limit × (1 - 0.08) × threshold_percent

For a 128K-token model with the default 80% threshold:

Effective limit: 128,000 × 0.92 = 117,760 tokens
Trigger point: 117,760 × 0.80 = 94,208 tokens (about 73.6% of the full 128,000-token window)
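
The same arithmetic can be checked with a few lines of Python. The function and parameter names are hypothetical; the 8% buffer, 80% threshold, and 65% target are the defaults stated on this page. The target is computed against the full context limit, following the wording in the settings table, which is an assumption about how the platform applies it.

```python
def effective_trigger(
    context_limit: int,
    threshold: float = 0.80,        # trigger threshold as a fraction
    response_buffer: float = 0.08,  # share reserved for the model's response
) -> int:
    """Token count at which compaction would begin."""
    effective_limit = context_limit * (1 - response_buffer)
    return round(effective_limit * threshold)


def target_after_compaction(context_limit: int, target: float = 0.65) -> int:
    """Approximate thread size after compaction (assumed relative to the full limit)."""
    return round(context_limit * target)


print(effective_trigger(128_000))        # 94208
print(target_after_compaction(128_000))  # 83200
```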

Graceful degradation

If summarisation fails (for example, because the configured model is temporarily unavailable):

1. The system retries once after a short delay.
2. It then falls back to truncating the oldest messages.
3. The fallback is logged internally.

Chat sessions continue functioning even when compaction degrades to truncation mode.
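
The retry-then-truncate behaviour can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's code: `summarise_or_truncate`, the logger name, and the one-second `retry_delay` are hypothetical (the page only says "a short delay"), and messages are represented as plain strings.

```python
import logging
import time
from typing import Callable, List

logger = logging.getLogger("compaction")  # hypothetical logger name


def summarise_or_truncate(
    older: List[str],
    recent: List[str],
    summarise: Callable[[List[str]], str],  # hypothetical LLM call
    retry_delay: float = 1.0,  # assumed value for "a short delay"
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Summarise older messages, retrying once, else truncate them."""
    for attempt in range(2):  # first attempt plus one retry
        try:
            return [summarise(older)] + list(recent)
        except Exception as exc:
            logger.warning("summarisation attempt %d failed: %s", attempt + 1, exc)
            if attempt == 0:
                sleep(retry_delay)

    # Both attempts failed: log the fallback and keep only recent messages.
    logger.warning("falling back to truncation; dropping %d messages", len(older))
    return list(recent)
```

Passing `sleep` as a parameter keeps the sketch easy to test without real delays; in either outcome the function returns a usable thread, which mirrors the point that sessions keep working in truncation mode.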

Note

Context compaction settings are managed at the platform level by BasePeak. Contact BasePeak support if you need the defaults adjusted for your workspace.
