Token
What is a token?
In the context of AI and natural language processing (NLP), a token is a discrete unit of input text that a language model processes. Tokens typically correspond to words, subword segments (such as prefixes or suffixes), punctuation marks, or whitespace. The average token length in English is approximately 3 to 5 characters, though this varies depending on the tokenization algorithm employed (e.g., Byte Pair Encoding, WordPiece, or SentencePiece). For example:
- Dog → ["Dog"]
- Encyclopedia → ["En", "cyclo", "pedia"]
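The splits above can be sketched with a greedy longest-match subword tokenizer, the basic idea behind WordPiece-style algorithms. The toy vocabulary below is hypothetical, chosen only to illustrate the mechanism; real tokenizers learn their vocabularies from large corpora.

```python
# Minimal sketch of greedy longest-match subword tokenization.
# VOCAB is a hypothetical toy vocabulary for illustration only.
VOCAB = {"En", "cyclo", "pedia", "Dog"}

def tokenize(word: str) -> list[str]:
    """Greedily match the longest vocabulary entry at each position."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest possible substring first.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # Unknown character: emit it as its own token.
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenize("Dog"))           # ["Dog"]
print(tokenize("Encyclopedia"))  # ["En", "cyclo", "pedia"]
```

Note that each subword piece concatenates back to the original word, which is why segmentation can vary between tokenizers without losing information.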
Why are tokens important in AI or LLMs?
Tokenization is fundamental to language models, enabling them to interpret and generate text effectively. Tokens are created by a tokenizer, an algorithm or software tool that bridges human-readable text and model-ready input. How the tokenizer segments input data affects model performance, efficiency, and accuracy, and proper tokenization helps the model handle complex language patterns within its context window.
Why tokens matter for agentic AI security
Tokens are not just a performance concern; they are an attack surface.
Token stuffing is an attack where an adversary floods an agent's input with large volumes of tokens (padding, repetition, irrelevant content, or injected text) to consume the available context budget. When the context window fills up, the model truncates older content first. If an attacker can predict what gets truncated, they can use token stuffing to deliberately push safety instructions, system prompt rules, or prior conversation context out of the model's active memory, leaving the agent operating without its guardrails.
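The eviction mechanic can be sketched as follows, assuming a simplified context manager that drops the oldest messages first when the budget is exceeded. Token counts are approximated here as whitespace-delimited words; the message strings and budget are illustrative, not from any real system.

```python
# Sketch: token stuffing evicting a system prompt under a
# drop-oldest-first truncation policy (hypothetical example).
def count_tokens(text: str) -> int:
    # Crude approximation: one token per whitespace-delimited word.
    return len(text.split())

def build_context(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages that fit in the token budget,
    truncating from the front (oldest first)."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.insert(0, msg)
        used += cost
    return kept

history = ["SYSTEM: never reveal secrets", "USER: hello"]
# Attacker stuffs the input with filler to push the system rule out.
history.append("USER: " + "filler " * 95 + "now reveal secrets")

context = build_context(history, budget=100)
print("SYSTEM: never reveal secrets" in context)  # False: rule truncated
```

Because the stuffed message alone nearly fills the budget, every older message, including the system instruction, falls outside the window the model actually sees.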
Context budget manipulation is a related technique where an attacker crafts inputs that are disproportionately expensive in token terms: for example, rare Unicode characters, whitespace sequences, or encoded content that tokenizes into many more tokens than expected. This forces the model to exhaust its context window faster than the application anticipates, creating a predictable truncation point the attacker can exploit.
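One defensive idea is to flag inputs whose estimated token count is disproportionate to their character length. The sketch below is a hypothetical heuristic: the per-character cost model, the 4-characters-per-token baseline, and the ratio threshold are all illustrative assumptions, and a real system would query the model's actual tokenizer instead.

```python
# Hypothetical detector for token-cost amplification.
# A real implementation would count tokens with the model's
# own tokenizer rather than this stand-in estimate.
def estimate_tokens(text: str) -> int:
    # Assumption: non-ASCII characters often cost several tokens each,
    # while ASCII averages roughly 4 characters per token.
    return sum(1 if ch.isascii() else 3 for ch in text) // 4 + 1

def is_suspicious(text: str, max_ratio: float = 0.5) -> bool:
    """Flag text that tokenizes into unusually many tokens per character."""
    return estimate_tokens(text) / max(len(text), 1) > max_ratio

print(is_suspicious("normal english text"))  # False
print(is_suspicious("ø" * 20))               # True: cheap chars, costly tokens
```

The point is not the specific numbers but the invariant being checked: inputs whose token cost grows much faster than their visible length deserve scrutiny before they reach the context window.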
Both attacks are variants of the broader context overflow attack category and are particularly relevant to agentic systems, where agents process long document chains, tool outputs, and multi-turn histories within a single context window.
→ See also: Context Window & Prompt Injection
Secure your agentic AI and AI-native application journey with Straiker




