Tag
economics
7 entries tagged economics.
Agent Loop Cost
The compounding token cost of a tool-using agent. Each turn of the loop feeds the entire conversation history plus the tool result back into the model, so costs grow non-linearly with the number of steps. A five-step agent can cost fifteen to twenty times a single-prompt equivalent.
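The compounding described above can be sketched in a few lines. The numbers are illustrative assumptions, not real prices: a 1,000-token starting prompt and a 500-token tool result per turn.

```python
# Minimal sketch of agent-loop cost compounding. Each turn re-sends the
# entire history as input, so billed input tokens grow non-linearly
# with the number of steps. All sizes here are hypothetical.
def agent_loop_input_tokens(steps, initial_prompt=1_000, tool_result=500):
    """Total input tokens billed across every turn of the loop."""
    history = initial_prompt
    total_input = 0
    for _ in range(steps):
        total_input += history   # the whole history is re-read this turn
        history += tool_result   # the tool result is appended for the next turn
    return total_input

agent_loop_input_tokens(1)   # 1,000 input tokens for a single prompt
agent_loop_input_tokens(5)   # 10,000 — ten times the single-prompt cost
```

With larger tool results or longer histories the multiplier climbs quickly, which is where the fifteen-to-twenty-times figure comes from.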
Batch Inference
A processing mode where you submit many requests at once and the provider returns results within minutes or hours instead of seconds. In exchange, you pay fifty percent less. Anthropic and OpenAI both offer it. For non-urgent work, it cuts your bill in half.
Input and Output Tokens
Input tokens are what you send to the model. Output tokens are what the model writes back. Output is almost always priced higher than input because it requires the model to actually generate rather than just read.
Prompt Caching
A feature that stores a chunk of your prompt on the provider's side so repeated calls read from the cache instead of re-processing. Claude caches are one-tenth the price of fresh input tokens. Hitting the cache is the single biggest cost lever in production AI.
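The savings are easy to put in numbers. A minimal sketch, assuming the one-tenth cache-read rate from the entry above and a placeholder input price of $3 per million tokens:

```python
# Sketch of prompt-caching economics: cached tokens are billed at
# one-tenth the fresh-input rate. The $3/M rate is a placeholder.
def prompt_cost(tokens, rate_per_million, cached_fraction=0.0):
    """Dollar cost of one request, given the fraction read from cache."""
    fresh = tokens * (1 - cached_fraction)
    cached = tokens * cached_fraction
    return (fresh + cached * 0.1) * rate_per_million / 1_000_000

prompt_cost(100_000, 3.00)                       # $0.30 — no cache hits
prompt_cost(100_000, 3.00, cached_fraction=0.9)  # $0.057 — 90% cached
```

A 90-percent cache hit rate cuts the input bill by more than four-fifths, which is why cache hit rate is the lever to watch.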
Token Budget
A per-task, per-user, or per-month cap on how many tokens a feature is allowed to consume. Without one, costs drift upward invisibly. With one, every design decision has a known constraint.
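Enforcement can be as simple as a counter with a cap. A minimal sketch, assuming you track usage yourself from the token counts the API reports back; the class and its names are hypothetical:

```python
# Hypothetical per-task token budget: record usage after each call,
# refuse any call that would push the total past the cap.
class TokenBudget:
    def __init__(self, cap):
        self.cap = cap
        self.used = 0

    def charge(self, tokens):
        """Record token usage; raise if the cap would be exceeded."""
        if self.used + tokens > self.cap:
            raise RuntimeError(
                f"token budget exceeded: {self.used + tokens} > {self.cap}"
            )
        self.used += tokens

budget = TokenBudget(cap=50_000)
budget.charge(30_000)   # within budget
# budget.charge(25_000) would raise: 55,000 > 50,000
```

The point is less the mechanism than the constraint: once the cap exists, every prompt and every loop iteration is designed against a known number.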
Token Pricing
The per-million-token rate a provider charges for input and output, expressed as dollars per million tokens. Claude Opus output is $75 per million tokens; Haiku output is $5. That fifteen-fold gap is what pricing tiers exist to navigate.
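Putting the quoted rates into a cost formula makes the gap concrete. The 2,000-token response size is a hypothetical example:

```python
# Output cost at the rates quoted above ($75/M for Opus output,
# $5/M for Haiku output). The response size is illustrative.
def output_cost(tokens, rate_per_million):
    """Dollar cost of the model's output tokens."""
    return tokens * rate_per_million / 1_000_000

output_cost(2_000, 75)   # $0.15 for a 2,000-token Opus response
output_cost(2_000, 5)    # $0.01 for the same response from Haiku
```

At scale the same arithmetic decides which tier each workload belongs on.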
Tokens
The basic unit an AI model reads and writes. Roughly 3-4 characters of English per token, so 750 words equals about 1,000 tokens. Every API call is priced by tokens in and tokens out.
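The rule of thumb above (roughly four characters of English per token) makes a serviceable quick estimator. A sketch only; real tokenizers vary by model and by text, so treat the result as a ballpark:

```python
# Ballpark token estimate from character count, using the ~4 chars/token
# heuristic above. Real tokenizer counts will differ somewhat.
def estimate_tokens(text):
    """Rough token count for English text."""
    return max(1, len(text) // 4)

estimate_tokens("Hello, world!")   # 13 characters -> about 3 tokens
```

For budgeting, an estimate within twenty or thirty percent is usually enough; bill reconciliation should use the exact counts the API returns.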