AI Glossary

146 authoritative definitions for AI agents, MCP, Claude tool-use, and the broader AI engineering ecosystem — cross-referenced with Wikipedia and Wikidata.

[LLM concepts] Active Learning: A machine learning paradigm where an algorithm can actively query a user (or some other information source) to label new data points with the desired outputs.
[Agentic AI concepts] Agent Budget: A predefined limit on the resources (tokens, API calls, wall-clock time, or monetary cost) that an agent is permitted to consume on a single task or within a time period, enforced by the orchestrator or a watchdog process.
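A budget of this kind can be sketched as a small counter that refuses charges past a limit. The class and field names below (`AgentBudget`, `BudgetExceeded`, `charge`) are hypothetical and only illustrate the idea; a real orchestrator would likely also track time and cost.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when an agent tries to spend past one of its limits."""


@dataclass
class AgentBudget:
    # Hypothetical limits; only tokens and API calls are tracked here.
    max_tokens: int
    max_api_calls: int
    tokens_used: int = 0
    api_calls_used: int = 0

    def charge(self, tokens: int, api_calls: int = 1) -> None:
        """Record usage, refusing the charge if it would exceed a limit."""
        if self.tokens_used + tokens > self.max_tokens:
            raise BudgetExceeded("token limit reached")
        if self.api_calls_used + api_calls > self.max_api_calls:
            raise BudgetExceeded("API call limit reached")
        self.tokens_used += tokens
        self.api_calls_used += api_calls


budget = AgentBudget(max_tokens=1000, max_api_calls=3)
budget.charge(tokens=400)
budget.charge(tokens=500)
```

A third call such as `budget.charge(tokens=200)` would raise `BudgetExceeded` and leave the counters unchanged, which is the point at which an orchestrator would stop or escalate.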
[Agentic AI concepts] Agent Checkout: The act of an agent claiming exclusive ownership of a task from a shared queue to prevent duplicate execution by multiple agents running concurrently in a multi-agent system.
[Agentic AI concepts] Agent Evaluation: The systematic assessment of an AI agent's performance across task completion, tool use accuracy, memory reliability, safety, and cost efficiency, typically using automated harnesses, human raters, or model-graded judges.
[agent runtime] Agent harness: An agent harness is a software framework that runs an LLM in a loop with tool access, persistent context, and a stopping criterion, turning a one-shot model call into a multi-step workflow that can plan, execute, observe results, and re-plan until a task is complete.
[Agentic AI concepts] Agent Heartbeat: A periodic signal emitted by a running agent to its watchdog or orchestrator to indicate it is alive and making progress, enabling detection of stuck, crashed, or infinitely looping agents.
[Agentic AI concepts] Agent Lane: The defined scope of actions, tools, and decisions an agent is authorized to take, preventing scope creep and ensuring each agent in a multi-agent system stays within its designated responsibility boundary.
[Agentic AI concepts] Agent Loop: The continuous perceive-think-act cycle an AI agent executes: it reads observations from its environment, selects an action (often a tool call), executes it, receives the result, and iterates until a termination condition is met.
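The cycle can be sketched in a few lines. Everything here is an assumption made for illustration: `stub_policy` stands in for a model call, `TOOLS` is a hypothetical one-entry tool registry, and the step limit is an arbitrary safety stop.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: name -> function taking a string argument.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}


def stub_policy(observations: List[str]) -> Tuple[str, str]:
    """Stand-in for a model call: returns (action, argument).

    It requests the "add" tool once, then finishes with the tool's result.
    """
    if not observations:
        return ("add", "2+3")
    return ("finish", observations[-1])


def run_agent_loop(policy, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action, arg = policy(observations)  # think: select an action
        if action == "finish":              # termination condition
            return arg
        result = TOOLS[action](arg)         # act: execute the tool call
        observations.append(result)         # perceive: record the result
    raise RuntimeError("step limit reached without finishing")


answer = run_agent_loop(stub_policy)
print(answer)  # prints 5
```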
[Agentic AI concepts] Agent Memory: The collective set of information-persistence mechanisms available to an AI agent (spanning in-context working memory, episodic long-term memory, and semantic knowledge stores) that maintain continuity across turns and sessions.
[Agentic AI concepts] Agent Orchestration: The coordination layer that manages the lifecycle, routing, scheduling, and communication of multiple AI agents, ensuring tasks flow through the right agents in the right order with appropriate resource allocation.
[agent-architecture] Agent Registry: An agent registry is a central directory or metadata store where AI agents and their capabilities (tools, resources, prompts) are registered, allowing for discovery, version management, and orchestration by the host.
[Agentic AI concepts] Agent Scaffolding: The non-model infrastructure that surrounds an LLM to make it act as an agent: tool definitions, loop control, memory management, state persistence, error handling, and observability hooks.
[Agentic AI concepts] Agent SOUL: A structured Markdown document that defines an agent's identity, role lane, definition-of-done, escalation triggers, reporting format, and behavioral constraints, serving as its immutable core identity loaded into every system prompt.
[Agentic AI concepts] Agentic Loop: A synonym for the agent loop, emphasizing the iterative nature of autonomous AI execution: the model repeatedly perceives its context, selects an action, observes the result, and continues until a stopping condition is reached.
[Agentic AI concepts] Agentic Workflow: A structured process or series of tasks where an AI agent independently breaks down a complex objective into smaller, actionable steps, executes them using available tools, and iterates based on intermediate results.
[infrastructure] AI Gateway: An AI gateway is a proxy layer between the AI application and the model/service provider that handles common cross-cutting concerns like rate limiting, logging, auditing, RBAC, and model routing.
[LLM concepts] Alignment Tax: The reduction in raw task performance that results from applying alignment techniques (RLHF, Constitutional AI, safety training) to an LLM, reflecting the tradeoff between safety/helpfulness and peak capability.
[Agentic AI concepts] Anthropic Agent SDK: The umbrella SDK from Anthropic for building, testing, and deploying agentic applications on Claude models, encompassing the Claude Agent SDK and related tooling for evaluation, observability, and deployment.
[Infrastructure] API Key: A secret token used by software to authenticate requests to an API, usually identifying the calling project, account, or service.
[Infrastructure] API Rate Limit: A constraint imposed by an API service provider on the number of requests a user or client can make within a specified timeframe (e.g., requests per minute).
[LLM concepts] Attention Mask: A mechanism used in Transformer-based models to prevent the model from 'paying attention' to certain parts of the input, such as padding tokens, ensuring they do not influence the output.
[LLM concepts] Attention Mechanism: A neural network component that computes a weighted sum of value vectors based on query-key similarity scores, allowing the model to selectively focus on relevant parts of its input regardless of positional distance.
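As a minimal sketch of the weighted sum described above, the following uses scaled dot-product similarity, which is one common choice rather than the only one. It handles a single query in pure Python; the function names and toy vectors are assumptions for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Return the attention-weighted sum of `values` for a single query."""
    scale = math.sqrt(len(query))
    # Query-key similarity: dot product, scaled by sqrt of the dimension.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
# The query points along the first key, so the first value dominates.
output = attend([5.0, 0.0], keys, values)
```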
[security] Audit Trail: An audit trail is a chronological, tamper-evident record of activities, interactions, and operations performed within a system, essential for security and compliance.
[Agentic AI concepts] Autonomous Agent: An AI agent that pursues multi-step goals with minimal human intervention, making its own decisions about tool use, sub-task ordering, and error recovery throughout the task lifecycle.
[Infrastructure] Caching: The practice of storing computed results or reusable data so later requests can be served faster or more cheaply.
[LLM concepts] Capability Overhang: A situation where an AI model possesses underlying capabilities that are not apparent in standard evaluations, which may be unlocked by better prompting, fine-tuning, or scaffolding, creating a gap between measured and achievable performance.
[Agentic AI concepts] Chain of Thought: A prompting technique that elicits intermediate reasoning steps from an LLM before its final answer, improving accuracy on multi-step tasks by making the model's reasoning process explicit and sequential.
[LLM concepts] Chunking: The process of splitting source content into smaller passages for embedding, retrieval, summarization, or context-window management.
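One simple way to chunk is a fixed-size word window with overlap, sketched below. The function name, the whitespace split, and the default sizes are assumptions chosen for illustration; production pipelines often split on sentences, headings, or token counts instead.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 5, overlap: int = 2) -> List[str]:
    """Split text into word windows that share `overlap` words with the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_words("a b c d e f g h", chunk_size=5, overlap=2)
# chunks == ["a b c d e", "d e f g h"]
```

The overlap keeps a passage that straddles a boundary retrievable from at least one chunk, at the cost of some duplicated storage.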
[Products] Claude: Anthropic's family of frontier AI assistants and models, commonly used for writing, reasoning, coding, tool use, and agent workflows.
[Agentic AI concepts] Claude Agent SDK: Anthropic's official SDK for building autonomous and multi-agent systems on top of Claude models, providing primitives for tool use, sub-agent spawning, memory, and structured agent communication.
[Agentic AI concepts] Coder Agent: An AI agent specialized in writing, editing, testing, and debugging code by combining an LLM with file-system access, shell execution, and version-control tools in a persistent software development environment.
[Products] Codex: OpenAI's coding-oriented agent and model line used for reading repositories, editing code, running checks, and collaborating on software tasks.
[Agentic AI concepts] Cognitive Architecture: The high-level design of an AI system's information-processing components (perception, memory, reasoning, planning, and action) and the connections between them, in the tradition of cognitive-science architectures such as Soar and ACT-R.
[LLM concepts] Completion: The text generated by a language model in response to a prompt, representing the model's continuation of the input sequence according to its learned distribution over tokens.
[LLM concepts] Confabulation: The specific sub-type of hallucination in which a model generates false information that is plausibly connected to real facts in its training data, filling in memory gaps with coherent but invented details, analogous to the neurological confabulation seen in amnesia patients.
[Evaluation concepts] Confusion Matrix: A table that compares predicted labels against true labels, showing counts of true positives, false positives, true negatives, and false negatives.
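For the binary case, the four counts can be tallied directly from paired labels. This is a minimal sketch; the function name and the convention that 1 means positive and 0 means negative are assumptions.

```python
from collections import Counter
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    # Map each (true label, predicted label) pair to its cell name.
    names = {(1, 1): "tp", (0, 1): "fp", (0, 0): "tn", (1, 0): "fn"}
    counts = Counter(names[(t, p)] for t, p in zip(y_true, y_pred))
    return {name: counts[name] for name in ("tp", "fp", "tn", "fn")}


y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
matrix = confusion_counts(y_true, y_pred)
# matrix == {"tp": 2, "fp": 1, "tn": 2, "fn": 1}
```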
[LLM concepts] Constitutional AI: Anthropic's alignment method in which a language model critiques and revises its own outputs according to a set of written principles (the 'constitution'), reducing reliance on human feedback for safety training.
[Agentic AI concepts] Context Injection: The practice of dynamically inserting retrieved memories, tool results, or external data into an LLM's context window at inference time to provide information beyond what is encoded in the model's weights.
[LLM concepts] Context Length: The maximum number of tokens an LLM can process in a single forward pass (input and output combined), determining the amount of text the model can read and reason over at once.
[inference] Context window: The context window is the maximum number of tokens a large language model can process in a single forward pass, including the prompt, all in-context examples, retrieved documents, and the model's generated output, measured in thousands or millions of tokens.
[LLM concepts] Cross-Entropy: The information-theoretic loss function used to train language models, measuring the average number of bits needed to encode the true next token under the model's predicted distribution; minimizing it is equivalent to maximizing likelihood.
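A small worked example, assuming base-2 logarithms so the result is in bits (training code usually uses natural logs, which gives nats). The toy distributions and the function name are made up for illustration.

```python
import math
from typing import Dict, List


def cross_entropy_bits(predictions: List[Dict[str, float]], targets: List[str]) -> float:
    """Average of -log2 p(true token) over all positions."""
    total = -sum(math.log2(dist[tok]) for dist, tok in zip(predictions, targets))
    return total / len(targets)


# Predicted next-token distributions at two positions (toy values).
predictions = [{"cat": 0.5, "dog": 0.5}, {"sat": 0.25, "ran": 0.75}]
loss = cross_entropy_bits(predictions, ["cat", "sat"])
# -log2(0.5) = 1 bit and -log2(0.25) = 2 bits, so the average is 1.5 bits.
```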
[Products] Cursor: An AI-native code editor that integrates model assistance into repository navigation, code generation, chat, and iterative software changes.
[Agentic AI concepts] Memory Agent: An AI agent component or dedicated sub-agent responsible for storing, indexing, and retrieving information across sessions, maintaining continuity beyond a single context window.
[agent-architecture] Memory Bank: A Memory Bank is a persistent, structured storage pattern used by AI agents to maintain state, user preferences, and project context across multiple sessions and disparate conversations.
[LLM concepts] Mixture of Experts: A neural network architecture where each input is routed to a small subset of specialized sub-networks (experts) by a learned gating function, enabling very large total parameter counts while per-token computation depends only on the few experts activated rather than on the total number of experts.
[Evaluation concepts] MMLU: Massive Multitask Language Understanding, a broad benchmark that tests model performance across many academic and professional multiple-choice subjects.
[protocol] Model Context Protocol (MCP): Model Context Protocol (MCP) is an open standard introduced by Anthropic in November 2024 for connecting AI assistants to data sources and tools through a JSON-RPC wire protocol over stdio or HTTP transports.
[Agentic AI concepts] Multi-Agent System: An architecture in which multiple AI agents, each with distinct roles and capabilities, collaborate or compete to accomplish tasks that exceed what any single agent could do within one context window.
[LLM concepts] Multi-Head Attention: An attention variant that runs multiple independent attention operations (heads) in parallel on different learned linear projections of the input, allowing the model to attend to different representation subspaces simultaneously.
[LLM concepts] Multimodal: Describing an AI model that processes and generates information across multiple data modalities (such as text, images, audio, and video) within a unified architecture rather than with separate single-modality models.
[Agentic AI concepts] Parallel Tool Calls: A capability allowing an LLM to emit multiple tool calls in a single response turn, which are then executed concurrently by the scaffolding, reducing total latency compared to sequential tool execution.
[LLM concepts] Perplexity: A metric for language model quality defined as the exponentiated average negative log-likelihood per token on a test set, measuring how surprised the model is by text; lower perplexity indicates better fit to the data distribution.
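The definition translates directly into code. In this sketch, `token_probs` is assumed to hold the probability the model assigned to each actual token in the test text; the values are toy numbers.

```python
import math
from typing import List


def perplexity(token_probs: List[float]) -> float:
    """Exponentiated average negative log-likelihood per token."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))


# A model that gives every true token probability 1/4 is as uncertain
# as a uniform choice among 4 options, so its perplexity is 4.
ppl = perplexity([0.25, 0.25, 0.25, 0.25])
```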
[Agentic AI concepts] Planning Agent: An AI agent specialized in decomposing high-level goals into ordered, executable sub-tasks, selecting tools and sub-agents, and maintaining a plan that adapts as new information arrives during execution.
[LLM concepts] Positional Encoding: A mechanism that injects information about the position of each token in a sequence into a transformer model, compensating for the fact that self-attention on its own is insensitive to token order (it is permutation-equivariant) and enabling the model to understand word order.
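One concrete scheme is the fixed sinusoidal encoding from the original Transformer paper, sketched here for a single position. Learned and rotary encodings are common alternatives and are not shown; the function name and the even-`d_model` restriction are assumptions of this sketch.

```python
import math
from typing import List


def sinusoidal_encoding(position: int, d_model: int) -> List[float]:
    """Sinusoidal encoding for one position; d_model is assumed to be even."""
    encoding = []
    for i in range(0, d_model, 2):
        # Each sin/cos pair uses a wavelength that grows with the dimension index.
        angle = position / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding


pe0 = sinusoidal_encoding(0, 4)  # [0.0, 1.0, 0.0, 1.0]
pe1 = sinusoidal_encoding(1, 4)  # differs from pe0, so positions are distinguishable
```

These vectors are added to the token embeddings before the first attention layer, so identical tokens at different positions receive different inputs.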
[LLM concepts] Pre-training: The initial large-scale training phase of a language model in which it learns to predict the next token (or masked tokens) over a massive text corpus, establishing general language understanding before any task-specific fine-tuning.
[Evaluation concepts] Precision: An evaluation metric that measures the fraction of positive predictions that were actually correct: true positives divided by the sum of true positives and false positives.
[LLM concepts] Prompt: The input text (consisting of instructions, context, examples, and/or a user query) provided to a language model to elicit a specific type of completion or response.
[optimization] Prompt Caching: Prompt caching is a technique that stores the results of intermediate processing of prompt tokens to avoid re-computing them, significantly reducing inference latency and costs for frequently reused context.
[Agentic AI concepts] Prompt Engineering: The practice of designing, iterating, and optimizing text inputs to language models (including instructions, examples, context, and formatting) to reliably elicit desired outputs without modifying model weights.
[infrastructure] Rate Limiting: Rate limiting is the practice of restricting the number of API requests a user or agent can make within a specific time period to maintain service stability and prevent abuse.
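A sliding-window limiter is one simple way to enforce such a restriction; token buckets and fixed windows are common alternatives. In this sketch the class name is hypothetical, and the current time is passed in explicitly so the behavior is deterministic.

```python
from collections import deque
from typing import Deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` in any window of `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        """Return True and record the request if it fits in the window."""
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60.0)
decisions = [limiter.allow(t) for t in (0.0, 1.0, 2.0, 61.0)]
# decisions == [True, True, False, True]
```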
[security] RBAC (Role-Based Access Control): RBAC is a security model that restricts system access based on the roles assigned to individual users or agents within an organization, rather than assigning permissions directly to users.
[Agentic AI concepts] ReAct Prompting: A prompting framework that interleaves Reasoning (Thought) and Acting (Action/Observation) steps, guiding an agent to think before each tool call and incorporate the observation into subsequent reasoning.
[model-architecture] Reasoning Model: A reasoning model is an LLM specifically fine-tuned or trained to prioritize extended, multi-step chain-of-thought processing over immediate response generation.
[Evaluation concepts] Recall: An evaluation metric that measures the fraction of actual positive cases the system successfully found: true positives divided by the sum of true positives and false negatives.
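Recall and its companion metric, precision, are both simple ratios over confusion-matrix counts. In this sketch, returning 0.0 when the denominator is zero is an assumed convention, not a standard; some libraries raise or warn instead.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that were correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# A system that flags 10 items, 8 correctly, out of 16 true positives overall.
p = precision(tp=8, fp=2)  # 0.8
r = recall(tp=8, fn=8)     # 0.5
```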
[Agentic AI concepts] Reflection Agent: An AI agent that critiques its own prior outputs, identifies errors or gaps, and generates improved responses by iterating over self-evaluation feedback before producing a final answer.
[training] Reinforcement Learning from Human Feedback (RLHF): Reinforcement Learning from Human Feedback (RLHF) is a training technique in which a language model's policy is fine-tuned using a reward model that has been trained on human preference rankings, aligning model output with human-judged quality on dimensions like helpfulness and harmlessness.
[LLM concepts] Reranking: A retrieval step that reorders initially retrieved results using a more precise scoring model or rule before passing context to the next stage.
[Agentic AI concepts] Research Agent: An AI agent that autonomously searches the web, reads documents, synthesizes findings, and produces cited reports by iteratively formulating queries and evaluating retrieved information against a research goal.
[LLM concepts] Retrieval: The process of finding relevant external information, documents, memories, or records to use in a model or agent workflow.
[AI architecture] Retrieval-Augmented Generation (RAG): Retrieval-Augmented Generation (RAG) is a technique introduced in 2020 by researchers at Facebook AI Research (now Meta AI) and collaborators for grounding large language model outputs in retrieved external documents, combining a retriever (typically a vector index) with a generator (a language model) so the model's response is conditioned on relevant source material rather than parametric memory alone.
[Agentic AI concepts] Sampling Parameters: The set of numeric knobs (temperature, top-p, top-k, max tokens, stop sequences) that control how an LLM samples tokens from its output distribution, trading off diversity against determinism.
[Infrastructure] Sandboxing: Running code, tools, or agent actions inside a constrained environment to limit filesystem, network, credential, or system access.
[LLM concepts] Scaling Laws: Empirical relationships showing that LLM performance improves predictably as a power law with increases in model parameters, training compute, and data size, enabling researchers to forecast model quality before training.
[Agentic AI concepts] Self-Consistency: A decoding strategy that samples multiple independent chain-of-thought reasoning paths from a model at non-zero temperature, then takes the majority-vote answer across those paths, improving accuracy over a single greedy sample.
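The voting step can be sketched as follows. The `sample_answer` callable is a hypothetical stand-in for one sampled chain-of-thought run that returns only the final answer; here it just replays canned answers.

```python
from collections import Counter
from typing import Callable, List


def self_consistent_answer(sample_answer: Callable[[], str], n_samples: int) -> str:
    """Sample several final answers and return the most frequent one."""
    answers: List[str] = [sample_answer() for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


# Canned answers standing in for five sampled reasoning paths.
canned = iter(["42", "41", "42", "42", "40"])
answer = self_consistent_answer(lambda: next(canned), n_samples=5)
# answer == "42"
```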
[Agentic AI concepts] Semantic Memory: A type of agent memory that stores general facts, domain knowledge, embeddings of documents, and distilled insights, without temporal binding, enabling knowledge-base-style retrieval for any task.
[LLM concepts] Softmax: A mathematical function that converts a vector of raw scores (logits) into a probability distribution by exponentiating each score and normalizing by their sum, ensuring all outputs are positive and sum to 1.
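A minimal pure-Python version of the function described above. Subtracting the maximum logit before exponentiating is a standard numerical-stability step that leaves the result unchanged; the function name is illustrative.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Exponentiate each score and normalize so the outputs sum to 1."""
    shift = max(logits)  # avoids overflow in exp() for large logits
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.0])
# Higher logits get higher probabilities; all are positive and sum to 1.
```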
[LLM concepts] Speculative Decoding: An inference acceleration technique in which a small, fast draft model proposes several tokens ahead and the larger target model verifies them in a single parallel pass, accepting the tokens consistent with its own distribution; this reduces wall-clock generation time without changing the output distribution.
[LLM concepts] Speech-to-Text: An AI capability that transcribes spoken audio into written text, using end-to-end neural models trained on large audio-text paired datasets to handle diverse accents, languages, and audio conditions.
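[protocol] Stdio Transport: Stdio transport is an inter-process communication (IPC) method in which one process, typically a client, launches another as a subprocess and the two exchange messages by writing to the subprocess's standard input and reading from its standard output.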
[observability] Structured Logging: Structured logging is a practice where logs are emitted in a consistent, machine-readable format (typically JSON) containing key-value pairs rather than unstructured plain text.
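With only the Python standard library, this can be sketched as a custom `logging.Formatter` that emits one JSON object per record. The `fields` key passed through `extra=` is a hypothetical convention of this sketch, not part of the `logging` module.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # `fields` is a hypothetical attribute supplied via `extra=`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


stream = io.StringIO()  # in-memory sink so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("agent.demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("tool call finished", extra={"fields": {"tool": "search", "latency_ms": 120}})
entry = json.loads(stream.getvalue())
```

Because every line parses as JSON, downstream tools can filter on keys such as `tool` or `latency_ms` without regular expressions.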
[Agentic AI concepts] Structured Output: A mode of LLM generation in which the model is constrained to produce output conforming to a predefined schema (typically JSON), enabling reliable downstream parsing without post-processing heuristics.
[Agentic AI concepts] Sub-Agent: An AI agent spawned by an orchestrator or parent agent to handle a specific sub-task, operating within a delegated scope and budget, and returning results to the parent upon completion.
[LLM concepts] Supervised Fine-Tuning: The process of continuing gradient-descent training on a pre-trained model using a curated labeled dataset, adapting it to a specific task, style, or domain by updating model weights through standard cross-entropy loss.
[Evaluation concepts] SWE-bench: A software engineering benchmark that evaluates whether AI systems can resolve real GitHub issues by modifying repositories and passing tests.
[prompt-engineering] System Instruction: System instructions (also called system prompt or meta-prompt) define the core behavioral constraints, goals, and persona of an AI model, set at the start of a session before any user input is processed.
[Agentic AI concepts] System Prompt: A privileged instruction block prepended to an LLM conversation that establishes the model's persona, capabilities, constraints, and task context before any user or assistant turns appear.
[observability] Telemetry: Telemetry is the automated collection and transmission of data from remote sources (like agents or distributed tools) to an IT system for monitoring, analysis, and alerting.
[LLM concepts] Temperature: A scalar parameter that divides the logits before the softmax, controlling the sharpness of an LLM's output probability distribution before sampling: values below 1.0 make the distribution more peaked (more deterministic), values above 1.0 make it flatter (more random).
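The effect is easy to see numerically. This sketch assumes the usual formulation in which logits are divided by the temperature before the softmax; the function name and logits are illustrative.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Softmax over logits / temperature; temperature must be positive."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # numerical stability; does not change the result
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.5)  # more mass on the top token
base = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 2.0)   # closer to uniform
```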
[LLM concepts] Text-to-Image: A generative AI capability that produces images from natural-language text descriptions, typically using diffusion models or autoregressive image token models trained on paired image-text datasets.
[LLM concepts] Text-to-Speech: A speech synthesis capability that converts written text into natural-sounding audio speech, using neural models trained on human speech recordings to generate expressive, voice-cloned, or custom-voiced audio.
[AI architecture] Tokenization: Tokenization is the process of splitting text into discrete units (tokens) that a language model treats as its atomic input, typically subword fragments such that common words are one token and rare words are several, balancing vocabulary size against representation efficiency.
[Agentic AI concepts] Tool Result: The structured response returned to an LLM after a tool call it requested has been executed, containing the output data (or error information) that the model incorporates into its next reasoning step.
[agent runtime] Tool use: Tool use is a capability where a large language model is given access to external functions (tools) it can invoke during inference, with the model deciding when to call which tool, generating structured arguments for the call, and incorporating the result into its subsequent generation.
[LLM concepts] Top-k Sampling: A token sampling strategy that restricts the sampling pool to the k highest-probability tokens at each generation step, then samples from those k tokens according to their normalized probabilities.
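A minimal sketch of the two steps, filtering and then sampling. The toy vocabulary, probabilities, and function names are made up for illustration, and a seeded `random.Random` is used so the run is repeatable.

```python
import random
from typing import Dict


def top_k_filter(probs: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep the k most probable tokens and renormalize them to sum to 1."""
    top = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {token: p / total for token, p in top}


def sample(probs: Dict[str, float], rng: random.Random) -> str:
    """Draw one token according to the given probabilities."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.05}
filtered = top_k_filter(probs, k=2)  # {"the": 0.625, "a": 0.375}
token = sample(filtered, random.Random(0))
```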
[LLM concepts] Top-p (Nucleus Sampling): A token sampling strategy that restricts sampling to the smallest set of tokens whose cumulative probability mass meets or exceeds a threshold p, dynamically adjusting the vocabulary size based on the distribution's shape.
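The filtering step can be sketched as below, with toy distributions chosen to show how the kept set grows when the distribution is flatter. The function name and token probabilities are assumptions for illustration.

```python
from typing import Dict


def top_p_filter(probs: Dict[str, float], p: float) -> Dict[str, float]:
    """Keep the smallest high-probability set whose mass reaches p, renormalized."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    nucleus = []
    cumulative = 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # threshold reached; drop the remaining tail
    return {token: prob / cumulative for token, prob in nucleus}


# Peaked distribution: one token already covers p, so only it survives.
peaked = top_p_filter({"the": 0.9, "a": 0.06, "an": 0.04}, p=0.8)
# Flatter distribution: three tokens are needed to reach p.
flat = top_p_filter({"the": 0.4, "a": 0.3, "an": 0.2, "one": 0.1}, p=0.8)
```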
[AI architecture] Transformer: The Transformer is a neural network architecture introduced by Vaswani et al. in 2017 that uses self-attention to process sequences in parallel, replacing the recurrence of RNNs and LSTMs and becoming the foundational architecture for nearly every modern large language model.