# AI Glossary
Authoritative definitions for AI agents, MCP, Claude tool-use, and the broader AI engineering ecosystem. Each entry cross-references Wikipedia and Wikidata where the term has an established encyclopedia entry.
- **Agent harness** (agent runtime): An agent harness is a software framework that runs an LLM in a loop with tool access, persistent context, and a stopping criterion, turning a one-shot model call into a multi-step workflow that can plan, execute, observe results, and re-plan until a task is complete.
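The loop described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `fake_model` is a scripted function playing the role of an LLM call, and the `add` tool is a toy registry entry, not a real API.

```python
def fake_model(history):
    # Hypothetical stand-in for an LLM call: it plans one tool call,
    # then produces a final answer once it has observed a tool result.
    if not any(msg["role"] == "tool" for msg in history):
        return {"type": "tool_call", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "text": f"The sum is {history[-1]['content']}."}

TOOLS = {"add": lambda a, b: a + b}  # toy tool registry

def run_agent(task, model, tools, max_steps=10):
    history = [{"role": "user", "content": task}]   # persistent context
    for _ in range(max_steps):                      # stopping criterion: step budget
        action = model(history)
        if action["type"] == "final":               # stopping criterion: model is done
            return action["text"]
        result = tools[action["tool"]](**action["args"])   # execute the tool
        history.append({"role": "tool", "content": result})  # observe the result
    return "step budget exhausted"

print(run_agent("What is 2 + 3?", fake_model, TOOLS))  # → The sum is 5.
```

Real harnesses replace `fake_model` with an actual model API call and add error handling, but the plan/execute/observe loop and the two stopping criteria (model signals completion, or the step budget runs out) are the essential shape.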
- **Context window** (inference): The context window is the maximum number of tokens a large language model can process in a single forward pass, including the prompt, all in-context examples, retrieved documents, and the model's generated output; it is measured in thousands or millions of tokens.
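Because the window covers input and output together, applications budget tokens before each call. A back-of-the-envelope check, with all numbers purely illustrative:

```python
# Hypothetical token budget: everything that enters the forward pass,
# including tokens reserved for the model's answer, must fit in the window.
CONTEXT_WINDOW = 8192        # illustrative model limit

prompt_tokens = 5000         # system prompt + conversation so far
retrieved_tokens = 2500      # documents stuffed in by retrieval
reserved_for_output = 1024   # room left for the generated answer

total = prompt_tokens + retrieved_tokens + reserved_for_output
fits = total <= CONTEXT_WINDOW
print(total, fits)  # → 8524 False: something must be truncated or summarized
```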
- **Embedding** (AI architecture): An embedding is a numerical vector representation of an input (typically text, but also images, audio, or code) that places semantically similar inputs near each other in a high-dimensional space, enabling semantic search, retrieval, classification, and clustering.
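"Near each other" is usually measured with cosine similarity. A sketch using tiny hand-made 3-dimensional vectors (real embeddings come from a model and have hundreds or thousands of dimensions; these values are invented for illustration):

```python
import math

def cosine_similarity(u, v):
    # cos(theta) = (u . v) / (|u| * |v|); 1.0 means identical direction
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy "embeddings": cat and kitten point in similar directions, car does not.
cat    = [1.0, 0.9, 0.1]
kitten = [0.95, 1.0, 0.05]
car    = [0.1, 0.0, 1.0]

assert cosine_similarity(cat, kitten) > cosine_similarity(cat, car)
```

Semantic search is then just "embed the query, rank documents by cosine similarity to it."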
- **Fine-tuning** (training): Fine-tuning is the process of further training a pre-trained large language model on a smaller, task-specific dataset to specialize its behavior, adjusting either all model weights (full fine-tuning) or a small adapter layer (parameter-efficient fine-tuning, e.g., LoRA or QLoRA).
- **Function calling** (agent runtime): Function calling is OpenAI's name for the LLM tool-use capability, introduced in June 2023, in which the model emits structured JSON describing a function to call and its arguments, conforming to a JSON Schema the developer supplies.
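A minimal sketch of the two halves involved: the developer-supplied JSON Schema describing a function, and the JSON arguments the model emits back. The `get_weather` function is a made-up example, and the exact wrapper fields around the schema vary by API and version; only the schema-in, JSON-arguments-out shape is the point here.

```python
import json

# Developer side: a function described with JSON Schema (hypothetical tool).
tool_definition = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Model side: instead of prose, the model emits arguments as a JSON string
# conforming to the schema above (this string is an invented example).
model_output = '{"city": "Paris"}'

args = json.loads(model_output)   # developer parses the arguments...
print(args["city"])               # ...and dispatches the real function call
```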
- **Large Language Model (LLM)** (AI architecture): A Large Language Model (LLM) is a deep neural network, typically a transformer with billions to trillions of parameters, trained on large text corpora to predict the next token and then fine-tuned for instruction-following, dialogue, and increasingly tool use and reasoning.
- **Model Context Protocol (MCP)** (protocol): Model Context Protocol (MCP) is an open standard introduced by Anthropic in November 2024 for connecting AI assistants to data sources and tools through a JSON-RPC wire protocol over stdio or HTTP transports.
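On the wire this is JSON-RPC 2.0, typically one JSON message per line over stdio. A sketch of a request/response pair; the `tools/list` method name follows the MCP specification, but the payload details here (the `read_file` tool) are invented for illustration:

```python
import json

# Client asks an MCP server what tools it exposes (JSON-RPC 2.0 request).
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}

# Server replies with a result matched to the request by its id.
# The read_file tool here is a made-up example.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"tools": [{"name": "read_file", "description": "Read a file"}]},
}

wire = json.dumps(request)  # serialized as one line of JSON on stdio
assert json.loads(wire)["method"] == "tools/list"
assert response["id"] == request["id"]
```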
- **Reinforcement Learning from Human Feedback (RLHF)** (training): Reinforcement Learning from Human Feedback (RLHF) is a training technique in which a language model's policy is fine-tuned against a reward model trained on human preference rankings, aligning model output with human-judged quality on dimensions such as helpfulness and harmlessness.
- **Retrieval-Augmented Generation (RAG)** (AI architecture): Retrieval-Augmented Generation (RAG) is a technique introduced by Meta AI in 2020 for grounding large language model outputs in retrieved external documents. It combines a retriever (typically a vector index) with a generator (a language model) so that the model's response is conditioned on relevant source material rather than parametric memory alone.
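The retrieve-then-generate pipeline can be sketched with a toy keyword-overlap retriever standing in for a real vector index (a production system would embed the query and documents and rank by similarity instead; the documents and query here are invented):

```python
docs = [
    "The Transformer architecture was introduced in 2017.",
    "Tokenization splits text into subword units.",
]

def retrieve(query, docs):
    # Toy retriever: score each document by word overlap with the query.
    # A real RAG system would use embeddings and a vector index here.
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

query = "When was the Transformer introduced?"
context = retrieve(query, docs)

# The generator's prompt is conditioned on the retrieved source material.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The final step, not shown, is to send `prompt` to a language model, which now answers from the retrieved passage rather than from its parametric memory alone.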
- **Tokenization** (AI architecture): Tokenization is the process of splitting text into discrete units (tokens) that a language model treats as its atomic input, typically subword fragments chosen so that common words are one token and rare words are several, balancing vocabulary size against representation efficiency.
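A toy greedy longest-match tokenizer over a tiny hand-made vocabulary shows the subword idea. Real tokenizers (e.g., BPE) learn their vocabulary from data; this five-entry vocabulary is invented purely to illustrate whole-word versus fragment splits:

```python
VOCAB = {"token", "ization", "un", "believ", "able"}  # illustrative only

def tokenize(word):
    tokens, i = [], 0
    while i < len(word):
        # Greedily take the longest vocabulary entry matching at position i.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown text falls back to characters
            i += 1
    return tokens

print(tokenize("tokenization"))   # → ['token', 'ization']
print(tokenize("unbelievable"))   # → ['un', 'believ', 'able']
```

The vocabulary-size trade-off is visible even here: a bigger vocabulary makes more words single tokens but costs more embedding rows, while a smaller one splits more words into fragments.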
- **Tool use** (agent runtime): Tool use is a capability in which a large language model is given access to external functions (tools) it can invoke during inference, with the model deciding when to call which tool, generating structured arguments for the call, and incorporating the result into its subsequent generation.
- **Transformer** (AI architecture): The Transformer is a neural network architecture introduced by Vaswani et al. in 2017 that uses self-attention to process sequences in parallel, replacing the recurrence of RNNs and LSTMs and becoming the foundational architecture for nearly every modern large language model.
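The core self-attention operation from the 2017 paper is Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A pure-Python sketch with tiny 2-dimensional matrices (real models use large matrices and many attention heads; these values are illustrative):

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)            # how much each position attends
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]                 # one query vector
K = [[1.0, 0.0], [0.0, 1.0]]     # two key vectors
V = [[1.0, 2.0], [3.0, 4.0]]     # two value vectors

row = attention(Q, K, V)[0]      # a weighted mix of the two value rows,
                                 # weighted toward V[0] since Q matches K[0]
```

Because every query attends to every key in one matrix product, the whole sequence is processed in parallel, which is exactly what replaced the step-by-step recurrence of RNNs and LSTMs.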