ToolMango glossary
AI terms, explained without the marketing.
Every entry below is hand-written by an editor who actually builds with these tools. No regurgitated definitions, no SEO filler — just the answers you'd want a senior engineer to give you over coffee.
20 entries · updated weekly
Concepts
- **LLM (Large Language Model)**: A Large Language Model is a neural network trained on huge volumes of text to predict the next token, which produces emergent capabilities like reasoning, code generation, and translation.
- **Embeddings**: An embedding is a list of numbers that represents the meaning of a piece of text, image, or audio so similar things cluster together in vector space.
- **AI agent**: An AI agent is an LLM-driven system that takes actions in the world — calling tools, browsing, writing code, finishing tasks — instead of just answering questions.
- **Context window**: The context window is the maximum number of tokens (text chunks) a language model can consider at once — both the prompt you send and the response it generates.
- **AI hallucination**: An AI hallucination is when a language model produces confidently stated information that is actually false — a fabricated citation, wrong fact, or invented API.
- **Tokenization**: Tokenization is the process of breaking text into chunks (tokens) — usually sub-word pieces — that an LLM actually reads and writes.
- **Generative AI**: Generative AI is any AI system that produces new content — text, images, audio, video, code — rather than classifying or predicting from fixed options.
- **Multimodal AI**: A multimodal AI model handles multiple input or output types — text, images, audio, video — in the same model rather than needing separate models per modality.
- **AI alignment**: AI alignment is the field of research and engineering practice that aims to make AI systems behave in line with human values and intentions.
- **Inference**: Inference is the act of running a trained AI model to produce an output — every API call to Claude, GPT, or Gemini is inference.
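The embedding entry above can be made concrete with a toy sketch. The vectors and words below are made-up illustrative values, not output from a real embedding model (which would emit hundreds or thousands of dimensions), but the core operation is the same: rank items by cosine similarity, so semantically close things score near 1.0.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 for identical direction, near 0.0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (illustrative values only).
embeddings = {
    "cat":     [0.90, 0.10, 0.00, 0.20],
    "kitten":  [0.85, 0.15, 0.05, 0.25],
    "invoice": [0.00, 0.90, 0.80, 0.10],
}

query = embeddings["cat"]
ranked = sorted(embeddings,
                key=lambda k: cosine_similarity(query, embeddings[k]),
                reverse=True)
print(ranked)  # "kitten" ranks above "invoice" for the "cat" query
```

This nearest-by-angle lookup is exactly what semantic search and vector databases do at scale, just over millions of vectors with approximate indexes instead of a sorted list.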
Architectures
- **RAG (Retrieval-Augmented Generation)**: RAG combines a language model with a search step over your own documents, so answers stay grounded in your data instead of hallucinating.
- **Vector database**: A vector database stores numerical embeddings of text/images/audio and finds similar items by distance, powering semantic search and RAG.
- **MCP (Model Context Protocol)**: MCP is an open standard from Anthropic for connecting AI models to external tools, data sources, and applications through a consistent client-server protocol.
- **Transformer architecture**: The transformer is the neural network architecture introduced in 2017 that powers every major LLM — built around the attention mechanism that lets each token weigh all other tokens.
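The RAG entry boils down to two steps: retrieve relevant documents, then stuff them into the prompt. Here is a minimal sketch of that loop. The scoring function uses plain word overlap as a stand-in for the embedding similarity a real system would use, and the final LLM call is left out; `retrieve` and `build_prompt` are hypothetical names, not any library's API.

```python
def retrieve(query, documents, k=1):
    """Score each document by word overlap with the query (a crude
    stand-in for embedding similarity) and return the top-k matches."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Augment the prompt with retrieved context so the model answers
    from your data rather than its training-time memory."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
]
prompt = build_prompt("How long do refunds take?", docs)
print(prompt)
```

In production the retrieval step is a vector-database query over embeddings, but the shape of the pipeline is identical: search first, generate second.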
Techniques
- **Fine-tuning**: Fine-tuning is the process of further training a foundation model on your own examples so it learns to behave a specific way.
- **Prompt engineering**: Prompt engineering is the craft of writing instructions to a language model so it produces reliable, accurate, useful outputs.
- **Chain-of-thought (CoT)**: Chain-of-thought prompting asks a model to reason step-by-step before producing its final answer, which substantially improves accuracy on hard problems.
- **System prompt**: The system prompt is the high-level instruction at the start of an LLM conversation that defines the model's role, tone, constraints, and tools.
- **Few-shot learning**: Few-shot learning is the technique of including 2-10 worked examples in the prompt to teach an LLM a new task without any retraining.
- **RLHF (Reinforcement Learning from Human Feedback)**: RLHF is the post-training process where human raters score model outputs and the model is trained to produce outputs humans prefer.
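Few-shot learning is the most mechanical of the techniques above, so it is easy to sketch. This toy builder assembles a sentiment-labeling prompt from two worked examples; the function name and example strings are invented for illustration, and the actual model call is omitted.

```python
def few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: worked examples first, then the new
    input. The model infers the task from the pattern, with no retraining."""
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{shots}\nInput: {query}\nOutput:"

examples = [
    ("great product, loved it", "positive"),
    ("broke after two days", "negative"),
]
prompt = few_shot_prompt(examples, "works exactly as described")
print(prompt)
```

The trailing `Output:` matters: it cues the model to continue the established pattern, which is why few-shot prompts reliably steer format as well as task.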
Building something with these?
Our Stack Builder picks the best AI tools for your specific project in under 60 seconds — based on cost, time saved, and the technique you're using.
Build my stack →