RAG

Retrieval-Augmented Generation (RAG) is a technique that enhances LLM responses by retrieving relevant documents from an external knowledge base before generating an answer. This allows the model to ground its output in up-to-date, domain-specific information rather than relying solely on its training data. RAG is widely used in enterprise chatbots, documentation assistants, and search-powered AI applications.
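The retrieve-then-generate flow can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the word-overlap scoring stands in for a real embedding search, the `KNOWLEDGE_BASE` contents and all function names are hypothetical, and the final prompt would be sent to whichever LLM you use (no model call is made here).

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, document):
    # Toy relevance score: number of word occurrences shared by query and document.
    q = Counter(tokenize(query))
    d = Counter(tokenize(document))
    return sum(min(q[t], d[t]) for t in q)

def retrieve(query, documents, k=2):
    # Return the k documents with the highest overlap score.
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=2):
    # Place the retrieved text in the prompt so the model can ground its answer.
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Hypothetical knowledge base, for illustration only.
KNOWLEDGE_BASE = [
    "The refund window is 30 days from the delivery date.",
    "Support is available by email on weekdays.",
    "Refund requests need an order number.",
]

prompt = build_prompt("How long is the refund window?", KNOWLEDGE_BASE, k=1)
print(prompt)
```

In a real system the scoring step would typically be replaced by a vector similarity search over embeddings, but the shape of the flow (retrieve, assemble context, generate) stays the same.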

#ai

Related Terms

Neural Network

A neural network is a computational model loosely inspired by the human brain, consisting of layers of interconnected nodes (neurons) that pass data through weighted connections. Training adjusts those weights to reduce the error in the network's predictions. Deep neural networks with many layers form the foundation of modern AI, powering everything from image recognition to language understanding. Common architectures include feedforward networks, convolutional networks (CNNs), and transformers.
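To make "adjusting weighted connections during training" concrete, here is a deliberately tiny sketch: a single linear neuron (one weight, one bias) fitted by gradient descent. It is an assumption-laden simplification, not a deep network; the function name, learning rate, and sample data are all made up for illustration.

```python
def train_neuron(samples, lr=0.05, epochs=200):
    # One neuron computing pred = w * x + b, starting from zero weights.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = w * x + b
            err = pred - y
            # Gradient step for squared error: nudge each parameter
            # in the direction that reduces the error on this sample.
            w -= lr * err * x
            b -= lr * err
    return w, b

# Illustrative data generated from y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_neuron(samples)
print(round(w, 2), round(b, 2))
```

Real networks stack many such units with nonlinear activations and compute the gradients by backpropagation, but the update rule follows the same idea.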

Multimodal AI

Multimodal AI refers to models that can process and generate multiple types of data — such as text, images, audio, and video — within a single system. Models like GPT-4o and Claude can accept both text and image inputs, enabling use cases like visual question answering, document analysis, and UI understanding. This convergence is blurring the lines between previously separate AI disciplines.

Chain of Thought

Chain of Thought (CoT) is a prompting technique that encourages an LLM to break down complex reasoning into intermediate steps before arriving at a final answer. By explicitly reasoning through each step, models achieve significantly better accuracy on math, logic, and multi-step problems. Extended thinking and "thinking" tokens in models like Claude represent a built-in form of chain-of-thought reasoning.
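As a rough sketch of the prompting side of this technique, the helper below wraps a question with an instruction to reason step by step, and a second helper pulls the final answer out of a response. Both function names and the "Answer:" convention are assumptions made for this example, and the sample response is hand-written rather than real model output.

```python
def build_cot_prompt(question):
    # Ask for numbered intermediate steps before the final answer.
    return (
        f"Question: {question}\n"
        "Think through the problem step by step, numbering each step.\n"
        "Then give the result on a final line starting with 'Answer:'."
    )

def extract_answer(response):
    # Scan from the end for the line that carries the final answer.
    for line in reversed(response.splitlines()):
        if line.strip().startswith("Answer:"):
            return line.split("Answer:", 1)[1].strip()
    return None

# Hand-written example of what a step-by-step response might look like.
sample_response = (
    "1. Each box holds 12 eggs.\n"
    "2. 3 boxes hold 3 * 12 = 36 eggs.\n"
    "Answer: 36"
)

print(build_cot_prompt("How many eggs are in 3 boxes of 12?"))
print(extract_answer(sample_response))
```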

Embedding

An embedding is a dense numerical vector representation of data — such as text, images, or code — in a high-dimensional space where semantically similar items are positioned closer together. Embeddings are fundamental to semantic search, recommendation systems, and RAG pipelines. They are generated by specialized models and typically stored in vector databases for efficient similarity lookups.
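The idea that similar items sit closer together can be illustrated with cosine similarity, a common way to compare embedding vectors. The three-dimensional vectors below are invented for the example; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, hand-picked for illustration.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

def nearest(word):
    # Find the other word whose vector is most similar to the query word's.
    query = EMBEDDINGS[word]
    others = [w for w in EMBEDDINGS if w != word]
    return max(others, key=lambda w: cosine_similarity(query, EMBEDDINGS[w]))

print(nearest("cat"))
```

A vector database performs essentially this lookup, using approximate nearest-neighbor indexes so that it stays fast across millions of vectors.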

Context Window

A context window is the maximum amount of text (measured in tokens) that an LLM can process in a single interaction, encompassing both the input prompt and the generated output. Larger context windows allow models to handle longer documents, maintain extended conversations, and reason over more information at once. Context window sizes have grown rapidly, from a few thousand tokens in early GPT models to hundreds of thousands of tokens, and up to roughly 1M tokens in some current models such as Claude.
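Because the window covers both the prompt and the generated output, applications usually budget tokens before sending a request. The sketch below shows that arithmetic under a loud assumption: it counts whitespace-separated words as "tokens", whereas real models use their own tokenizers and will report different counts. All function names are hypothetical.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())

def fits_context(prompt, max_output_tokens, context_window):
    # Input and reserved output must fit in the window together.
    return count_tokens(prompt) + max_output_tokens <= context_window

def truncate_to_fit(prompt, max_output_tokens, context_window):
    # Keep only the most recent words that fit once output space is reserved.
    budget = max(context_window - max_output_tokens, 0)
    words = prompt.split()
    return " ".join(words[-budget:]) if budget else ""

prompt = "one two three four five six"
print(fits_context(prompt, max_output_tokens=2, context_window=8))
print(truncate_to_fit(prompt, max_output_tokens=5, context_window=8))
```

Dropping the oldest text first is only one possible policy; summarizing earlier content or retrieving just the relevant passages are common alternatives.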

Natural Language Processing

Natural Language Processing (NLP) is a branch of AI focused on enabling computers to understand, interpret, and generate human language. NLP powers applications like chatbots, translation services, sentiment analysis, and text summarization. Modern NLP has been transformed by transformer-based models, which achieve remarkable performance on tasks that previously required extensive hand-crafted rules.
