Matyas.

Embedding

An embedding is a dense numerical vector representation of data — such as text, images, or code — in a high-dimensional space where semantically similar items are positioned closer together. Embeddings are fundamental to semantic search, recommendation systems, and RAG pipelines. They are generated by specialized models and typically stored in vector databases for efficient similarity lookups.
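Similarity between embeddings is usually measured with cosine similarity. A minimal sketch with toy 4-dimensional vectors (real embedding models output hundreds or thousands of dimensions, and the vectors below are invented for illustration):

```python
from math import sqrt

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: "cat" and "kitten" point in similar directions, "invoice" does not.
cat = [0.9, 0.1, 0.3, 0.0]
kitten = [0.85, 0.15, 0.35, 0.05]
invoice = [0.0, 0.9, 0.1, 0.8]

print(cosine_similarity(cat, kitten))   # high: semantically close
print(cosine_similarity(cat, invoice))  # low: unrelated
```

Vector databases run this kind of comparison (or an approximate version of it) across millions of stored embeddings at once.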

#ai

Related Terms

RAG

Retrieval-Augmented Generation (RAG) is a technique that enhances LLM responses by retrieving relevant documents from an external knowledge base before generating an answer. This allows the model to ground its output in up-to-date, domain-specific information rather than relying solely on its training data. RAG is widely used in enterprise chatbots, documentation assistants, and search-powered AI applications.
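The retrieve-then-generate flow can be sketched in a few lines. This is a toy example: word overlap stands in for real vector search, and the knowledge-base documents are invented:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the LLM by pasting the retrieved documents into the prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

kb = [
    "Refunds are processed within 14 days of the return request.",
    "Shipping to the EU takes 3 to 5 business days.",
    "Support is available Monday through Friday, 9am to 5pm CET.",
]
prompt = build_prompt("How long do refunds take?", kb)
```

The resulting prompt is what gets sent to the model, so the answer is grounded in the refund policy rather than in whatever the model memorized during training.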

Context Window

A context window is the maximum amount of text (measured in tokens) that an LLM can process in a single interaction, encompassing both the input prompt and the generated output. Larger context windows allow models to handle longer documents, maintain extended conversations, and reason over more information at once. Context window sizes have grown rapidly — from 4K tokens in early GPT models to over 1M tokens in current models like Claude.
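A practical consequence: long conversations must be trimmed to fit the window. A minimal sketch that keeps the most recent messages within a token budget, using the common rough estimate of ~4 characters per token for English text:

```python
def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose rough token estimate fits the budget."""
    estimate = lambda s: len(s) // 4  # ~4 chars per token is a rule of thumb, not exact
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):  # walk backwards from the newest message
        cost = estimate(msg)
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order
```

Production systems use the model's real tokenizer for exact counts; the trimming strategy itself is the same idea.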

Fine-tuning

Fine-tuning is the process of further training a pre-trained AI model on a smaller, domain-specific dataset to adapt it for a particular task. Instead of training from scratch, fine-tuning adjusts existing model weights, which is significantly cheaper and faster. Common approaches include full fine-tuning, LoRA (Low-Rank Adaptation), and instruction tuning for aligning model behavior with specific requirements.
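The appeal of LoRA is easy to show with arithmetic: instead of updating a full d_in × d_out weight matrix, it trains two small matrices of rank r, costing r × (d_in + d_out) parameters. A quick comparison for one 4096×4096 layer (dimensions chosen for illustration):

```python
def lora_params(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Parameters updated by full fine-tuning vs. a rank-r LoRA adapter on one matrix."""
    full = d_in * d_out            # every weight in the matrix
    lora = rank * (d_in + d_out)   # two thin matrices: (d_in x r) and (r x d_out)
    return full, lora

full, lora = lora_params(4096, 4096, rank=8)
print(full, lora)  # 16777216 vs 65536 — the adapter trains ~0.4% of the weights
```

That gap is why LoRA fine-tuning fits on a single consumer GPU while full fine-tuning of the same model often does not.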

Hallucination

In AI, hallucination refers to when a language model generates confident-sounding but factually incorrect or fabricated information. This occurs because LLMs predict statistically likely text rather than retrieving verified facts. Mitigation strategies include RAG, grounding responses in source documents, structured output validation, and lowering the temperature setting to reduce creative deviation.

Natural Language Processing

Natural Language Processing (NLP) is a branch of AI focused on enabling computers to understand, interpret, and generate human language. NLP powers applications like chatbots, translation services, sentiment analysis, and text summarization. Modern NLP has been transformed by transformer-based models, which achieve remarkable performance on tasks that previously required extensive hand-crafted rules.
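To make "hand-crafted rules" concrete, here is what a pre-transformer approach to sentiment analysis looked like in miniature: a lexicon of positive and negative words and a vote count. The word lists are invented for illustration:

```python
# A toy rule-based sentiment classifier — the style of approach transformers replaced.
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"slow", "broken", "terrible", "hate"}

def sentiment(text: str) -> str:
    """Classify by counting lexicon hits; crude, but transparent."""
    words = set(text.lower().replace(".", "").replace(",", "").split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("The support was great and fast"))  # positive
```

Approaches like this break on negation ("not great") and sarcasm, which is exactly where learned models pulled ahead.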

Diffusion Model

A diffusion model is a type of generative AI that creates data by learning to reverse a gradual noise-adding process. During training, the model learns to progressively denoise random noise into coherent outputs like images, audio, or video. Diffusion models power tools like Stable Diffusion, DALL-E, and Midjourney, and have become the dominant architecture for high-quality image generation.
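The forward (noise-adding) half of the process is simple enough to sketch directly. A toy single-step version with a simplified schedule, where t=0 leaves the data clean and t=1 yields pure Gaussian noise (real diffusion models use many small steps with a carefully tuned schedule):

```python
import math
import random

def add_noise(x: list[float], t: float) -> list[float]:
    """Forward diffusion: blend the signal with Gaussian noise according to t in [0, 1]."""
    return [math.sqrt(1 - t) * xi + math.sqrt(t) * random.gauss(0, 1) for xi in x]

clean = [1.0, -0.5, 0.25, 0.8]
noisy = add_noise(clean, t=0.9)  # mostly noise by this point
```

Training teaches a network to undo one such step at a time; generation then starts from pure noise and applies the learned denoiser repeatedly until a coherent output emerges.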

© 2026 Matyas Prochazka. All rights reserved.