LLMs

A Cheat Sheet and Some Recipes For Building Advanced RAG | by Andrei | Jan, 2024 | LlamaIndex Blog

Token

AI models convert your input into a sequence of tokens, which they then process. A single token could represent a word or just a few letters.
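A rough sketch of how text can break into tokens that are sometimes whole words and sometimes word pieces. The vocabulary and the greedy longest-match rule here are assumptions for illustration only; real models use learned subword vocabularies with many thousands of entries.

```python
# Hypothetical toy vocabulary, invented for this example.
VOCAB = {"the", " ", "token", "izer", "s", "cat"}

def tokenize(text, vocab):
    """Greedy longest-match tokenization over a fixed vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                match = text[i:j]
                break
        if match is None:
            # Unknown character: fall back to a single-character token.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("the tokenizers", VOCAB))
```

Here "the" stays a single token, while "tokenizers" is split into several smaller pieces. Joining the tokens back together always reproduces the input text.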

Context Window

The context window in AI models refers to how much information the model can consider at any given time. The model can use all the information in the context window to decide what it should output. It can be thought of as the memory of the model for that particular conversation. Gemini 1.5 Pro has a 1 million token context window, meaning it can take up to 1 million tokens into consideration when responding to queries. The bigger the context window, the more information the model can take in from a prompt, which can make its output more consistent, relevant and useful.
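One practical consequence is that an application has to decide what to drop when a conversation outgrows the window. A minimal sketch, assuming tokens are approximated by whitespace-separated words (real token counts come from the model's own tokenizer) and that the newest messages matter most; the function name and parameters are made up for this example.

```python
def fit_to_context(messages, max_tokens, count_tokens=lambda s: len(s.split())):
    """Keep the most recent messages whose total token count fits the budget."""
    kept = []
    used = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for msg in reversed(messages):
        n = count_tokens(msg)
        if used + n > max_tokens:
            break
        kept.append(msg)
        used += n
    # Restore chronological order.
    return list(reversed(kept))

history = ["a b c", "d e", "f g h i"]
print(fit_to_context(history, max_tokens=6))
```

With a budget of 6, the oldest message falls out of the window and the model would no longer "remember" it.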

Vector Database

A vector database stores data in vector form. A vector is an array of numbers, each representing a different feature or attribute of the data. Vector databases are used in ML because they allow for efficient storage and manipulation of high-dimensional data.

They are useful for tasks like similarity search, where the goal is to find items in the database that are similar to a given query. Because the data is represented as vectors, similarity between items can be computed using metrics like cosine similarity or Euclidean distance.
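The two metrics and a brute-force similarity search can be sketched in a few lines. The stored vectors below are invented for illustration, and the in-memory dict stands in for a real vector database, which would use an approximate index rather than scanning every item. The sketch also assumes non-zero vectors of equal length.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query, store, k=2):
    """Return the k item names whose vectors are most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Hypothetical 3-dimensional vectors; real embeddings have hundreds of dimensions.
STORE = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}

print(top_k([1.0, 0.0, 0.0], STORE, k=2))
```

Note that the two metrics point in opposite directions: higher cosine similarity means closer, while lower Euclidean distance means closer.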

Embedding

An embedding is a mapping of words, phrases or other entities to high-dimensional vectors. Embeddings are numerical representations of words in a continuous vector space, where words with similar meanings are mapped to nearby points.

Embeddings capture semantic relationships between words and are commonly used as features in ML models for tasks like text classification, sentiment analysis and machine translation.
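To show how embeddings become features, here is a small sketch that averages word vectors into one fixed-length sentence vector. The lookup table is hand-written and purely hypothetical; real embeddings are learned from data, and averaging is only one simple way to combine them.

```python
# Hypothetical 2-dimensional word vectors, invented for this example.
EMBEDDINGS = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [-0.9, 0.1],
    "movie": [0.0, 0.9],
}

def sentence_embedding(sentence, table, dim=2):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [table[w] for w in sentence.lower().split() if w in table]
    if not vectors:
        # No known words: return a zero vector of the expected size.
        return [0.0] * dim
    # zip(*vectors) groups the values of each dimension together.
    return [sum(column) / len(vectors) for column in zip(*vectors)]

print(sentence_embedding("great movie", EMBEDDINGS))
```

The resulting vector has the same length regardless of sentence length, which is what makes it usable as input to a classifier.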

Chunking

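In classical NLP, chunking (also called shallow parsing) is the process of dividing a sequence of words, typically a sentence, into syntactically meaningful chunks such as noun phrases or verb phrases. It typically runs on part-of-speech-tagged text and serves as an intermediate step for tasks such as named entity recognition and full parsing. In RAG pipelines the term usually means something different: splitting a document into smaller passages before embedding and indexing them.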
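Chunking in this syntactic sense can be sketched with a simple rule over part-of-speech tags. The example assumes the sentence has already been tagged (the Penn Treebank style tags are supplied by hand), and it groups any consecutive run of determiner, adjective and noun tags into one noun phrase, which is coarser than the grammars real chunkers use.

```python
# Tags treated as part of a noun phrase in this simplified rule.
NP_TAGS = {"DT", "JJ", "NN", "NNS"}

def np_chunks(tagged):
    """Group consecutive determiner/adjective/noun tokens into noun phrases."""
    chunks = []
    current = []
    for word, tag in tagged:
        if tag in NP_TAGS:
            current.append(word)
        elif current:
            # A non-NP tag closes the chunk that was being built.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

# Hand-tagged example sentence.
TAGGED = [
    ("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"),
    ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]

print(np_chunks(TAGGED))
```

The verb and preposition act as boundaries, so the sentence yields two noun-phrase chunks.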

Stages of RAG

There are five key stages within RAG, which in turn will be a part of any larger application you build.