Embeddings
A way of converting text (or images) into lists of numbers so an AI can measure how similar two pieces of content are.
In plain English
Embeddings are numerical representations of content, most commonly text. An embedding model reads a word, sentence, or document and outputs a list of hundreds or thousands of numbers, called a vector, that captures its meaning.
The key property: similar meanings produce similar vectors. "Dog" and "puppy" end up close together in this vector space, where closeness is usually measured with cosine similarity or a distance metric; "dog" and "democracy" end up far apart.
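To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to compare two vectors. The three-number vectors below are invented for illustration and do not come from any real embedding model, whose vectors would have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, hand-picked so related words point the same way.
toy_vectors = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.75, 0.2],
    "democracy": [0.1, 0.2, 0.95],
}

dog_puppy = cosine_similarity(toy_vectors["dog"], toy_vectors["puppy"])
dog_democracy = cosine_similarity(toy_vectors["dog"], toy_vectors["democracy"])

print(f"dog vs puppy:     {dog_puppy:.3f}")
print(f"dog vs democracy: {dog_democracy:.3f}")
```

A score near 1 means the vectors point in nearly the same direction; a score near 0 means they are unrelated. With these toy values, "dog" scores much higher against "puppy" than against "democracy".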
What they're used for:
- Semantic search — find documents by meaning, not just exact keywords
- Recommendations — surface similar articles or products
- RAG (retrieval-augmented generation) — retrieve relevant context before sending a query to an LLM
- Clustering — group similar pieces of content automatically
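Semantic search and the retrieval step of RAG both rest on the same operation: embed the query, then rank stored vectors by similarity to it. The sketch below assumes the embeddings already exist; the document names, the `top_k` helper, and every vector are hypothetical stand-ins, not output from a real model.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical precomputed document embeddings (toy 3-dimensional values).
doc_vecs = {
    "puppy-training-guide": [0.9, 0.1, 0.1],
    "election-explainer": [0.1, 0.9, 0.2],
    "dog-food-review": [0.8, 0.2, 0.3],
}

# Stands in for the embedding of a query such as "how do I care for a dog?"
query_vec = [0.85, 0.15, 0.2]

results = top_k(query_vec, doc_vecs, k=2)
print(results)
```

With these toy values the two dog-related documents rank above the election explainer, even though no keyword matching takes place. At larger scale, a vector database or approximate nearest-neighbor index typically replaces this linear scan.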