Qdrant Memory
Vector store memory for AI agents — stores conversation history in Qdrant and retrieves semantically relevant past context. Combines a recent message window with vector similarity search for hybrid recall.
Use Qdrant Memory when:
- The agent needs to recall information from conversations that happened days or weeks ago
- You need semantic search ("find messages where the user talked about billing") rather than just recent messages
- Building long-running assistants, knowledge workers, or support agents with persistent context
- The conversation window alone isn't enough — the agent forgets important context from earlier
For short conversations or prototyping, Window Memory is simpler.
Qdrant Memory is a sub-node — it connects to an AI Agent's Memory handle.
| Setting | Type | Default | Description |
|---|---|---|---|
| `qdrantUrl` | string | `http://localhost:6333` | Qdrant server URL. Supports bindings. Use Qdrant Cloud or self-hosted. |
| `collectionName` | string | `tensorify_memory` | Qdrant collection name. Auto-created on first use if it doesn't exist. |
| `embeddingModel` | select | `text-embedding-3-small` | OpenAI embedding model for vectorizing messages. Choices: `text-embedding-3-small` (fast), `text-embedding-3-large` (accurate), `text-embedding-ada-002` (legacy). |
| `topK` | number | `5` | Number of semantically similar past messages to retrieve. |
| `recentWindow` | number | `4` | Number of most recent messages to always include (for immediate context). |
| `sessionKey` | string | (none) | Scope key for per-user memory. Use bindings like `user:{{ input.body.user_id }}`. |
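As a quick reference, the settings and defaults above can be written out as a small Python structure. This is only an illustrative sketch: `QdrantMemorySettings` and `check_settings` are hypothetical names, not part of the plugin, and the checks simply restate the constraints listed in the table.

```python
from dataclasses import dataclass
from typing import List, Optional

# Embedding model choices listed in the settings table.
ALLOWED_MODELS = (
    "text-embedding-3-small",
    "text-embedding-3-large",
    "text-embedding-ada-002",
)


@dataclass
class QdrantMemorySettings:
    """Hypothetical container mirroring the settings table (not the plugin's own class)."""
    qdrant_url: str = "http://localhost:6333"
    collection_name: str = "tensorify_memory"
    embedding_model: str = "text-embedding-3-small"
    top_k: int = 5
    recent_window: int = 4
    session_key: Optional[str] = None  # unset: every user shares one memory space


def check_settings(settings: QdrantMemorySettings) -> List[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    if settings.embedding_model not in ALLOWED_MODELS:
        problems.append("unknown embeddingModel: " + settings.embedding_model)
    if settings.top_k < 0 or settings.recent_window < 0:
        problems.append("topK and recentWindow must not be negative")
    if not settings.session_key:
        problems.append("sessionKey is not set")
    return problems
```

With the defaults, `check_settings(QdrantMemorySettings())` flags only the missing session key, which matters for multi-user deployments.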
| Environment Variable | Description |
|---|---|
| `OPENAI_API_KEY` | Required for generating text embeddings via OpenAI's embedding API. |
| `QDRANT_API_KEY` | Optional. Required if your Qdrant instance uses API key authentication (e.g. Qdrant Cloud). |
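A pre-flight check for these variables might look like the sketch below. The helper name `read_memory_credentials` is hypothetical, and the plugin's own startup behavior may differ; the sketch only encodes the required/optional distinction from the table.

```python
import os
from typing import Dict, Mapping, Optional


def read_memory_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Read the two variables from the table; fail early if the required one is missing."""
    env = os.environ if env is None else env
    openai_key = env.get("OPENAI_API_KEY")
    if not openai_key:
        raise RuntimeError("OPENAI_API_KEY is required to generate embeddings")
    # QDRANT_API_KEY is optional: an unauthenticated local Qdrant does not need it.
    return {
        "openai_api_key": openai_key,
        "qdrant_api_key": env.get("QDRANT_API_KEY") or None,
    }
```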
Qdrant Memory uses a hybrid retrieval strategy:
- Load phase (when the agent starts processing):
  - Retrieves the `recentWindow` most recent messages (chronological order)
  - Embeds the current user message using the configured embedding model
  - Searches Qdrant for the `topK` most semantically similar past messages
  - Deduplicates and merges both sets, returning them as conversation context
- Save phase (after the agent responds):
  - Embeds the new user message and assistant response
  - Stores them in Qdrant with metadata (session key, timestamp, role)
This means the agent always has immediate context (recent messages) plus relevant historical context (semantic matches), even if the relevant information is hundreds of messages old.
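The load and save phases can be sketched in plain Python. This is not the plugin's implementation: it replaces Qdrant with an in-memory list and the OpenAI embedding model with a toy keyword counter (`toy_embed`), and the class and function names are hypothetical. It is meant only to show how a recent window and a top-k similarity search might be merged and deduplicated.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple

Vector = Tuple[float, ...]


@dataclass(frozen=True)
class StoredMessage:
    id: int            # insertion order stands in for the timestamp
    session_key: str
    role: str
    text: str
    vector: Vector


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_embed(text: str) -> Vector:
    """Stand-in for a real embedding model: counts a few fixed keywords."""
    vocab = ("billing", "invoice", "password", "shipping")
    words = [w.strip(".,?!") for w in text.lower().split()]
    return tuple(float(words.count(term)) for term in vocab)


class HybridMemory:
    def __init__(self, embed: Callable[[str], Vector], top_k: int = 5, recent_window: int = 4):
        self.embed = embed
        self.top_k = top_k
        self.recent_window = recent_window
        self._store: List[StoredMessage] = []  # stand-in for the Qdrant collection

    def save(self, session_key: str, role: str, text: str) -> None:
        """Save phase: embed the message and store it with its metadata."""
        msg = StoredMessage(len(self._store), session_key, role, text, self.embed(text))
        self._store.append(msg)

    def load(self, session_key: str, query: str) -> List[StoredMessage]:
        """Load phase: recent window plus semantic matches, deduplicated, oldest first."""
        scoped = [m for m in self._store if m.session_key == session_key]
        recent = scoped[-self.recent_window:] if self.recent_window > 0 else []
        query_vec = self.embed(query)
        ranked = sorted(scoped, key=lambda m: cosine(query_vec, m.vector), reverse=True)
        similar = [m for m in ranked[: self.top_k] if cosine(query_vec, m.vector) > 0]
        merged = {m.id: m for m in similar + recent}  # keyed by id, so duplicates collapse
        return [merged[i] for i in sorted(merged)]
```

For example, after saving a billing complaint followed by several unrelated messages under `user:1`, calling `load("user:1", "any update on billing")` returns the old billing message together with the latest messages, while messages saved under other session keys are left out.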
Build a support agent with long-term memory:
- Add an AI Agent node with a customer support system prompt
- Add a Qdrant Memory node
- Connect Qdrant Memory to the agent's `Memory` handle
- Set `qdrantUrl` to your Qdrant Cloud instance URL
- Set `sessionKey` to `customer:{{ webhook.body.customer_id }}`
- Set `topK` to `5` and `recentWindow` to `4`
The agent will remember recent messages and also recall relevant past interactions, even if the customer hasn't contacted you in weeks.
- Qdrant must be running: The plugin connects to Qdrant via HTTP. If the server is unreachable, the agent falls back to no memory (it won't crash, but it won't have context).
- Embedding costs: Every `load()` and `save()` call makes embedding API requests. With `text-embedding-3-small`, this is very affordable (~$0.02 per million tokens).
- Collection auto-creation: The collection is created automatically on first use with the correct vector dimensions for your chosen embedding model.
- Session key required for multi-user: Without a session key, all users share the same memory space. Always set a session key in production.
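Two of the notes above can be made concrete with a little arithmetic. The sketch below lists the default output dimensions of the three OpenAI embedding models and gives a rough cost estimate. It assumes three embedded texts per conversation turn (the user message on load, then the user message and assistant response on save) with no caching; the plugin's actual request pattern may differ, and the average message length is a made-up figure.

```python
# Default output dimensions of the OpenAI embedding models named in the settings.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

# Price quoted in the note above for text-embedding-3-small.
PRICE_PER_MILLION_TOKENS_USD = 0.02

# Assumption: one embedding on load plus two on save, nothing cached.
EMBEDDED_TEXTS_PER_TURN = 3


def estimate_embedding_cost(turns: int, avg_tokens_per_message: int = 50) -> float:
    """Rough embedding cost in USD for a number of conversation turns."""
    total_tokens = turns * EMBEDDED_TEXTS_PER_TURN * avg_tokens_per_message
    return total_tokens * PRICE_PER_MILLION_TOKENS_USD / 1_000_000
```

Under these assumptions, 10,000 turns at about 50 tokens per message come to 1.5 million tokens, or roughly $0.03.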
- AI Agent — the agent that uses this memory
- Window Memory — simpler sliding window memory
- Agent Memory Guide — choosing and configuring memory
