Qdrant Memory

Vector store memory for AI agents — stores conversation history in Qdrant and retrieves semantically relevant past context. Combines a recent message window with vector similarity search for hybrid recall.

When to Use

Use Qdrant Memory when:

The agent needs to recall information from conversations that happened days or weeks ago
You need semantic search ("find messages where the user talked about billing") rather than just recent messages
You are building long-running assistants, knowledge workers, or support agents with persistent context
The conversation window alone isn't enough — the agent forgets important context from earlier

For short conversations or prototyping, Window Memory is simpler.

Configuration

Qdrant Memory is a sub-node — it connects to an AI Agent's Memory handle.

Setting	Type	Default	Description
`qdrantUrl`	string	`http://localhost:6333`	Qdrant server URL. Supports bindings. Works with Qdrant Cloud or a self-hosted instance.
`collectionName`	string	`tensorify_memory`	Qdrant collection name. Auto-created on first use if it doesn't exist.
`embeddingModel`	select	`text-embedding-3-small`	OpenAI embedding model for vectorizing messages. Choices: `text-embedding-3-small` (fast), `text-embedding-3-large` (accurate), `text-embedding-ada-002` (legacy).
`topK`	number	`5`	Number of semantically similar past messages to retrieve.
`recentWindow`	number	`4`	Number of most recent messages to always include (for immediate context).
`sessionKey`	string	—	Scope key for per-user memory. Use bindings like `user:{{ input.body.user_id }}`.
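
The settings above can be pictured as a small typed record. The following is a minimal sketch, not the plugin's actual class: `QdrantMemorySettings` and `validate` are hypothetical names, the defaults are copied from the table, and the validation rules (non-negative counts, model from the listed choices) are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional

# Embedding model choices listed in the settings table.
ALLOWED_MODELS = {
    "text-embedding-3-small",
    "text-embedding-3-large",
    "text-embedding-ada-002",
}


@dataclass
class QdrantMemorySettings:
    """Hypothetical container mirroring the settings table (defaults from the table)."""
    qdrant_url: str = "http://localhost:6333"
    collection_name: str = "tensorify_memory"
    embedding_model: str = "text-embedding-3-small"
    top_k: int = 5
    recent_window: int = 4
    session_key: Optional[str] = None  # the table lists no default


def validate(settings: QdrantMemorySettings) -> List[str]:
    """Return a list of problems; an empty list means the settings look sane.

    These checks are assumptions for illustration, not documented plugin behavior.
    """
    problems = []
    if settings.embedding_model not in ALLOWED_MODELS:
        problems.append(f"unknown embedding model: {settings.embedding_model}")
    if settings.top_k < 0:
        problems.append("topK must not be negative")
    if settings.recent_window < 0:
        problems.append("recentWindow must not be negative")
    if not settings.session_key:
        problems.append("sessionKey is empty: all users would share one memory space")
    return problems
```

For example, `validate(QdrantMemorySettings(session_key="user:42"))` returns an empty list, while leaving `session_key` unset reports the shared-memory warning.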

Required Secrets

Environment Variable	Description
`OPENAI_API_KEY`	Required for generating text embeddings via OpenAI's embedding API.
`QDRANT_API_KEY`	Optional. Needed only if your Qdrant instance uses API key authentication (e.g. Qdrant Cloud).
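
As a rough illustration of how the two secrets differ, the sketch below reads both variables and treats only `OPENAI_API_KEY` as mandatory. The helper names `load_secrets` and `qdrant_headers` are hypothetical; the one external detail relied on is that Qdrant's REST API accepts the key in an `api-key` header.

```python
import os
from typing import Dict, Mapping


def load_secrets(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the secrets described above from an environment mapping."""
    openai_key = env.get("OPENAI_API_KEY")
    if not openai_key:
        # Embeddings cannot be generated without this key.
        raise RuntimeError("OPENAI_API_KEY is required to generate embeddings")
    secrets = {"OPENAI_API_KEY": openai_key}
    qdrant_key = env.get("QDRANT_API_KEY")
    if qdrant_key:
        # Optional: only present when the Qdrant instance enforces API key auth.
        secrets["QDRANT_API_KEY"] = qdrant_key
    return secrets


def qdrant_headers(secrets: Mapping[str, str]) -> Dict[str, str]:
    """Build HTTP headers for Qdrant, adding `api-key` only when a key is set."""
    headers = {"Content-Type": "application/json"}
    if "QDRANT_API_KEY" in secrets:
        headers["api-key"] = secrets["QDRANT_API_KEY"]
    return headers
```

A local Qdrant without authentication therefore gets plain JSON headers, while a Qdrant Cloud setup gets the extra `api-key` header.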

How It Works

Qdrant Memory uses a hybrid retrieval strategy:

Load phase (when the agent starts processing):
- Retrieves the `recentWindow` most recent messages (chronological order)
- Embeds the current user message using the configured embedding model
- Searches Qdrant for the `topK` most semantically similar past messages
- Deduplicates and merges both sets, returning them as conversation context

Save phase (after the agent responds):
- Embeds the new user message and assistant response
- Stores them in Qdrant with metadata (session key, timestamp, role)

This means the agent always has immediate context (recent messages) plus relevant historical context (semantic matches), even if the relevant information is hundreds of messages old.
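
The load phase can be sketched in plain Python. This is an in-memory illustration under stated assumptions, not the plugin's implementation: `Message`, `toy_embed`, and `load` are hypothetical names, the bag-of-letters embedding stands in for the OpenAI embedding call, a list scan stands in for the Qdrant search, and returning the merged result in chronological order is an assumption.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Message:
    id: int
    role: str
    text: str
    timestamp: float
    session_key: str


def toy_embed(text: str) -> List[float]:
    """Stand-in for a real embedding model: counts of the letters a-z."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def load(
    history: Sequence[Message],
    query: str,
    session_key: str,
    top_k: int = 5,
    recent_window: int = 4,
    embed: Callable[[str], List[float]] = toy_embed,
) -> List[Message]:
    """Hybrid recall: recent window plus top-k semantic matches, deduplicated."""
    # Only consider messages stored under this session key.
    scoped = sorted(
        (m for m in history if m.session_key == session_key),
        key=lambda m: m.timestamp,
    )
    # 1. The most recent messages, for immediate context.
    recent = scoped[-recent_window:] if recent_window > 0 else []
    # 2. The messages most similar to the current user message.
    query_vec = embed(query)
    ranked = sorted(scoped, key=lambda m: cosine(query_vec, embed(m.text)), reverse=True)
    similar = ranked[:top_k]
    # 3. Merge both sets, dropping duplicates by message id.
    merged = {m.id: m for m in recent + similar}
    return sorted(merged.values(), key=lambda m: m.timestamp)
```

With `recent_window=2` and `top_k=1`, a query about billing pulls in the two latest messages plus the single best billing match, even when that match is the oldest message in the session.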

Example

Build a support agent with long-term memory:

Add an AI Agent node with a customer support system prompt
Add a Qdrant Memory node
Connect Qdrant Memory to the agent's Memory handle
Set qdrantUrl to your Qdrant Cloud instance URL
Set sessionKey to `customer:{{ webhook.body.customer_id }}`
Set topK to 5 and recentWindow to 4

The agent will remember recent messages AND recall relevant past interactions — even if the customer hasn't contacted you in weeks.
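
To see why the session key in this example isolates customers, here is a minimal sketch of how a binding such as `customer:{{ webhook.body.customer_id }}` might resolve against an incoming payload. The `render_binding` helper and its dotted-path lookup are assumptions for illustration; the platform's real binding engine may behave differently.

```python
import re
from typing import Any, Mapping

# Matches {{ dotted.path }} placeholders (hypothetical binding syntax handling).
_BINDING = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render_binding(template: str, context: Mapping[str, Any]) -> str:
    """Replace each {{ a.b.c }} placeholder with the value at that path in context."""

    def resolve(match: "re.Match[str]") -> str:
        value: Any = context
        for part in match.group(1).split("."):
            value = value[part]  # raises KeyError if the path is missing
        return str(value)

    return _BINDING.sub(resolve, template)


template = "customer:{{ webhook.body.customer_id }}"
key_a = render_binding(template, {"webhook": {"body": {"customer_id": "c_42"}}})
key_b = render_binding(template, {"webhook": {"body": {"customer_id": "c_77"}}})
```

Here `key_a` is `customer:c_42` and `key_b` is `customer:c_77`, so each customer's messages are stored and searched under a separate scope.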

Common Gotchas

Qdrant must be running: The plugin connects to Qdrant via HTTP. If the server is unreachable, the agent falls back to no memory (it won't crash, but it won't have context).
Embedding costs: Every load() and save() call makes embedding API requests. With text-embedding-3-small, this is very affordable (~$0.02 per million tokens).
Collection auto-creation: The collection is created automatically on first use with the correct vector dimensions for your chosen embedding model.
Session key required for multi-user: Without a session key, all users share the same memory space. Always set a session key in production.
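
Two of the gotchas above come down to simple arithmetic and lookups. The sketch below lists the default output dimensions of the three OpenAI models (the vector size an auto-created collection would need) and estimates embedding spend from the ~$0.02 per million tokens figure quoted above. The function names are hypothetical, and the token count per turn assumes the user message is embedded once on load and once on save, with no caching.

```python
# Default output dimensions of the supported OpenAI embedding models.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

# Price quoted in the text for text-embedding-3-small (USD per million tokens).
PRICE_PER_MILLION_TOKENS_USD = 0.02


def vector_size(model: str) -> int:
    """Vector dimension a collection needs for the given embedding model."""
    return EMBEDDING_DIMENSIONS[model]


def embedded_tokens_per_turn(user_tokens: int, assistant_tokens: int) -> int:
    """Tokens embedded in one turn: user message on load, then user + assistant on save."""
    return 2 * user_tokens + assistant_tokens


def estimate_cost_usd(
    turns: int,
    user_tokens: int,
    assistant_tokens: int,
    price_per_million: float = PRICE_PER_MILLION_TOKENS_USD,
) -> float:
    """Approximate embedding cost for a number of conversation turns."""
    total_tokens = turns * embedded_tokens_per_turn(user_tokens, assistant_tokens)
    return total_tokens * price_per_million / 1_000_000
```

Under these assumptions, 10,000 turns with 50-token user messages and 150-token replies embed 2.5 million tokens, which comes to roughly $0.05. Note that switching `embeddingModel` between models with different dimensions would require a collection with a matching vector size.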