Mem0: Universal Memory Layer for AI Agents
Open-source intelligent memory layer that gives AI agents persistent, personalized memory capabilities with multi-level context retention, semantic retrieval, and support for 20+ vector databases.
- Step 1
What is Mem0?
Mem0 (pronounced "mem-zero") is a universal memory layer for AI agents that enhances AI assistants with persistent, personalized memory capabilities. Unlike traditional context windows that forget everything between sessions, Mem0 enables AI systems to remember user preferences, adapt to individual needs, and continuously learn over time.
With 56,891+ stars on GitHub, Mem0 is one of the most widely adopted open-source solutions for adding intelligent memory to AI applications. On the LoCoMo benchmark it reports 26% higher accuracy than OpenAI's built-in memory, 91% faster responses than full-context approaches, and 90% lower token usage than maintaining full conversation history.
Mem0 is ideal for customer support chatbots, AI assistants, autonomous agents, healthcare applications, and any AI system that benefits from personalized, context-aware interactions across sessions.
- Step 2
Technology Stack & Architecture
Mem0 uses a hybrid graph, vector, and key-value store architecture to achieve superior performance. The system operates through an intelligent extraction and retrieval pipeline:
Core Components:
- LLM Processing: Defaults to GPT-4o-mini for fact extraction and memory updates. Supports OpenAI, Anthropic (Claude), HuggingFace, AWS Bedrock, Azure OpenAI, Gemini, Groq, Ollama, and any LiteLLM-compatible model
- Embedding Models: Default is text-embedding-3-small (1536 dimensions). Recommends Qwen 600M+ or GTE models for hybrid search scenarios
- Vector Storage: 20+ supported backends including Qdrant (default), Pinecone, ChromaDB, Weaviate, Milvus, pgvector, Supabase, Redis, Elasticsearch, MongoDB, and Faiss
- History Database: SQLite for local development, PostgreSQL for production
Memory Types:
- User Memory: Long-term preferences and facts about specific users
- Session Memory: Temporary context for individual conversations
- Agent Memory: System-level patterns and learned behaviors
Retrieval Strategy:
Mem0 employs multi-signal retrieval combining:
- Semantic similarity (vector search)
- BM25 keyword matching
- Entity matching and linking
- Temporal reasoning for time-aware retrieval
This hybrid approach outperforms any single retrieval signal and enables accurate fact recall even at scale.
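To make the idea of combining signals concrete, here is a minimal sketch of weighted score fusion. It is not Mem0's actual ranking code: the weights, the `rank_memories` helper, the word-overlap score (a crude stand-in for BM25), and the recency formula are all assumptions chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_overlap(query, text):
    """Fraction of query words found in the text (stand-in for BM25)."""
    q_words = set(query.lower().split())
    t_words = set(text.lower().split())
    return len(q_words & t_words) / len(q_words) if q_words else 0.0

def recency(age_days):
    """Map age to (0, 1]; newer memories score closer to 1."""
    return 1.0 / (1.0 + age_days)

def rank_memories(query, query_vec, candidates, weights=(0.6, 0.3, 0.1)):
    """Return (score, text) pairs sorted from most to least relevant."""
    w_sem, w_kw, w_rec = weights  # hypothetical weights, tune for your data
    scored = []
    for c in candidates:
        score = (
            w_sem * cosine(query_vec, c["vector"])
            + w_kw * keyword_overlap(query, c["text"])
            + w_rec * recency(c["age_days"])
        )
        scored.append((score, c["text"]))
    return sorted(scored, reverse=True)

candidates = [
    {"text": "Alice loves Python", "vector": [0.9, 0.1], "age_days": 2},
    {"text": "Alice prefers dark mode", "vector": [0.1, 0.9], "age_days": 1},
]
ranked = rank_memories("python preference", [1.0, 0.0], candidates)
print(ranked[0][1])
```

A memory that is slightly older can still rank first when its semantic and keyword signals are strong, which is the point of fusing several signals instead of relying on one.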
Mem0 Architecture:

┌─────────────────────────────────────────┐
│ User Interaction / Agent Input          │
└──────────────┬──────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────┐
│ LLM Extraction (GPT-4o-mini default)    │
│ • Extract atomic facts                  │
│ • Deduplicate against existing memory   │
└──────────────┬──────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────┐
│ Embedding (text-embedding-3-small)      │
│ • Convert facts to 1536-dim vectors     │
└──────────────┬──────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────┐
│ Vector Store (Qdrant/Pinecone/etc)      │
│ • Index by user_id, session_id, agent   │
│ • Store embeddings + metadata           │
└──────────────┬──────────────────────────┘
               │
       [Retrieval Phase]
               │
               ▼
┌─────────────────────────────────────────┐
│ Multi-Signal Retrieval                  │
│ • Semantic similarity search            │
│ • BM25 keyword matching                 │
│ • Entity matching                       │
│ • Temporal filtering                    │
└──────────────┬──────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────┐
│ Inject into LLM Context Window          │
│ • Top-k relevant memories               │
│ • Minimal token overhead                │
└─────────────────────────────────────────┘

- Step 3
Prerequisites
Before installing Mem0, ensure your system meets these requirements:
Required:
- Python 3.10 or higher - Mem0 requires modern Python features
- OpenAI API key - Default LLM and embedding provider (obtain from OpenAI Platform)
Optional (for self-hosted vector databases):
- Docker - For running Qdrant, Milvus, or other containerized vector stores
- PostgreSQL - For pgvector or production history storage
- Redis - For Redis-based vector storage
For production deployments:
- API keys for your chosen LLM provider (OpenAI, Anthropic, etc.)
- Vector database credentials (Pinecone, Weaviate, etc.)
- Sufficient storage - Memory scales with usage; plan for growth
Mem0 offers three deployment tiers:
- Library (pip/npm) - Local testing and prototyping
- Self-Hosted Server - Docker-based deployment with authentication
- Cloud Platform - Zero-ops managed service at app.mem0.ai
# Verify Python version (must be 3.10+)
python --version

# Set OpenAI API key (required for default configuration)
export OPENAI_API_KEY="your-openai-api-key"

# Optional: Verify Docker (if using self-hosted vector stores)
docker --version

# Optional: Check available memory and disk space
free -h
df -h

- Step 4
Installation - Python SDK
The Python SDK is the primary way to integrate Mem0 into your applications. Installation is straightforward:
# Basic installation
pip install mem0ai

# Install with vector store support (includes Chroma, Pinecone, etc.)
pip install "mem0ai[vector_stores]"

# Install with NLP enhancements for better extraction
pip install "mem0ai[nlp]"

# Install all optional dependencies
pip install "mem0ai[all]"

# Verify installation
python -c "from mem0 import Memory; print('Mem0 installed successfully')"

- Step 5
Quick Start - Default Configuration
Get started with Mem0 in under 5 minutes using the default configuration. This setup uses:
- GPT-4o-mini for fact extraction
- text-embedding-3-small for embeddings (1536 dimensions)
- Qdrant vector database (local storage at /tmp/qdrant)
- SQLite history database at ~/.mem0/history.db
The default configuration is perfect for development and prototyping. For production, you'll want to customize the vector store and LLM providers (covered in later steps).
from mem0 import Memory

# Initialize with the default configuration
m = Memory()

# Add memories from a conversation.
# Mem0 automatically extracts and stores atomic facts.
messages = [
    {"role": "user", "content": "Hi, I'm Alice. I'm a software engineer."},
    {"role": "assistant", "content": "Hello Alice! Nice to meet you."},
    {"role": "user", "content": "I love Python and prefer dark mode in my IDE."},
]

# Store memories with user_id for personalization.
# add() returns a dict; the extracted facts are listed under "results".
result = m.add(messages, user_id="alice")
print(f"Stored {len(result['results'])} memories for alice")

# Retrieve relevant memories
query = "What programming languages does Alice like?"
memories = m.search(query, user_id="alice")

# Display retrieved memories (search() also returns {"results": [...]})
for mem in memories["results"]:
    print(f"Memory: {mem['memory']}")
    print(f"Score: {mem['score']}")
    print(f"Created: {mem['created_at']}")
    print("---")

- Step 6
Custom Configuration - Production Setup
For production deployments, you'll want to customize Mem0 to use your preferred LLM provider, vector database, and embedding model. Mem0's configuration system is flexible and supports all major providers.
Configuration Structure:
The config dictionary has several main sections:
- llm - Language model for fact extraction and updates
- embedder - Embedding model for vector generation
- vector_store - Vector database for memory storage
- history_db - Database for operation history
Common Production Configurations:
- OpenAI + Pinecone - Fully managed, zero-ops
- Anthropic + Qdrant - Claude for extraction, self-hosted vectors
- Ollama + ChromaDB - Fully local, no external APIs
- Azure OpenAI + pgvector - Enterprise PostgreSQL setup
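One way to keep such a configuration out of source control is to assemble it from environment variables. The sketch below is illustrative only: `build_config` and the `MEM0_*` variable names are hypothetical, while the dictionary keys mirror the ones used in this guide's examples. It also keeps the embedder and vector store dimensions in sync by deriving both from a single value.

```python
import os

def build_config(env=None):
    """Assemble a Mem0-style config dict from environment variables."""
    env = os.environ if env is None else env
    # This guide exports OPENAI_API_KEY for the default provider; fail early if absent.
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    # MEM0_EMBEDDING_DIMS and MEM0_LLM_MODEL are hypothetical variable names.
    dims = int(env.get("MEM0_EMBEDDING_DIMS", "1536"))
    return {
        "llm": {
            "provider": "openai",
            "config": {"model": env.get("MEM0_LLM_MODEL", "gpt-4o-mini")},
        },
        "embedder": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small", "embedding_dims": dims},
        },
        "vector_store": {
            "provider": "qdrant",
            "config": {
                "host": env.get("QDRANT_HOST", "localhost"),
                "port": int(env.get("QDRANT_PORT", "6333")),
                "embedding_model_dims": dims,
            },
        },
    }

# Passing a plain dict instead of os.environ makes the helper easy to test.
config = build_config({"OPENAI_API_KEY": "placeholder", "QDRANT_HOST": "qdrant"})
print(config["vector_store"]["config"])
```

The resulting dictionary has the same shape as the hand-written examples, so it could be passed to `Memory.from_config(config)` in the same way.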
from mem0 import Memory

# Example 1: Custom OpenAI + Qdrant (cloud)
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1,
            "max_tokens": 2000,
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "embedding_dims": 1536
        }
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "host": "localhost",
            "port": 6333,
            "collection_name": "memories",
            "embedding_model_dims": 1536,
            "distance": "cosine"
        }
    }
}
m = Memory.from_config(config)

# Example 2: Anthropic Claude + Pinecone
config_claude = {
    "llm": {
        "provider": "anthropic",
        "config": {
            "model": "claude-sonnet-4-6",
            "temperature": 0.0,
            "max_tokens": 4000,
        }
    },
    "vector_store": {
        "provider": "pinecone",
        "config": {
            "api_key": "your-pinecone-api-key",
            "index_name": "mem0-memories",
            "environment": "us-west1-gcp"
        }
    }
}

# Example 3: Fully local with Ollama + ChromaDB
config_local = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.2",
            "ollama_base_url": "http://localhost:11434"
        }
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        }
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "memories",
            "path": "./chroma_db"
        }
    }
}

⚠ Heads up: Never commit API keys to version control. Use environment variables or a secrets manager. For production, enable authentication and encrypt data at rest.

- Step 7
Setting Up Vector Databases
Mem0 supports 20+ vector databases. Here's how to set up the most popular options:
Qdrant (Default - Recommended for Self-Hosted):
Qdrant is the default and offers excellent performance with Docker deployment.
Pinecone (Managed Cloud Service):
Zero-ops vector database with automatic scaling.
pgvector (PostgreSQL Extension):
Ideal if you're already using PostgreSQL.
ChromaDB (Simple Local Option):
Perfect for development and small-scale deployments.
Weaviate, Milvus, Redis, Supabase:
All supported with similar configuration patterns. Check the official Mem0 docs for specific configuration examples.
# Qdrant - Docker setup
docker run -d -p 6333:6333 -p 6334:6334 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant

# Verify Qdrant is running
curl http://localhost:6333/

# pgvector - Enable in PostgreSQL
# First, install the extension (Ubuntu/Debian):
sudo apt-get install postgresql-16-pgvector
# Then in PostgreSQL:
# CREATE EXTENSION vector;

# ChromaDB - No setup needed, runs in-process
# Just install:
pip install chromadb

# Pinecone - Cloud service, no local setup
# Sign up at https://www.pinecone.io/
# Get API key and environment name from dashboard

- Step 8
Core Operations - Add, Search, Update, Delete
Mem0 provides four core operations for managing memories:
1. Add Memories Extract and store facts from conversations or raw text.
2. Search Memories Retrieve relevant memories using semantic search with multi-signal ranking.
3. Update Memories Modify existing memories when information changes.
4. Delete Memories Remove memories by ID or clear all memories for a user.
Memory Levels:
You can scope memories to different levels:
- user_id - Long-term user preferences and facts
- agent_id - Agent-specific learned behaviors
- run_id or session_id - Temporary session context
Combine multiple IDs for hierarchical memory structures.
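Combining identifiers is easier to keep consistent if the scope is built in one place. The helper below is a hypothetical convenience function, not part of Mem0; it only assembles keyword arguments and assumes the `user_id`, `agent_id`, and `run_id` names listed above.

```python
def build_scope(user_id=None, agent_id=None, run_id=None):
    """Return only the identifiers that were provided, as keyword arguments."""
    candidates = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    scope = {name: value for name, value in candidates.items() if value is not None}
    if not scope:
        # An unscoped memory cannot be attributed to anyone, so reject it early.
        raise ValueError("provide at least one of user_id, agent_id, run_id")
    return scope

# User-level scope only
print(build_scope(user_id="bob"))

# User + session scope for temporary context
session_scope = build_scope(user_id="bob", run_id="meeting-2026-05-27")
print(session_scope)
```

The returned dictionary can then be unpacked into a call, for example `m.add(text, **session_scope)` and later `m.search(query, **session_scope)`, so that writes and reads always use the same scope.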
from mem0 import Memory

m = Memory()

# 1. ADD - Store memories from messages
messages = [
    {"role": "user", "content": "My name is Bob and I work at Acme Corp."},
    {"role": "user", "content": "I prefer email updates over phone calls."}
]
result = m.add(messages, user_id="bob")
print(f"Added memories: {result}")

# ADD - Store from raw text
m.add("Bob's favorite color is blue.", user_id="bob")

# 2. SEARCH - Retrieve relevant memories
# search() returns a dict; the hits are listed under "results"
memories = m.search(
    query="What is Bob's communication preference?",
    user_id="bob",
    limit=5  # Top 5 results
)["results"]
for mem in memories:
    print(f"{mem['memory']} (score: {mem['score']})")

# GET - Retrieve all memories for a user
all_memories = m.get_all(user_id="bob")["results"]
print(f"Total memories for Bob: {len(all_memories)}")

# MULTI-LEVEL - Combine user and session context
# (run_id scopes a memory to a single session)
m.add(
    "Currently discussing project budget",
    user_id="bob",
    run_id="meeting-2026-05-27"
)

# Search with multiple filters
session_memories = m.search(
    query="project discussion",
    user_id="bob",
    run_id="meeting-2026-05-27"
)["results"]

if memories:
    # 3. UPDATE - Modify existing memory
    memory_id = memories[0]["id"]
    m.update(
        memory_id=memory_id,
        data="Bob now prefers Slack messages over email."
    )

    # 4. DELETE - Remove specific memory
    m.delete(memory_id=memory_id)

# DELETE - Clear all memories for a user
m.delete_all(user_id="bob")

- Step 9
CLI Tool - Command Line Interface
Mem0 provides a CLI for quick memory operations without writing Python code. This is useful for testing, debugging, and simple memory management tasks.
The CLI supports all core operations and can be integrated into shell scripts or automation workflows.
# Install CLI
pip install mem0-cli

# Initialize configuration (creates ~/.mem0/config.yaml)
mem0 init

# Add a memory
mem0 add "Alice prefers Python over JavaScript" --user-id alice

# Search memories
mem0 search "programming preferences" --user-id alice

# List all memories for a user
mem0 list --user-id alice

# Delete a specific memory
mem0 delete <memory-id>

# Clear all memories for a user
mem0 clear --user-id alice

# View CLI help
mem0 --help
mem0 add --help

- Step 10
Integration with AI Frameworks
Mem0 integrates seamlessly with popular AI frameworks and platforms:
LangGraph Integration:
Use Mem0 as a memory layer for LangGraph customer support bots and autonomous agents.
CrewAI Integration:
Give CrewAI agents persistent memory across tasks and conversations.
Vercel AI SDK:
Add memory to Next.js AI applications.
ChatGPT Integration:
Mem0 provides a browser extension for Chrome that stores memories across ChatGPT, Claude, and Perplexity conversations.
AI Coding Assistants:
Integrate with Claude Code, Cursor, Windsurf, and other coding assistants for project-specific context retention.
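Whatever the framework, the integration pattern is the same: retrieve memories, put them into the prompt, then store the new exchange. The sketch below shows that loop without depending on any framework. `chat_turn` and `FakeMemory` are hypothetical names; `FakeMemory` is an in-memory stand-in that only mimics the `search()` and `add()` calls used in this guide, and the result shape (`{"results": [...]}`) is an assumption about the memory object being passed in.

```python
def chat_turn(memory, generate, user_id, message, limit=3):
    """One conversational turn: retrieve, prompt, generate, store."""
    found = memory.search(message, user_id=user_id, limit=limit)
    # Accept either a {"results": [...]} dict or a plain list of hits.
    hits = found["results"] if isinstance(found, dict) else found
    context = "\n".join(f"- {hit['memory']}" for hit in hits)
    prompt = f"Known facts about the user:\n{context}\n\nUser: {message}"
    reply = generate(prompt)
    memory.add(
        [
            {"role": "user", "content": message},
            {"role": "assistant", "content": reply},
        ],
        user_id=user_id,
    )
    return reply

class FakeMemory:
    """In-memory stand-in for a memory layer (illustration only)."""
    def __init__(self):
        self.items = []

    def search(self, query, user_id, limit=3):
        texts = [text for uid, text in self.items if uid == user_id]
        return {"results": [{"memory": text} for text in texts[:limit]]}

    def add(self, messages, user_id):
        # Keep only what the user said; a real layer would extract facts.
        for msg in messages:
            if msg["role"] == "user":
                self.items.append((user_id, msg["content"]))

prompts = []

def echo_llm(prompt):
    """Stand-in for an LLM call; records the prompt it receives."""
    prompts.append(prompt)
    return "Noted."

fake = FakeMemory()
chat_turn(fake, echo_llm, "alice", "I like tea")
chat_turn(fake, echo_llm, "alice", "What do I drink?")
print(prompts[1])
```

Because `chat_turn` only relies on `search()` and `add()`, the same function can be called from a LangGraph node, a CrewAI tool, or a plain request handler.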
# LangGraph Integration Example
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph
from mem0 import Memory

memory = Memory()
llm = ChatOpenAI(model="gpt-4o-mini")  # any LangChain chat model works here

def chatbot_node(state):
    """Graph node: retrieve memories, call the LLM, store the exchange."""
    user_id = state["user_id"]
    message = state["message"]

    # Retrieve relevant memories (hits are listed under "results")
    memories = memory.search(message, user_id=user_id, limit=3)
    context = "\n".join(m["memory"] for m in memories["results"])

    # Add context to LLM prompt
    response = llm.invoke(f"Context: {context}\n\nUser: {message}")

    # Store new interaction; invoke() returns a message object, so keep its text
    memory.add([
        {"role": "user", "content": message},
        {"role": "assistant", "content": response.content}
    ], user_id=user_id)

    return {"response": response.content}

# Register the node on your StateGraph, e.g. graph.add_node("chatbot", chatbot_node)

# CrewAI Integration Example
# The memory= and user_id= parameters below follow the original example;
# check them against the CrewAI version you have installed.
from crewai import Agent, Task, Crew
from mem0 import Memory

memory = Memory()

researcher = Agent(
    role="Researcher",
    goal="Research topics thoroughly",
    backstory="Expert researcher with access to memory",
    memory=memory  # Attach Mem0 to agent
)

task = Task(
    description="Research AI agent memory systems",
    agent=researcher,
    user_id="research-session-001"  # Memory scoping
)

crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()

- Step 11
Memory Management Best Practices
To get the most out of Mem0 in production, follow these best practices:
1. Memory Scoping Strategy:
- Use user_id for long-term user preferences and facts
- Use session_id for temporary conversation context
- Use agent_id for system-level learned patterns
- Combine IDs for hierarchical organization
2. Fact Granularity:
- Let Mem0's LLM extract atomic facts automatically
- One fact per memory yields better retrieval than compound statements
- Example: "Alice likes Python" + "Alice works at Acme" is better than "Alice likes Python and works at Acme"
3. Retrieval Optimization:
- Use specific queries for better semantic matching
- Adjust the limit parameter based on context window size
- Consider reranking for critical applications
4. Performance & Scalability:
- Enable asynchronous writes for non-blocking operations
- Use metadata filtering to scope searches
- Monitor token usage and adjust extraction prompts if needed
- Plan for ~6,000-7,000 tokens per operation in production
5. Privacy & Compliance:
- Implement user consent before storing memories
- Provide clear deletion mechanisms
- Use metadata to track retention policies
- Consider encryption for sensitive data
6. Memory Lifecycle:
- Regularly audit memory quality and relevance
- Implement staleness detection for time-sensitive facts
- Archive old sessions to prevent context pollution
- Version memories when facts change (e.g., job changes)
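The advice about matching the number of retrieved memories to the context window can be sketched as a simple budget filter. This is an illustration under stated assumptions, not a Mem0 feature: `fit_to_budget` is a hypothetical helper, token counts are approximated from word counts with an assumed ratio of 1.3 tokens per word, and a higher `score` is assumed to mean a more relevant memory.

```python
def fit_to_budget(memories, max_tokens, tokens_per_word=1.3):
    """Keep the highest-scoring memories whose estimated size fits the budget."""
    chosen, used = [], 0
    for mem in sorted(memories, key=lambda m: m["score"], reverse=True):
        # Rough token estimate; swap in a real tokenizer for accurate counts.
        cost = int(len(mem["memory"].split()) * tokens_per_word) + 1
        if used + cost > max_tokens:
            continue  # skip this one, a shorter memory may still fit
        chosen.append(mem)
        used += cost
    return chosen

retrieved = [
    {"memory": "Alice likes Python", "score": 0.9},
    {"memory": "Alice works at Acme Corp as a senior engineer", "score": 0.8},
    {"memory": "Prefers dark mode", "score": 0.5},
]
selected = fit_to_budget(retrieved, max_tokens=10)
for mem in selected:
    print(mem["memory"])
```

With a tight budget the long second memory is skipped while the two short ones are kept, which also illustrates why atomic facts tend to pack into a prompt more efficiently than compound statements.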
# Hierarchical memory scoping
import datetime
from mem0 import Memory

memory = Memory()

# Long-term user facts
memory.add(
    "Alice is a senior engineer at Acme Corp",
    user_id="alice",
    metadata={"category": "profile", "retention_days": 365}
)

# Session-specific context (run_id scopes the memory to one session)
memory.add(
    "Currently debugging authentication issue",
    user_id="alice",
    run_id="support-ticket-1234",
    metadata={"category": "session", "retention_days": 30}
)

# Agent-level patterns
memory.add(
    "Users from this domain prefer technical explanations",
    agent_id="support-bot-v2",
    metadata={"category": "pattern", "learned_from": "alice"}
)

# Scoped retrieval with metadata filtering
memories = memory.search(
    query="authentication",
    user_id="alice",
    run_id="support-ticket-1234",
    filters={"category": "session"}  # Only session memories
)

# Memory cleanup based on metadata
def cleanup_old_sessions(user_id):
    """Delete session memories that have outlived their retention period."""
    for mem in memory.get_all(user_id=user_id)["results"]:
        metadata = mem.get("metadata") or {}
        if metadata.get("category") != "session":
            continue
        created = datetime.datetime.fromisoformat(mem["created_at"])
        # Use the timestamp's own timezone so aware and naive datetimes are never mixed
        age_days = (datetime.datetime.now(created.tzinfo) - created).days
        if age_days > metadata.get("retention_days", 30):
            memory.delete(mem["id"])
            print(f"Deleted expired session memory: {mem['id']}")

cleanup_old_sessions("alice")
- Step 12
Self-Hosted Deployment with Docker
For production workloads, deploy Mem0 as a self-hosted server with authentication and persistent storage. This setup includes:
- Docker containers for Mem0 server and vector database
- Authentication enabled by default
- Persistent volumes for data retention
- API access via HTTP endpoints
The self-hosted server provides the same API as the cloud platform but runs entirely on your infrastructure.
# docker-compose.yml for Mem0 self-hosted deployment
version: '3.8'

services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - qdrant_storage:/qdrant/storage
    restart: unless-stopped

  mem0-server:
    image: mem0ai/mem0:latest
    ports:
      - "8000:8000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - QDRANT_HOST=qdrant
      - QDRANT_PORT=6333
      - AUTH_ENABLED=true
      - SECRET_KEY=${SECRET_KEY}  # Generate with: openssl rand -base64 32
    depends_on:
      - qdrant
    restart: unless-stopped

  postgres:
    image: postgres:16-alpine
    environment:
      - POSTGRES_USER=mem0
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=mem0_history
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped

volumes:
  qdrant_storage:
  postgres_data:

- Step 13
Monitoring & Performance Optimization
Monitor your Mem0 deployment to ensure optimal performance and catch issues early:
Key Metrics to Track:
- Token Usage: ~6,000-7,000 tokens per add operation (LLM extraction + deduplication)
- Latency: p50 should be 0.88-1.09s for retrieval
- Memory Count: Track growth rate and storage usage
- Retrieval Quality: Monitor relevance scores and user feedback
- API Errors: Track failed operations and retry patterns
Performance Optimization:
- Reduce Token Usage: Use smaller LLMs for extraction (e.g., GPT-4o-mini vs GPT-4)
- Improve Latency: Enable caching, use local embeddings (FastEmbed), implement async writes
- Scale Vector Search: Use managed vector DBs (Pinecone, Weaviate Cloud) or scale Qdrant clusters
- Memory Quality: Regularly audit extracted facts, tune extraction prompts
Benchmarks (v3.0 Algorithm):
- LoCoMo benchmark: 91.6 score
- LongMemEval: 94.8 score
- Temporal reasoning improvement: +29.6 points
- Multi-hop task improvement: +23.1 points
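The per-operation token figure listed under Key Metrics can be turned into a rough daily budget. The sketch below is simple arithmetic: `estimate_daily_cost` is a hypothetical helper, and the price per million tokens is a placeholder assumption that should be replaced with your provider's current rate.

```python
def estimate_daily_cost(adds_per_day, tokens_per_add=(6000, 7000),
                        usd_per_million_tokens=0.15):
    """Estimate daily token volume and cost for add operations.

    usd_per_million_tokens is a placeholder price, not a quoted rate.
    """
    low, high = tokens_per_add
    tokens_low = adds_per_day * low
    tokens_high = adds_per_day * high
    return {
        "tokens_low": tokens_low,
        "tokens_high": tokens_high,
        "usd_low": tokens_low / 1_000_000 * usd_per_million_tokens,
        "usd_high": tokens_high / 1_000_000 * usd_per_million_tokens,
    }

estimate = estimate_daily_cost(adds_per_day=10_000)
print(f"{estimate['tokens_low']:,} to {estimate['tokens_high']:,} tokens per day")
print(f"${estimate['usd_low']:.2f} to ${estimate['usd_high']:.2f} per day")
```

At 10,000 add operations per day, the 6,000 to 7,000 token range works out to 60 to 70 million tokens daily, which is why extraction model choice dominates the cost of a busy deployment.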
# Performance monitoring example
import time
import logging
from mem0 import Memory

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

memory = Memory()

def monitor_operation(operation_name, func, *args, **kwargs):
    """Wrapper to monitor Mem0 operations"""
    start = time.time()
    try:
        result = func(*args, **kwargs)
        duration = time.time() - start
        logger.info(f"{operation_name} completed in {duration:.2f}s")
        # Log to metrics system (Prometheus, DataDog, etc.)
        # metrics.record_latency(operation_name, duration)
        return result
    except Exception as e:
        duration = time.time() - start
        logger.error(f"{operation_name} failed after {duration:.2f}s: {e}")
        # metrics.record_error(operation_name, str(e))
        raise

# Monitor add operation
messages = [{"role": "user", "content": "I prefer dark mode"}]
result = monitor_operation(
    "memory_add", memory.add, messages, user_id="alice"
)

# Monitor search operation (hits are listed under "results")
memories = monitor_operation(
    "memory_search", memory.search, "preferences", user_id="alice"
)["results"]

# Analyze retrieval quality
def analyze_retrieval_quality(memories, threshold=0.7):
    """Check if retrieved memories meet quality threshold"""
    if not memories:
        logger.warning("No memories retrieved")
        return False
    avg_score = sum(m["score"] for m in memories) / len(memories)
    logger.info(f"Average relevance score: {avg_score:.3f}")
    if avg_score < threshold:
        logger.warning(f"Low relevance score: {avg_score:.3f}")
        return False
    return True

analyze_retrieval_quality(memories)

- Step 14
Troubleshooting Common Issues
Here are solutions to common problems when working with Mem0:
Issue 1: "No memories retrieved" / Low recall
- Check that memories were added with the correct user_id
- Verify vector database is running and accessible
- Try broader search queries
- Check embedding dimensions match between add and search
Issue 2: High latency / Slow responses
- Enable async writes if blocking on add operations
- Use local embedding models (FastEmbed) instead of API calls
- Reduce the limit parameter in search
- Check vector database performance and indexing
Issue 3: Memory staleness / Outdated facts
- Implement regular memory audits
- Use timestamps to filter recent memories
- Update changed facts explicitly with the update() method
- Consider archiving old memories to separate storage
Issue 4: High token usage / API costs
- Use smaller LLMs for extraction (GPT-4o-mini, Claude Haiku)
- Implement client-side deduplication before adding
- Batch add operations when possible
- Consider local LLMs (Ollama) for non-critical extractions
Issue 5: Vector database connection errors
- Verify database is running: docker ps or service status
- Check firewall rules and port accessibility
- Validate credentials and connection strings
- Review database logs for specific errors
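The dimension mismatch mentioned under Issue 1 can be caught before any memory is written. The check below is a hypothetical helper, not part of Mem0; it assumes the config layout used earlier in this guide, with `embedding_dims` on the embedder and `embedding_model_dims` on the vector store.

```python
def check_embedding_dims(config):
    """Return a list of problems found in the embedding dimension settings."""
    embedder_dims = config.get("embedder", {}).get("config", {}).get("embedding_dims")
    store_dims = (
        config.get("vector_store", {}).get("config", {}).get("embedding_model_dims")
    )
    problems = []
    if embedder_dims is None or store_dims is None:
        problems.append("dimensions are not set explicitly on both sides")
    elif embedder_dims != store_dims:
        problems.append(
            f"embedder produces {embedder_dims}-dim vectors "
            f"but the vector store expects {store_dims}"
        )
    return problems

good = {
    "embedder": {"config": {"embedding_dims": 1536}},
    "vector_store": {"config": {"embedding_model_dims": 1536}},
}
bad = {
    "embedder": {"config": {"embedding_dims": 768}},
    "vector_store": {"config": {"embedding_model_dims": 1536}},
}
print(check_embedding_dims(good))
print(check_embedding_dims(bad))
```

Running a check like this at startup is cheaper than discovering the mismatch later as empty or failing searches, which typically happens when the embedding model is swapped but an existing collection is reused.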
# Troubleshooting commands

# Check Qdrant is running and accessible
curl http://localhost:6333/
curl http://localhost:6333/collections

# View Qdrant logs
docker logs <qdrant-container-id>

# Check Mem0 local storage
ls -lh /tmp/qdrant/
ls -lh ~/.mem0/

# Test Mem0 installation
python -c "from mem0 import Memory; m = Memory(); print('OK')"

# Debug Python dependencies
pip show mem0ai
pip list | grep -E '(mem0|qdrant|openai|chromadb)'

# Clear local Qdrant data (reset)
rm -rf /tmp/qdrant/

# Clear Mem0 history database
rm ~/.mem0/history.db

# Enable debug logging (Python, not shell): add these lines at the top of your script
#   import logging
#   logging.basicConfig(level=logging.DEBUG)

# Check OpenAI API key (prints only the first 20 characters)
echo $OPENAI_API_KEY | head -c 20
curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"

⚠ Heads up: Clearing vector database storage will delete all memories permanently. Always back up production data before running destructive operations.
- Step 15
Migration from OpenAI Memory or Other Solutions
If you're migrating from OpenAI's built-in memory, LangChain memory, or other solutions, here's how to transition to Mem0:
From OpenAI Memory:
OpenAI's memory API stores unstructured conversation context. Mem0 offers:
- 26% better accuracy on memory benchmarks
- 90% lower token usage
- Multi-signal retrieval (semantic + keyword + entity)
- Self-hosted options for data control
Migration Steps:
- Export existing memories (if possible)
- Reformat to Mem0 conversation format
- Bulk import using the add() method
- Verify retrieval quality
- Update application code to use Mem0 API
From LangChain Memory:
LangChain's memory is tied to chain/agent execution. Mem0 provides:
- Standalone memory layer (framework-agnostic)
- Better scaling for multi-user applications
- Advanced retrieval with hybrid search
Mem0 can be used alongside LangChain or as a replacement.
# Migration example: Bulk import from existing system
import json
from mem0 import Memory

memory = Memory()

# Example: Import from JSON export
with open("existing_memories.json", "r") as f:
    data = json.load(f)

# Bulk import with progress tracking
for user_data in data["users"]:
    user_id = user_data["id"]
    for conv in user_data["conversations"]:
        # Convert to Mem0 message format
        messages = [
            {"role": msg["role"], "content": msg["content"]}
            for msg in conv["messages"]
        ]
        try:
            result = memory.add(
                messages,
                user_id=user_id,
                metadata={
                    "imported_from": "openai-memory",
                    "original_timestamp": conv["timestamp"]
                }
            )
            print(f"Imported {len(result['results'])} memories for user {user_id}")
        except Exception as e:
            print(f"Error importing for {user_id}: {e}")

# Verify migration and test retrieval, one user at a time
# (reads are scoped by user_id, like the writes above)
test_query = "What are user preferences?"
for user_data in data["users"]:
    user_id = user_data["id"]
    total = len(memory.get_all(user_id=user_id)["results"])
    hits = memory.search(test_query, user_id=user_id, limit=10)["results"]
    print(f"{user_id}: {total} memories, retrieval test returned {len(hits)} results")

- Step 16
Advanced Use Cases
Mem0 enables sophisticated AI applications with persistent memory:
1. Multi-Agent Collaboration: Share memories across agents for coordinated task execution.
2. Personalized Customer Support: Remember customer history, preferences, and past issues across all interactions.
3. Healthcare Patient Tracking: Maintain long-term patient preferences, medical history, and care plans with HIPAA-compliant deployment.
4. Educational Adaptive Learning: Track student progress, learning style, and knowledge gaps to personalize instruction.
5. Smart Home Assistants: Learn household routines, preferences, and patterns over time.
6. Code Review Assistants: Remember project coding standards, past discussions, and team preferences.
7. Research Assistants: Accumulate domain knowledge across research sessions and papers.
The key is combining user-level, session-level, and agent-level memories to create rich, context-aware experiences.
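When several scopes are queried for the same request, the results usually need to be merged before they reach the prompt. The sketch below shows one way to do that; `merge_scoped_results` is a hypothetical helper, and it assumes each hit is a dict with `id`, `memory`, and `score` keys where a higher score is better.

```python
def merge_scoped_results(**scoped_hits):
    """Merge hits from several scopes, keeping the best-scoring copy of each id."""
    best = {}
    for scope, hits in scoped_hits.items():
        for hit in hits:
            current = best.get(hit["id"])
            if current is None or hit["score"] > current["score"]:
                # Tag the hit with the scope it came from for later inspection.
                best[hit["id"]] = {**hit, "scope": scope}
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)

customer_hits = [
    {"id": "m1", "memory": "Customer rotated API keys last week", "score": 0.82},
    {"id": "m2", "memory": "Customer prefers email follow-ups", "score": 0.40},
]
agent_hits = [
    {"id": "m3", "memory": "Ask for the key prefix before debugging", "score": 0.75},
    {"id": "m1", "memory": "Customer rotated API keys last week", "score": 0.60},
]
merged = merge_scoped_results(customer=customer_hits, agent=agent_hits)
for hit in merged:
    print(hit["scope"], hit["memory"])
```

Deduplicating by id keeps a memory that matches in more than one scope from appearing twice in the prompt, and the `scope` tag makes it possible to label customer history and agent knowledge separately.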
# Advanced use case: Multi-agent customer support system
from mem0 import Memory

class CustomerSupportSystem:
    def __init__(self):
        self.memory = Memory()

    def handle_customer_query(self, customer_id, query, agent_id):
        """
        Multi-level memory retrieval:
        - Customer history (user_id)
        - Current session context (session_id)
        - Agent best practices (agent_id)
        """
        # Retrieve customer history (hits are listed under "results")
        customer_memories = self.memory.search(
            query, user_id=customer_id, limit=5
        )["results"]

        # Retrieve agent-level best practices
        agent_memories = self.memory.search(
            query, agent_id=agent_id, limit=3
        )["results"]

        # Combine context
        context = {
            "customer_history": [m["memory"] for m in customer_memories],
            "agent_knowledge": [m["memory"] for m in agent_memories]
        }

        # Generate response with full context
        response = self.generate_response(query, context)

        # Store new interaction
        self.memory.add(
            [
                {"role": "user", "content": query},
                {"role": "assistant", "content": response}
            ],
            user_id=customer_id,
            metadata={
                "agent_id": agent_id,
                "resolved": False,
                "category": "support"
            }
        )
        return response

    def learn_from_interaction(self, customer_id, agent_id, outcome):
        """Store learned patterns for future use"""
        if outcome == "successful":
            self.memory.add(
                "This approach worked well for customer issues",
                agent_id=agent_id,
                metadata={"learned_from": customer_id, "outcome": outcome}
            )

    def generate_response(self, query, context):
        # Placeholder: replace with a real LLM call that uses the context
        return f"(placeholder reply to: {query})"

# Usage
support = CustomerSupportSystem()
response = support.handle_customer_query(
    customer_id="customer-123",
    query="My API key isn't working",
    agent_id="support-bot-v2"
)
support.learn_from_interaction(
    customer_id="customer-123",
    agent_id="support-bot-v2",
    outcome="successful"
)