# Memory Box API Documentation

> Memory Box is a universal memory layer for AI applications. It provides persistent, semantic memory for any LLM without relying on provider-native features. You own the memory, control the context, and hydrate any model on demand.

Key concepts:

- **Provider-agnostic**: Works with OpenAI, Anthropic, Google, or any LLM. Switch models while keeping your agent's memories.
- **Semantic search**: Vector, hybrid, and chronological retrieval modes for different use cases.
- **Multi-tenant**: Complete namespace isolation per agent. Each agent has its own memory space.
- **Sub-50ms latency**: Global edge infrastructure with Anycast routing across 10+ regions.
- **Embedding included**: Qwen3-Embedding-8B with 1024 dimensions; you don't manage embeddings.

Base URL: `https://memory-box-api.fly.dev`

---

# Authentication

Memory Box uses a two-layer authentication model:

- **Authorization**: Bearer token for API access. Get your key from the dashboard.
- **X-Agent-ID**: Identifies the agent namespace for memory isolation.

```bash
curl -X GET https://memory-box-api.fly.dev/api/v1/memories \
  -H "Authorization: Bearer mb_live_..." \
  -H "X-Agent-ID: support-agent"
```

---

# Multi-Tenant Mode

Memory Box supports multi-tenant deployments where each tenant has isolated data and API keys:

- API keys are prefixed with the tenant slug: `mbx_<tenant-slug>_<32-char-secret>`
- Each tenant's memories and agents are completely isolated
- Admin API is available for tenant management

```bash
# Multi-tenant API key format
mbx_acme_a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
```

---

# Rate Limiting

Rate limits are applied per API key:

| Endpoint | Rate Limit |
|----------|------------|
| `/api/*` | 100 requests/second |
| `/admin/*` | 10 requests/minute |
| `/health`, `/ready`, `/docs` | Unlimited |

When rate limited, you'll receive a `429 Too Many Requests` response with a `Retry-After` header.

---

# Memory Endpoints

## POST /api/v1/memories - Store a Memory

Store a new memory for an agent.
The memory will be automatically embedded and indexed for semantic search.

**Request Body:**

```json
{
  "text": "User prefers dark mode and metric units",
  "metadata": {
    "model": "gpt-4o",
    "session_id": "sess_abc123"
  }
}
```

**Response:**

```json
{
  "id": "mem_7f3b2a1c",
  "agent_id": "support-agent",
  "text": "User prefers dark mode and metric units",
  "created_at": "2024-01-15T10:30:00Z",
  "tokens_used": 12,
  "metadata": {
    "model": "gpt-4o",
    "session_id": "sess_abc123"
  }
}
```

**cURL Example:**

```bash
curl -X POST https://memory-box-api.fly.dev/api/v1/memories \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY" \
  -H "X-Agent-ID: support-agent" \
  -H "Content-Type: application/json" \
  -d '{"text": "User prefers dark mode"}'
```

---

## GET /api/v1/memories - Search Memories

Search memories using vector similarity, chronological ordering, or hybrid mode. Returns memories ranked by relevance.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `q` | string | No | Search query text. Required for vector and hybrid modes |
| `mode` | string | No | Search mode: "vector" (default), "chronological", or "hybrid" |
| `limit` | integer | No | Maximum number of results (1-100, default: 10) |

**Response:**

```json
{
  "memories": [
    {
      "id": "mem_7f3b2a1c",
      "text": "User prefers dark mode and metric units",
      "similarity": 0.92,
      "created_at": "2024-01-15T10:30:00Z"
    }
  ],
  "count": 1
}
```

**cURL Example:**

```bash
curl -X GET "https://memory-box-api.fly.dev/api/v1/memories?q=user+preferences&mode=vector" \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY" \
  -H "X-Agent-ID: support-agent"
```

---

## GET /api/v1/memories/{id} - Get Memory by ID

Retrieve a specific memory by its unique identifier.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The memory ID |

**Response:**

```json
{
  "id": "mem_7f3b2a1c",
  "agent_id": "support-agent",
  "text": "User prefers dark mode and metric units",
  "created_at": "2024-01-15T10:30:00Z",
  "updated_at": "2024-01-15T10:30:00Z",
  "metadata": {}
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/api/v1/memories/mem_7f3b2a1c \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY" \
  -H "X-Agent-ID: support-agent"
```

---

## DELETE /api/v1/memories/{id} - Delete Memory

Permanently delete a memory. Returns 204 No Content on success. This action cannot be undone.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The memory ID to delete |

**cURL Example:**

```bash
curl -X DELETE https://memory-box-api.fly.dev/api/v1/memories/mem_7f3b2a1c \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY" \
  -H "X-Agent-ID: support-agent"
```

---

## GET /api/v1/memories/{id}/related - Get Related Memories

Find memories semantically related to a specific memory. Useful for building context chains.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The source memory ID |
| `min_similarity` | number | No | Minimum similarity threshold (0-1, default: 0.7) |

**Response:**

```json
{
  "source_id": "mem_7f3b2a1c",
  "related": [
    {
      "id": "mem_8e4c3b2d",
      "text": "User enabled high contrast mode",
      "similarity": 0.87,
      "created_at": "2024-01-14T09:15:00Z"
    }
  ]
}
```

**cURL Example:**

```bash
curl -X GET "https://memory-box-api.fly.dev/api/v1/memories/mem_7f3b2a1c/related?min_similarity=0.8" \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY" \
  -H "X-Agent-ID: support-agent"
```

---

# Agent Endpoints

## POST /api/v1/agents/{agent_id}/activate - Activate Agent

Create and activate a new agent namespace.
Call this before storing memories for a new agent.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `agent_id` | string | Yes | Unique agent identifier |

**Response:**

```json
{
  "agent_id": "support-agent",
  "status": "active",
  "created_at": "2024-01-15T10:00:00Z"
}
```

**cURL Example:**

```bash
curl -X POST https://memory-box-api.fly.dev/api/v1/agents/support-agent/activate \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY"
```

---

## GET /api/v1/agents/{agent_id}/stats - Get Agent Stats

Retrieve statistics for an agent including memory count, token usage, and timing information.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `agent_id` | string | Yes | The agent identifier |

**Response:**

```json
{
  "agent_id": "support-agent",
  "namespace": "tenant_support-agent",
  "memory_count": 1247,
  "total_tokens": 45230,
  "first_memory_at": "2024-01-01T00:00:00Z",
  "last_memory_at": "2024-01-15T10:30:00Z"
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/api/v1/agents/support-agent/stats \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY"
```

---

## DELETE /api/v1/agents/{agent_id} - Deactivate Agent

Deactivate an agent and permanently delete all associated memories. Returns 204 No Content on success. This action cannot be undone.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `agent_id` | string | Yes | The agent identifier to deactivate |

**cURL Example:**

```bash
curl -X DELETE https://memory-box-api.fly.dev/api/v1/agents/support-agent \
  -H "Authorization: Bearer $MEMORY_BOX_API_KEY"
```

---

# Admin Endpoints

Admin endpoints are only available in multi-tenant mode and require the `X-Admin-API-Key` header.

## POST /admin/tenants - Create Tenant

Create a new tenant in multi-tenant mode. Returns the tenant details and initial API key.
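
As a rough sanity check on the returned key, the sketch below tests whether a key carries the expected tenant prefix, following the example key shown under Multi-Tenant Mode (`mbx_`, the slug, an underscore, then a 32-character secret). The function name `key_belongs_to_tenant` is hypothetical, and the assumption that the secret contains only letters and digits is inferred from that single example rather than stated by the API.

```bash
# Sketch only: key_belongs_to_tenant is a hypothetical helper, not part of Memory Box.
# Succeeds if $1 looks like an API key for tenant slug $2:
# "mbx_", the slug, "_", then exactly 32 letters or digits.
# The alphanumeric character set is an assumption based on the example key.
key_belongs_to_tenant() {
  local key="$1" slug="$2" secret
  case "$key" in
    "mbx_${slug}_"*) secret="${key#"mbx_${slug}_"}" ;;
    *) return 1 ;;
  esac
  # The secret must be exactly 32 characters long.
  [ "${#secret}" -eq 32 ] || return 1
  # Reject any character outside letters and digits.
  case "$secret" in
    *[!A-Za-z0-9]*) return 1 ;;
  esac
  return 0
}
```

A caller might run `key_belongs_to_tenant "$NEW_KEY" acme` right after creating a tenant, before storing the key in a secrets manager.
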

**Request Body:**

```json
{
  "name": "Acme Corp",
  "slug": "acme"
}
```

**Response:**

```json
{
  "id": "tenant_abc123",
  "name": "Acme Corp",
  "slug": "acme",
  "created_at": "2024-01-15T10:00:00Z",
  "api_key": "mbx_acme_abc123def456..."
}
```

**cURL Example:**

```bash
curl -X POST https://memory-box-api.fly.dev/admin/tenants \
  -H "X-Admin-API-Key: $ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Acme Corp", "slug": "acme"}'
```

---

## GET /admin/tenants - List Tenants

List all tenants in the system. Only available in multi-tenant mode.

**Response:**

```json
{
  "tenants": [
    {
      "id": "tenant_abc123",
      "name": "Acme Corp",
      "slug": "acme",
      "created_at": "2024-01-15T10:00:00Z",
      "status": "active"
    }
  ]
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/admin/tenants \
  -H "X-Admin-API-Key: $ADMIN_API_KEY"
```

---

## GET /admin/tenants/{id} - Get Tenant

Get details for a specific tenant by ID.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The tenant ID |

**Response:**

```json
{
  "id": "tenant_abc123",
  "name": "Acme Corp",
  "slug": "acme",
  "created_at": "2024-01-15T10:00:00Z",
  "status": "active",
  "memory_count": 15420,
  "agent_count": 12
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/admin/tenants/tenant_abc123 \
  -H "X-Admin-API-Key: $ADMIN_API_KEY"
```

---

## DELETE /admin/tenants/{id} - Disable Tenant

Disable a tenant. This will prevent all API access for the tenant but does not delete data.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The tenant ID to disable |

**Response:**

```json
{
  "id": "tenant_abc123",
  "status": "disabled"
}
```

**cURL Example:**

```bash
curl -X DELETE https://memory-box-api.fly.dev/admin/tenants/tenant_abc123 \
  -H "X-Admin-API-Key: $ADMIN_API_KEY"
```

---

## POST /admin/tenants/{id}/keys - Generate API Key

Generate a new API key for a tenant. Keys are shown only once at creation time.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The tenant ID |

**Request Body:**

```json
{
  "name": "Production Key"
}
```

**Response:**

```json
{
  "id": "key_xyz789",
  "name": "Production Key",
  "key": "mbx_acme_xyz789abc...",
  "created_at": "2024-01-15T11:00:00Z"
}
```

**cURL Example:**

```bash
curl -X POST https://memory-box-api.fly.dev/admin/tenants/tenant_abc123/keys \
  -H "X-Admin-API-Key: $ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "Production Key"}'
```

---

## GET /admin/tenants/{id}/keys - List API Keys

List all API keys for a tenant. Key values are not returned, only metadata.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The tenant ID |

**Response:**

```json
{
  "keys": [
    {
      "id": "key_xyz789",
      "name": "Production Key",
      "created_at": "2024-01-15T11:00:00Z",
      "last_used_at": "2024-01-15T12:30:00Z"
    }
  ]
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/admin/tenants/tenant_abc123/keys \
  -H "X-Admin-API-Key: $ADMIN_API_KEY"
```

---

## DELETE /admin/tenants/{id}/keys/{key_id} - Revoke API Key

Revoke an API key. The key will immediately stop working. Returns 204 No Content on success.

**Parameters:**

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `id` | string | Yes | The tenant ID |
| `key_id` | string | Yes | The key ID to revoke |

**cURL Example:**

```bash
curl -X DELETE https://memory-box-api.fly.dev/admin/tenants/tenant_abc123/keys/key_xyz789 \
  -H "X-Admin-API-Key: $ADMIN_API_KEY"
```

---

# MCP Server

Memory Box implements the Model Context Protocol (MCP) for integration with Claude Desktop and other MCP-compatible clients.

## Overview

The MCP server provides a standardized way for AI assistants to interact with Memory Box. It uses OAuth 2.1 with PKCE for secure authentication.

### OAuth 2.1 Authentication Flow

1. Client discovers OAuth metadata at `/.well-known/oauth-authorization-server`
2. Client registers dynamically at `/oauth/register`
3. User authorizes at `/oauth/authorize` with PKCE challenge
4. Client exchanges code for token at `/oauth/token`
5. Client accesses MCP endpoint with Bearer token

### Available Tools

| Tool | Description |
|------|-------------|
| `store_memory` | Store a new memory with automatic embedding |
| `search_memories` | Semantic search across stored memories |
| `get_memory` | Retrieve a specific memory by ID |
| `delete_memory` | Delete a memory permanently |
| `find_related` | Find semantically related memories |
| `get_stats` | Get agent statistics and memory count |
| `list_buckets` | List available memory buckets/namespaces |

---

## GET /mcp - MCP Discovery

Returns basic server information for MCP client discovery. No authentication required.

**Response:**

```json
{
  "name": "memory-box",
  "version": "1.0.0"
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/mcp
```

---

## POST /mcp - MCP JSON-RPC

Main MCP endpoint for JSON-RPC 2.0 messages. Supports initialize, ping, tools/list, tools/call, prompts/list, and prompts/get methods.
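
Every method listed above travels in the same JSON-RPC 2.0 envelope, so a small helper can build it once. The sketch below is one possible way to do that from a shell script; `mcp_request` and `mcp_post` are hypothetical names, and the helper assumes the method name contains no quotes or backslashes and that any `params` value is already valid JSON.

```bash
# Sketch only: mcp_request and mcp_post are hypothetical helper names.

# Print a JSON-RPC 2.0 request envelope.
# $1 = numeric request id, $2 = method name, $3 = optional params as raw JSON.
# The method is inserted verbatim, so it must not contain quotes or backslashes.
mcp_request() {
  local id="$1" method="$2" params="${3:-}"
  if [ -n "$params" ]; then
    printf '{"jsonrpc":"2.0","id":%s,"method":"%s","params":%s}\n' "$id" "$method" "$params"
  else
    printf '{"jsonrpc":"2.0","id":%s,"method":"%s"}\n' "$id" "$method"
  fi
}

# POST an envelope to the MCP endpoint using the OAuth access token.
mcp_post() {
  curl -sS -X POST https://memory-box-api.fly.dev/mcp \
    -H "Authorization: Bearer $OAUTH_ACCESS_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$1"
}

# Example (not run here): mcp_post "$(mcp_request 2 tools/list)"
```

Keeping the envelope builder separate from the network call makes it easy to inspect or log the exact payload before it is sent.
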

**Request Body (initialize):**

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {
      "name": "my-client",
      "version": "1.0.0"
    }
  }
}
```

**Response:**

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": {
      "tools": { "listChanged": false },
      "prompts": { "listChanged": false }
    },
    "serverInfo": {
      "name": "memory-box",
      "version": "1.0.0"
    }
  }
}
```

**Request Body (tools/list):**

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list"
}
```

**cURL Example:**

```bash
curl -X POST https://memory-box-api.fly.dev/mcp \
  -H "Authorization: Bearer $OAUTH_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
```

---

## DELETE /mcp - Close MCP Session

Closes an active MCP session. Requires the MCP-Session-Id header.

**Headers:**

| Name | Required | Description |
|------|----------|-------------|
| `MCP-Session-Id` | Yes | The session ID to close |

**cURL Example:**

```bash
curl -X DELETE https://memory-box-api.fly.dev/mcp \
  -H "Authorization: Bearer $OAUTH_ACCESS_TOKEN" \
  -H "MCP-Session-Id: session_abc123"
```

---

# System Endpoints

## GET /health - Liveness Check

Simple liveness check to verify the server is running. Use this for basic health monitoring.

**Response:**

```json
{
  "status": "ok",
  "timestamp": "2024-01-15T10:30:00Z"
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/health
```

---

## GET /ready - Readiness Check

Comprehensive readiness check that verifies all dependencies (database, vector store) are operational. Use this for deployment orchestration.
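
For deployment scripts, one option is to poll this endpoint until it reports an overall `"status": "ok"`. The sketch below does that with plain `curl` and `grep`; the function names, the attempt count, and the delay are illustrative assumptions. Because this document does not say which HTTP status code `/ready` returns when a dependency is down, the check inspects the response body instead of relying on the status code alone.

```bash
# Sketch only: ready_status_ok and wait_until_ready are hypothetical helper names.

# Succeed if a /ready response body reports a top-level status of "ok".
# Matches the "status" key only, so "database": "ok" alone is not enough.
ready_status_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Poll /ready until it reports ok or the attempts run out.
# $1 = URL, $2 = max attempts, $3 = seconds between attempts (defaults are arbitrary).
wait_until_ready() {
  local url="${1:-https://memory-box-api.fly.dev/ready}"
  local attempts="${2:-30}" delay="${3:-2}"
  local i=1 body
  while [ "$i" -le "$attempts" ]; do
    # -f turns HTTP errors into a curl failure; "|| true" keeps the loop going.
    body="$(curl -fsS --max-time 5 "$url" 2>/dev/null || true)"
    if ready_status_ok "$body"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example (not run here): wait_until_ready && echo "Memory Box is ready"
```
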

**Response:**

```json
{
  "status": "ok",
  "timestamp": "2024-01-15T10:30:00Z",
  "checks": {
    "database": "ok",
    "vector_store": "ok"
  }
}
```

**cURL Example:**

```bash
curl -X GET https://memory-box-api.fly.dev/ready
```

---

# Architecture

## Embedding Model: Qwen3-Embedding-8B

Memory Box uses Qwen3-Embedding-8B with 1024-dimensional vectors for semantic memory retrieval.

### Model Selection

We selected Qwen3-Embedding-8B for:

- **Multilingual coverage**: 119 languages across major language families (Indo-European, Sino-Tibetan, Afro-Asiatic, Austronesian, Dravidian, Turkic, and more)
- **Extended context window**: 32,768 tokens per embedding, so full documents can be embedded without chunking
- **Benchmark performance**: MTEB English 75.22, CMTEB Chinese 73.83, MTEB Code 80.68

### Why 1024 Dimensions

- **Retrieval quality**: Captures sufficient semantic nuance; higher dimensions show <2% improvement
- **Storage efficiency**: ~4KB per vector (1024 float32 values × 4 bytes), predictable costs at scale
- **Query latency**: Keeps p95 under 50ms for production workloads
- **Cross-compatibility**: Same dimensionality as widely used models (BAAI bge-large, E5-large)

### Performance in Production

- <50ms p95 latency for vector similarity queries
- 32K token context for full-document embeddings
- 119 languages supported without model switching
- 99.99% uptime for embedding generation

---

## Infrastructure & Scaling

Memory Box API is deployed on Fly.io using containerized Machines distributed across multiple global regions.

### Global Edge Network

Fly.io's Anycast routing automatically directs API requests to the nearest healthy region.
Memory Box runs in 10+ regions:

- **North America**: Ashburn, Chicago, San Jose
- **Europe**: Amsterdam, London, Frankfurt
- **Asia-Pacific**: Tokyo, Sydney, Singapore
- **South America**: São Paulo

### Autoscaling Strategy

Hybrid autoscaling model:

- **Traffic-based**: Machines autostop after 5 minutes idle, autostart on requests (<1 second startup)
- **Metrics-based**: Warm buffer maintained based on queue depth and memory pressure
- **Concurrency limits**: Tuned per Machine for sub-50ms p95 target

### Fault Tolerance

- Minimum 2 Machines per region
- Automatic failover to next-closest region
- Zero-downtime blue-green deployments (under 2 minutes)

---

# FAQ

## What is Memory Box?

Memory Box is a universal memory layer for AI applications. It gives any LLM persistent, semantic memory without relying on provider-native features. You own the memory, you control the context, and you hydrate any model with it on demand.

## How is this different from ChatGPT's memory or Claude's memory?

Those are provider-locked features. If you build on ChatGPT's memory, your context lives in OpenAI's infrastructure and only works with OpenAI models.

Memory Box is provider-agnostic. Switch from Claude to GPT to Gemini to open-source models, and your agent's memories come with you. Your agent's identity lives in your infrastructure, not theirs.

## How is this different from RAG?

RAG retrieves documents. Memory Box retrieves context about the user and the relationship.

Traditional RAG answers "what does this document say?" Memory Box answers "what do I know about this person, this project, this conversation history?" It's the difference between a search engine and a memory.

## Why not just save chat history?

Chat history is a log. Memory is meaning.

Storing every message creates noise. Memory Box stores what matters (extracted facts, preferences, decisions, context) and retrieves it semantically. Your agent gets relevant context, not a transcript.

## What's the data model?

- **Tenants** → Organizations or applications
- **Agents** → Individual AI instances within a tenant
- **Memories** → Semantic units of context with metadata

Each agent has its own namespace. Memories are embedded for semantic search and can be filtered by metadata labels.

## How does search work?

Three modes:

- **Vector** – Pure semantic similarity. "Find memories about project planning."
- **Hybrid** – BM25 keyword matching + semantic similarity with rank fusion
- **Chronological** – Time-based retrieval. "What happened recently?"

All modes support metadata filtering.

## What embedding model do you use?

Qwen3-Embedding-8B with 1024 dimensions. The embedding is included in the API, so you don't need to manage it yourself.

## Where is data stored?

Vectors are stored in globally distributed cloud infrastructure across GCP and AWS regions. Data is encrypted at rest (AES-256) and cached in SSD/RAM tiers. Metadata and account data live in globally distributed edge databases.

Enterprise customers can deploy in their own cloud accounts with customer-managed encryption keys (CMEK).

## Why inject memories at runtime instead of fine-tuning?

Fine-tuning is expensive, brittle, and locks you to a specific model version. Every update means retraining.

Memory-Enhanced In-Context Learning (ME-ICL) is different. The base model stays general. At runtime, you retrieve relevant memories and inject them into the context window. Updates are instant, there is no retraining cost, and it works with any model.

## Who can access my agent's memories?

Only authenticated requests with valid tenant credentials and the correct agent ID. Memories are namespaced per agent; there is no cross-agent access.

## Do you train on stored memories?

No. Your memories are yours. We don't use them for training, analytics, or anything beyond serving them back to you.

## Can users delete their data?

Yes. You can delete individual memories, or delete an entire agent (which wipes all of its memories).
Tenant-level deletion is also supported.

---

# Security & Compliance

## Certifications

- **SOC 2 Type II**: Certified. Infrastructure, processes, and controls are audited annually.
- **GDPR**: Compliant. Full support for data subject rights.
- **Right to Delete**: Supported. Complete removal within 30 days.

## Core Commitments

- **No Training on Your Data**: Your memory data is never used to train models.
- **We Never Read Your Data**: Memories are encrypted at rest and in transit. Our team has no access without explicit permission.
- **We Never Sell Your Data**: Your data is not a product.

## Data Rights

- **Right to Be Forgotten**: Delete all data via API or request
- **Data Portability**: Export memories in standard formats
- **Data Minimization**: We only collect what's necessary
- **Retention Policies**: You control how long memories are stored

---

# Quick Start

1. Get your API key from the Memory Box dashboard
2. Activate an agent:

   ```bash
   curl -X POST https://memory-box-api.fly.dev/api/v1/agents/my-agent/activate \
     -H "Authorization: Bearer $MEMORY_BOX_API_KEY"
   ```

3. Store a memory:

   ```bash
   curl -X POST https://memory-box-api.fly.dev/api/v1/memories \
     -H "Authorization: Bearer $MEMORY_BOX_API_KEY" \
     -H "X-Agent-ID: my-agent" \
     -H "Content-Type: application/json" \
     -d '{"text": "User prefers dark mode and concise responses"}'
   ```

4. Search memories:

   ```bash
   curl -X GET "https://memory-box-api.fly.dev/api/v1/memories?q=user+preferences" \
     -H "Authorization: Bearer $MEMORY_BOX_API_KEY" \
     -H "X-Agent-ID: my-agent"
   ```

For more information, visit https://memorybox.dev/docs
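
The quick-start calls above can be wrapped in one helper that also honors the `429 Too Many Requests` and `Retry-After` behavior described under Rate Limiting. This is a minimal sketch, not an official client: `mb_request` and `retry_after_seconds` are hypothetical names, the limit of 5 attempts is arbitrary, and `Retry-After` is assumed to carry a whole number of seconds (anything else falls back to a 1-second wait).

```bash
# Sketch only: helper names are hypothetical; URL, paths, and headers come from this document.
MB_BASE_URL="${MB_BASE_URL:-https://memory-box-api.fly.dev}"

# Read the Retry-After value (seconds) out of a block of response headers.
# Falls back to 1 if the header is missing or is not a plain integer.
retry_after_seconds() {
  local value
  value="$(printf '%s\n' "$1" | tr -d '\r' |
    awk -F': *' 'tolower($1) == "retry-after" { print $2; exit }')"
  case "$value" in
    ''|*[!0-9]*) echo 1 ;;
    *) echo "$value" ;;
  esac
}

# Send one authenticated request; on 429, wait for Retry-After and retry (max 5 attempts).
# Usage: mb_request METHOD PATH AGENT_ID [JSON_BODY]   (pass "" as AGENT_ID to omit X-Agent-ID)
mb_request() {
  local method="$1" path="$2" agent="$3" body="${4:-}"
  local headers out status attempt=1
  headers="$(mktemp)"
  out="$(mktemp)"
  # Build the curl argument list in the positional parameters.
  set -- -sS -X "$method" "$MB_BASE_URL$path" \
    -H "Authorization: Bearer $MEMORY_BOX_API_KEY"
  if [ -n "$agent" ]; then
    set -- "$@" -H "X-Agent-ID: $agent"
  fi
  if [ -n "$body" ]; then
    set -- "$@" -H "Content-Type: application/json" -d "$body"
  fi
  while [ "$attempt" -le 5 ]; do
    status="$(curl "$@" -D "$headers" -o "$out" -w '%{http_code}')" || status=000
    if [ "$status" != "429" ]; then
      break
    fi
    sleep "$(retry_after_seconds "$(cat "$headers")")"
    attempt=$((attempt + 1))
  done
  cat "$out"
  rm -f "$headers" "$out"
  # Report success only for 2xx responses.
  case "$status" in
    2??) return 0 ;;
    *) return 1 ;;
  esac
}

# Quick start with the helper (not run here):
#   mb_request POST /api/v1/agents/my-agent/activate ""
#   mb_request POST /api/v1/memories my-agent '{"text": "User prefers dark mode and concise responses"}'
#   mb_request GET "/api/v1/memories?q=user+preferences" my-agent
```

The response body is always written to standard output, so a caller can pipe it into a JSON tool while using the exit status to detect failures.
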