TL;DR: To optimize for cost and latency in DevMate, I migrated the GraphRAG backend from AWS Neptune and OpenSearch to a self-hosted Neo4j and pgvector stack running in Docker on EC2. This cut the monthly database bill by roughly 60% and brought RAG query latency down from 200-300ms to 50-80ms by running the databases on the same instance as the backend.
The Setup: Building DevMate
DevMate started as a code intelligence platform for enterprise clients. The initial architecture was pure AWS serverless: Neptune for graph queries and OpenSearch for vector search. Clean, managed, and... expensive.
After three months in production, the monthly bill was eye-watering. It was time to rethink the stack.
The Original Stack: Serverless Dreams
AWS Neptune + OpenSearch
- ✅ Fully managed—no operational overhead
- ✅ Auto-scaling built in
- ✅ Tight AWS integration (IAM, VPC, etc.)
- ❌ Neptune: $0.348/hr minimum (db.r5.large) = $250+/month idle
- ❌ OpenSearch: $0.244/hr minimum = $175+/month idle
- ❌ Data transfer between services adds up fast
The Migration: Docker on EC2
The insight was simple: for a B2B product with predictable traffic, serverless scaling wasn't worth the premium. I migrated to:
- Neo4j Community Edition in Docker (graph database)
- PostgreSQL + Pgvector in Docker (vector search)
- Both running on a single m6i.xlarge EC2 Reserved Instance
💡 The Math: Neptune + OpenSearch = ~$500/month once the $250+ and $175+ idle minimums and data transfer are added up. EC2 Reserved m6i.xlarge + EBS = ~$180/month. That is roughly a 60% cost reduction.
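The figures above can be sanity-checked with a few lines of arithmetic. This is a rough sketch, not a billing calculator: the 730 hours per month is an assumption, the hourly rates and the ~$180 and ~$500 totals are the ones quoted in this post, and the helper names are made up for illustration.

```python
# Back-of-the-envelope check of the monthly figures quoted in this post.
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def monthly_cost(hourly_rate: float, hours: int = HOURS_PER_MONTH) -> float:
    """Convert an hourly rate into an approximate monthly cost."""
    return hourly_rate * hours


def savings_pct(before: float, after: float) -> float:
    """Percentage saved when the bill drops from `before` to `after`."""
    return (before - after) / before * 100


neptune = monthly_cost(0.348)     # db.r5.large rate quoted above
opensearch = monthly_cost(0.244)  # OpenSearch minimum quoted above
managed_idle = neptune + opensearch
self_hosted = 180.0               # EC2 Reserved m6i.xlarge + EBS, as quoted above

print(f"Neptune idle floor:    ${neptune:,.2f}/month")
print(f"OpenSearch idle floor: ${opensearch:,.2f}/month")
print(f"Managed idle floor:    ${managed_idle:,.2f}/month")
print(f"Savings vs idle floor: {savings_pct(managed_idle, self_hosted):.0f}%")
print(f"Savings vs ~$500 bill: {savings_pct(500.0, self_hosted):.0f}%")
```

Under these assumptions the idle floor alone comes to about $432/month, so $180 is a 58% reduction against the floor and 64% against the ~$500 figure, which is where the "about 60%" headline sits.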
The "Cloud vs. Docker" Trade-off
Serverless databases (Neptune, OpenSearch) are excellent for scaling to millions of users. But for a B2B SaaS with predictable traffic, the high monthly minimums and "cold start" latency become bottlenecks, not features.
Key insight: By running Neo4j and Pgvector as Docker containers on the same EC2 instance as the FastAPI backend, we eliminated network hops between services. RAG queries that previously took 200-300ms now complete in 50-80ms.
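If you want to check a latency claim like this on your own stack, a small timing harness is enough. The sketch below is a generic example, not the benchmark used for DevMate: `measure_latency_ms` is a hypothetical helper, and `query` stands in for whatever callable runs one end-to-end RAG retrieval.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(
    query: Callable[[], object], runs: int = 50, warmup: int = 5
) -> dict:
    """Call `query` repeatedly and summarize wall-clock latency in milliseconds."""
    # Warm-up calls are not timed, so connection setup and caches do not skew results.
    for _ in range(warmup):
        query()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        query()
        samples.append((time.perf_counter() - start) * 1000)

    samples.sort()
    # Simple nearest-rank style pick for p95; good enough for a quick comparison.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "p50": statistics.median(samples),
        "p95": samples[p95_index],
        "max": samples[-1],
    }
```

Run it once against the managed setup and once against the co-located containers with the same query and inputs, and compare the p50 and p95 values rather than single calls.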
Why Neo4j Over Neptune?
Neptune is typically queried with Gremlin or SPARQL (it also offers openCypher support), while Neo4j uses Cypher natively. For code analysis queries, Cypher is dramatically more readable:
// Find all functions that call a specific API
MATCH (caller:Function)-[:CALLS]->(api:ExternalAPI {name: "stripe"})
RETURN caller.name, caller.file_path
Plus: Neo4j's neo4j-graphrag library integrates vector search natively, reducing the "glue code" between systems.
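The Cypher query above can be called from a Python backend roughly as follows. This is a minimal sketch, not DevMate's actual code: `find_callers` is a hypothetical helper, the `neo4j` database name is the driver default and assumed here, and the API name is passed as a query parameter instead of being hard-coded. It relies on the official Neo4j Python driver's `execute_query` method; the driver itself is passed in, so the function has no import-time dependency.

```python
from typing import Any

# Same query as above, with the API name turned into a parameter.
CALLERS_QUERY = """
MATCH (caller:Function)-[:CALLS]->(api:ExternalAPI {name: $api_name})
RETURN caller.name AS name, caller.file_path AS file_path
"""


def find_callers(driver: Any, api_name: str) -> list[dict]:
    """Return the functions that call the given external API."""
    # execute_query returns a (records, summary, keys) result in driver 5.x.
    records, _summary, _keys = driver.execute_query(
        CALLERS_QUERY,
        parameters_={"api_name": api_name},
        database_="neo4j",  # assumption: default database name
    )
    return [{"name": r["name"], "file_path": r["file_path"]} for r in records]


# Usage (URI and credentials are placeholders):
#   from neo4j import GraphDatabase
#   with GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "<password>")) as driver:
#       print(find_callers(driver, "stripe"))
```

Passing the value through `parameters_` lets Neo4j cache the query plan and avoids string concatenation in Cypher.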
Why Pgvector Over OpenSearch?
OpenSearch is powerful but overkill for pure vector similarity. Pgvector:
- ✅ Native PostgreSQL—no new query language
- ✅ HNSW indexes that reach 95%+ recall when tuned (e.g. via hnsw.ef_search)
- ✅ Combine vector search with SQL JOINs in one query
- ✅ Fraction of the memory footprint
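To make the "vector search plus JOIN in one query" point concrete, here is a hedged sketch of what such a lookup might look like from Python. The `code_chunks` and `files` tables, their columns, and the helper names are hypothetical and are not DevMate's schema. The `hnsw` index type, the `vector_cosine_ops` operator class, and the `<=>` cosine-distance operator are standard pgvector features. The function accepts any DB-API cursor that uses `%s` placeholders, as psycopg does.

```python
from typing import Any, Sequence

# One-time setup: HNSW index for cosine distance (hypothetical table and column names).
HNSW_INDEX_DDL = (
    "CREATE INDEX IF NOT EXISTS code_chunks_embedding_hnsw "
    "ON code_chunks USING hnsw (embedding vector_cosine_ops)"
)

# Nearest chunks within one repository: a relational filter and JOIN plus vector ordering.
SIMILAR_CHUNKS_SQL = """
SELECT c.id, c.content, f.path
FROM code_chunks AS c
JOIN files AS f ON f.id = c.file_id
WHERE f.repo = %s
ORDER BY c.embedding <=> %s::vector
LIMIT %s
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding as pgvector's text form, e.g. [0.1,0.2,0.3]."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def similar_chunks(
    cursor: Any, repo: str, embedding: Sequence[float], limit: int = 5
) -> list:
    """Return the `limit` chunks in `repo` closest to `embedding` by cosine distance."""
    cursor.execute(SIMILAR_CHUNKS_SQL, (repo, to_vector_literal(embedding), limit))
    return cursor.fetchall()
```

Doing the repository filter, the JOIN, and the similarity ranking in one statement is what replaces the separate "fetch IDs from the vector store, then look them up elsewhere" round trip.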
⚠️ War Story: The migration took 2 weeks. The hardest part? Rewriting Neptune's traversal queries in Cypher. Pro tip: port the most complex query first; if that works, the rest is easy.
When to Stay Serverless
This migration made sense because DevMate has predictable B2B traffic. If you have:
- Highly variable traffic (0 to 10K users)
- No DevOps capacity to manage containers
- Strict compliance requirements (SOC 2, HIPAA)
...then managed services are worth the premium. Know your trade-offs.
Learn more about GraphRAG architecture in my AI Engineer Roadmap 2026.