From Serverless to Self-Hosted: Migrating DevMate to Neo4j and Pgvector

TL;DR: To cut cost and latency in DevMate, I migrated the GraphRAG backend from AWS Neptune and OpenSearch to a self-hosted Neo4j and Pgvector stack running in Docker on a single EC2 instance. This reduced the monthly infrastructure bill by roughly 60% and improved retrieval latency by putting the databases on the same host as the application backend.

The Setup: Building DevMate

DevMate started as a code intelligence platform for enterprise clients. The initial architecture was pure AWS serverless: Neptune for graph queries and OpenSearch for vector search. Clean, managed, and... expensive.

After three months in production, the monthly bill was eye-watering. It was time to rethink the stack.

The Original Stack: Serverless Dreams

AWS Neptune + OpenSearch

✅ Fully managed—no operational overhead
✅ Auto-scaling built in
✅ Tight AWS integration (IAM, VPC, etc.)
❌ Neptune: $0.348/hr minimum (db.r5.large) = $250+/month idle
❌ OpenSearch: $0.244/hr minimum = $175+/month idle
❌ Data transfer between services adds up fast

The Migration: Docker on EC2

The insight was simple: for a B2B product with predictable traffic, serverless scaling wasn't worth the premium. I migrated to:

Neo4j Community Edition in Docker (graph database)
PostgreSQL + Pgvector in Docker (vector search)
Both running on a single m6i.xlarge EC2 Reserved Instance

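💡 The Math: At the hourly rates above, the Neptune + OpenSearch instances alone cost ~$430/month (730 hours), and the bill was ~$500/month all-in. EC2 Reserved m6i.xlarge + EBS = ~$180/month. That is a cost reduction of roughly 60%.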
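The arithmetic behind these figures can be checked in a few lines. This is a minimal sketch: the hourly rates and the ~$180 and ~$500 figures come from this post, while the 730 hours per month is an assumed average (8760 / 12).

```python
HOURS_PER_MONTH = 730  # assumed average: 8760 hours per year / 12


def monthly_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Convert an hourly instance rate into a monthly figure."""
    return hourly_rate * hours


def reduction_pct(before: float, after: float) -> float:
    """Percentage saved when the bill drops from `before` to `after`."""
    return (before - after) / before * 100


neptune = monthly_cost(0.348)      # db.r5.large rate quoted above
opensearch = monthly_cost(0.244)   # OpenSearch minimum quoted above
managed_instances = neptune + opensearch
self_hosted = 180.0                # EC2 Reserved m6i.xlarge + EBS, from the post

print(f"Neptune:           ${neptune:.2f}/month")
print(f"OpenSearch:        ${opensearch:.2f}/month")
print(f"Managed instances: ${managed_instances:.2f}/month")
print(f"Saving vs instances only: {reduction_pct(managed_instances, self_hosted):.1f}%")
print(f"Saving vs ~$500 all-in:   {reduction_pct(500.0, self_hosted):.1f}%")
```

Depending on whether you compare against the instance charges alone (about $432) or the ~$500 total, the saving comes out at about 58% or 64%, which is where the "roughly 60%" comes from.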

The "Cloud vs. Docker" Trade-off

Serverless databases (Neptune, OpenSearch) are excellent for scaling to millions of users. But for a B2B SaaS with predictable traffic, the high monthly minimums and "cold start" latency become bottlenecks, not features.

Key insight: By running Neo4j and Pgvector as Docker containers on the same EC2 instance as the FastAPI backend, I eliminated the network hops between services. RAG queries that previously took 200-300ms now complete in 50-80ms.
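If you want to verify a latency change like this on your own stack, a small timing harness is enough. The sketch below is not the tooling used for DevMate. `measure_latency_ms` is a hypothetical helper that times any zero-argument callable (for example, a function that runs one full RAG retrieval) and reports the median and 95th percentile in milliseconds.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency_ms(query: Callable[[], object],
                       runs: int = 50,
                       warmup: int = 5) -> Dict[str, float]:
    """Time `query` repeatedly and summarise the samples in milliseconds."""
    # Warm-up calls let connection pools and caches settle; they are not timed.
    for _ in range(warmup):
        query()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        query()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the last sample.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "p50": statistics.median(samples),
        "p95": samples[p95_index],
        "max": samples[-1],
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a call into your retrieval pipeline.
    print(measure_latency_ms(lambda: time.sleep(0.002), runs=20))
```

Running the same harness against the old and new backends, from the same machine as the API, gives a like-for-like comparison that includes the network hop.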

Why Neo4j Over Neptune?

Neptune is usually queried with Gremlin (property graphs) or SPARQL (RDF), though it also supports openCypher. Neo4j uses Cypher natively. For code analysis queries, I find Cypher far more readable:

// Find all functions that call a specific API
MATCH (caller:Function)-[:CALLS]->(api:ExternalAPI {name: "stripe"})
RETURN caller.name, caller.file_path

Plus: Neo4j's neo4j-graphrag library integrates vector search natively, reducing the "glue code" between systems.
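To call the Cypher query above from a Python backend, it helps to pass the API name as a parameter instead of inlining it. The sketch below assumes the official `neo4j` Python driver, whose `Session.run` accepts query parameters as keyword arguments. `find_callers` is a hypothetical helper, not part of any library.

```python
from typing import Any, Dict, List

# Same query as above, with the API name passed as $api_name so that one
# query string serves every external API.
FIND_CALLERS = """
MATCH (caller:Function)-[:CALLS]->(api:ExternalAPI {name: $api_name})
RETURN caller.name AS name, caller.file_path AS file_path
"""


def find_callers(session: Any, api_name: str) -> List[Dict[str, str]]:
    """Return the functions that call the external API `api_name`.

    `session` is any object with a `run(query, **params)` method that yields
    mapping-like records, such as a `neo4j.Session` from the official driver.
    """
    result = session.run(FIND_CALLERS, api_name=api_name)
    return [{"name": r["name"], "file_path": r["file_path"]} for r in result]
```

With the real driver, `session` would come from `GraphDatabase.driver(uri, auth=(user, password)).session()`, with a URI such as `bolt://localhost:7687` when Neo4j runs on the same host. The URI and credentials here are placeholders.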

Why Pgvector Over OpenSearch?

OpenSearch is powerful but overkill for pure vector similarity. Pgvector:

✅ Native PostgreSQL—no new query language
✅ HNSW indexes that can reach 95%+ recall with tuned parameters (m, ef_construction, hnsw.ef_search)
✅ Combine vector search with SQL JOINs in one query
✅ Fraction of the memory footprint
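As an illustration of the HNSW and JOIN points above, here is a minimal sketch of a Pgvector schema and a similarity query filtered by repository. The `files` and `chunks` tables, their columns, and the 1536-dimension embedding are hypothetical and not DevMate's actual schema. The `search` helper works with any DB-API cursor that uses `%s` placeholders, such as one from psycopg.

```python
from typing import Any, List, Sequence, Tuple

# One-time setup, e.g. in a migration. All names and the dimension are assumptions.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS files (id bigserial PRIMARY KEY, repo text, path text);
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    file_id bigint REFERENCES files(id),
    content text,
    embedding vector(1536)
);
CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw
    ON chunks USING hnsw (embedding vector_cosine_ops);
"""

# Vector similarity (<=> is cosine distance) combined with a SQL JOIN and filter.
SEARCH_SQL = """
SELECT f.path, c.content, c.embedding <=> %s::vector AS distance
FROM chunks c
JOIN files f ON f.id = c.file_id
WHERE f.repo = %s
ORDER BY c.embedding <=> %s::vector
LIMIT %s
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding as Pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def search(cursor: Any, embedding: Sequence[float], repo: str,
           k: int = 5) -> List[Tuple]:
    """Return the k chunks in `repo` closest to `embedding`."""
    vec = to_vector_literal(embedding)
    cursor.execute(SEARCH_SQL, (vec, repo, vec, k))
    return cursor.fetchall()
```

Recall versus speed can then be tuned per session with `SET hnsw.ef_search = ...`, which is worth measuring against your own data before relying on any recall figure.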

⚠️ War Story: The migration took 2 weeks. The hardest part? Rewriting Neptune's traversal queries in Cypher. Pro tip: start with the most complex query first—if that works, the rest is easy.

When to Stay Serverless

This migration made sense because DevMate has predictable B2B traffic. If you have:

Highly variable traffic (0 to 10K users)
No DevOps capacity to manage containers
Strict compliance requirements (SOC 2, HIPAA)

...then managed services are worth the premium. Know your trade-offs.

Learn more about GraphRAG architecture in my AI Engineer Roadmap 2026.