Create an HNSW index specifying m and ef_construction: CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64)
m sets the maximum number of connections per node in each graph layer (valid range 2–100, default 16); a higher m improves recall but increases index size and build time
ef_construction sets the size of the candidate list used during graph construction (default 64, must be at least 2*m); raising it, for example to 128, can improve index quality on high-dimensional data at the cost of a slower build
At query time, set hnsw.ef_search (default 40), for example SET hnsw.ef_search = 100; higher values improve recall but make each query slower
Measure recall by comparing HNSW results against exact nearest neighbours from a sequential scan for the same query vectors (IVFFlat is itself approximate, so it is not a valid baseline); target >0.95 recall for most production workloads
pgvector versions before 0.6.0 build HNSW indexes with a single process; from 0.6.0 on, raise max_parallel_maintenance_workers to build in parallel, and set maintenance_work_mem high enough for the graph to fit in memory during the build
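The recall measurement described above can be sketched in a few lines of Python. This is a minimal illustration, assuming you have already fetched the top-k row IDs for each query vector twice: once through the HNSW index and once from an exact scan. The function names and the sample IDs are hypothetical and not part of pgvector.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact top-k neighbours that the HNSW query also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    exact = set(exact_ids)
    return len(exact.intersection(approx_ids)) / len(exact)


def mean_recall(approx: Sequence[Sequence[int]],
                exact: Sequence[Sequence[int]]) -> float:
    """Average recall@k over a batch of query vectors."""
    if not exact or len(approx) != len(exact):
        raise ValueError("need one approximate and one exact result list per query")
    return sum(recall_at_k(a, e) for a, e in zip(approx, exact)) / len(exact)


# Hypothetical row IDs for two query vectors with k = 5.
hnsw_results = [[1, 2, 3, 4, 5], [10, 11, 12, 13, 99]]
exact_results = [[1, 2, 3, 4, 5], [10, 11, 12, 13, 14]]

# The second query misses one true neighbour, so the mean falls below 1.0.
print(f"mean recall@5 = {mean_recall(hnsw_results, exact_results):.2f}")
```

Rerunning the same comparison at several hnsw.ef_search values shows where recall crosses the chosen target.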
Known gotchas
SET hnsw.ef_search applies only to the current session and is lost on reconnect; set it on every connection, or persist it with ALTER ROLE ... SET or ALTER DATABASE ... SET so that new sessions pick it up
A plain CREATE INDEX takes a SHARE lock on the table, which blocks writes (but not reads) for the duration of the build; use CREATE INDEX CONCURRENTLY to avoid blocking writes, at the cost of a longer build time
Increasing m beyond 64 yields diminishing recall gains for most embedding models while significantly inflating the index size on disk and in shared_buffers
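As a rough sketch of how the parameter limits and gotchas above might be guarded in application code, the Python helpers below validate m and ef_construction before building the statement, default to CONCURRENTLY, and produce an ALTER ROLE statement for a persistent hnsw.ef_search. The helper names are hypothetical, and the identifiers are interpolated without quoting, so this assumes trusted table, column, and role names.

```python
def hnsw_index_sql(table: str, column: str, m: int = 16,
                   ef_construction: int = 64,
                   opclass: str = "vector_cosine_ops",
                   concurrently: bool = True) -> str:
    """Return a CREATE INDEX statement after checking the HNSW build parameters."""
    if not 2 <= m <= 100:
        raise ValueError("m must be between 2 and 100")
    if ef_construction < 2 * m:
        raise ValueError("ef_construction must be at least 2 * m")
    # Assumption: names come from trusted code, never from user input.
    keyword = "CONCURRENTLY " if concurrently else ""
    return (f"CREATE INDEX {keyword}ON {table} USING hnsw ({column} {opclass}) "
            f"WITH (m = {m}, ef_construction = {ef_construction})")


def persist_ef_search_sql(role: str, ef_search: int) -> str:
    """Return an ALTER ROLE statement so new sessions for the role start with ef_search."""
    if ef_search < 1:
        raise ValueError("ef_search must be a positive integer")
    return f"ALTER ROLE {role} SET hnsw.ef_search = {ef_search}"


print(hnsw_index_sql("items", "embedding"))
print(persist_ef_search_sql("app_user", 100))  # "app_user" is a placeholder role
```

Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so a client would need to execute the generated statement in autocommit mode.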