I saw Jensen at GTC talking about cuDF and cuVS - hope it's not a flash in the pan like their NIMs. They've got a way to go before it's a good dev platform.
by john5022
|
Apr 15, 2026, 8:43:25 PM
My primary stack is JavaScript (React/Node.js/NoSQL - Mongo a lot, Dynamo as a second, Postgres because ... work). While all of that was dandy for work, the AI wave of the past 3-5 years is heavily Python-centric. I remember assessing vectorstores and finding it really frustrating - especially trying to use managed vectorstores like Pinecone.<p>Pinecone's early SDK support for Node.js was frustrating to say the least - mostly because I really liked their performance.<p>hnswlib is something I used a lot - I have done a fair bit of work building and benchmarking indexes locally and remotely - and building indexes locally with any open-source tooling was always way slower than, say, making a call into Pinecone. Anyhow, between work and personal interest, the last 3-4 years have been in the datastore/vectorstore realm. I have also had access to some significant GPU compute. I am happy to talk shop with anybody who wants.<p>OK, now for the meat and potatoes: building indexes on CPUs is a complete no-go; my GPU-vs-CPU benchmarks show numbers so comical that people normally assume the tests must be wrong.<p>NVIDIA cuVS is the library behind vector search in Elasticsearch, Weaviate, Milvus, and Oracle. It has bindings for Python, Rust, Java, and Go - nothing for Node.js. NVIDIA tried once with node-rapids in 2021, and seems to have abandoned it in 2023: <a href="https://github.com/rapidsai/node" rel="nofollow">https://github.com/rapidsai/node</a><p>So I built cuvs-node: native C++ N-API bindings to the cuVS C API. There are five algorithms (CAGRA, IVF-Flat, IVF-PQ, brute-force, HNSW). 119 tests. Verified on A10, A100, H100, GH200, and B200.<p>I have a ton of GPU-vs-CPU benchmarks - although the really interesting ones compare the providers.
The difference in performance is actually shocking - despite most of them claiming state-of-the-art infra.<p>The following benchmarks were run in the same session on an A100 SXM (same machine, GPU vs CPU):<p>1M vectors at 768 dimensions: 5.3s on GPU vs 65 minutes on CPU (hnswlib-node).
733x faster. Search: sub-2ms at 1M vectors.<p>Open source, Apache 2.0. Requires Linux with an NVIDIA GPU and CUDA. Prebuilt binaries are on the roadmap.
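For anyone who hasn't run this comparison themselves, here is a minimal sketch of what exact (brute-force) search costs on CPU - the thing ANN indexes like CAGRA, IVF, and HNSW exist to avoid. This is plain JavaScript; the helper names are my own illustration, not the cuvs-node API:

```javascript
// Brute-force k-NN in plain JavaScript. Helper names are illustrative,
// not the cuvs-node API.

// Squared L2 distance between two equal-length vectors.
function l2(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    s += d * d;
  }
  return s;
}

// Score every vector against the query, sort, keep the top k.
// O(n * d) per query: at 1M vectors x 768 dims that is roughly 768M
// multiply-adds per query -- and that's before you build any index.
function bruteForceKnn(vectors, query, k) {
  return vectors
    .map((v, idx) => ({ idx, dist: l2(v, query) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);
}

const db = [[0, 0], [1, 1], [5, 5]];
console.log(bruteForceKnn(db, [0.9, 1.1], 2)); // nearest neighbors first
```

An ANN index trades a sliver of recall for not touching all n vectors per query - but building that index is itself the expensive step, and that's the 5.3s-vs-65-minutes number above.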
Happy to answer questions.
by gmatt
|
Apr 15, 2026, 8:43:25 PM