Samkhya v1.0 lets developers plug Claude, GPT-4o-mini, or Ollama into SQL query optimizers
Developer Pratech Singh has released Samkhya v1.0, an open-source Rust library that connects large language models such as Anthropic Claude, OpenAI GPT-4o-mini, and local Ollama instances to the cardinality-estimation layer of embedded SQL engines including DataFusion, DuckDB, and Polars. The project targets a longstanding problem in analytical databases: learned statistics such as histograms, Bloom filters, and HyperLogLog sketches are discarded when a process ends, so the optimizer must relearn them from scratch in every session. Samkhya ships as a 13-crate Rust workspace under the Apache-2.0 license, with reference server implementations in Python (FastAPI) and Node (TypeScript). A built-in safety mechanism ensures that a hallucinating LLM cannot produce a query plan worse than the one the engine would choose from its native estimate. Benchmarks show a 1.038× geometric-mean wall-clock speedup (about 3.8%) over unmodified DataFusion 46 on the JOB-Slow workload, though end-to-end LLM latency figures are currently projected rather than fully measured.
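For readers unfamiliar with the benchmark metric, a geometric-mean speedup is typically computed by averaging the logarithms of the per-query ratios (baseline time divided by candidate time) and exponentiating the result. The Rust sketch below illustrates that arithmetic only. The function name `geomean_speedup` and the timings are hypothetical, chosen for illustration; they are not Samkhya's benchmark harness or its data.

```rust
/// Geometric mean of per-query speedups, where each speedup is
/// baseline_time / candidate_time. Hypothetical helper for illustration.
/// Returns None for empty or mismatched input, or any non-positive
/// or non-finite timing.
fn geomean_speedup(baseline_ms: &[f64], candidate_ms: &[f64]) -> Option<f64> {
    if baseline_ms.is_empty() || baseline_ms.len() != candidate_ms.len() {
        return None;
    }
    let mut log_sum = 0.0;
    for (&b, &c) in baseline_ms.iter().zip(candidate_ms) {
        if !(b > 0.0 && c > 0.0) || !b.is_finite() || !c.is_finite() {
            return None;
        }
        // Summing logs avoids overflow from multiplying many ratios.
        log_sum += (b / c).ln();
    }
    Some((log_sum / baseline_ms.len() as f64).exp())
}

fn main() {
    // Made-up timings in milliseconds; NOT measurements from Samkhya.
    let baseline = [100.0, 200.0, 400.0];
    let candidate = [50.0, 200.0, 800.0];
    match geomean_speedup(&baseline, &candidate) {
        Some(s) => println!("geomean speedup: {:.3}x", s),
        None => println!("invalid input"),
    }
}
```

In the made-up data, one query runs twice as fast, one is unchanged, and one runs twice as slow, so the gains and losses cancel and the geometric mean comes out at parity. Read the same way, a 1.038× figure describes the typical query across the workload; individual queries may be faster or slower than that.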
This is an AI-generated summary. ShortSingh links to the original source for the complete article.