Ukraine Vectorizes 33.7M Court Decisions Using Voyage AI for Semantic Legal Search
A Ukrainian legal tech team is embedding the country's entire open-access court decision registry, EDRSR, into a vector database to enable semantic search for lawyers. The project uses Voyage AI's voyage-3.5 model to convert court rulings into 1024-dimensional vectors stored in a self-hosted Qdrant instance on AWS EC2. The database already holds over 44 million vectors across criminal, civil, commercial, and misdemeanor case types. Civil cases, the largest cohort at 33.7 million documents, are currently 42% processed. Because individual court rulings can run up to 200,000 characters, documents are chunked into segments of up to 2,048 characters to improve retrieval quality, so a single ruling can produce many vectors. Once civil case processing is finished, the collection is expected to exceed 63 million vectors, making it roughly 100 times larger than a typical RAG deployment.
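To make the chunking step concrete, here is a minimal Python sketch of splitting a long ruling into segments of at most 2,048 characters. The summary gives only the size limit, so the function name `chunk_text`, the preference for breaking at whitespace, and the absence of overlap between chunks are all assumptions for illustration, not the team's actual method. Embedding each chunk with voyage-3.5 and writing the vectors to Qdrant would follow this step and are not shown.

```python
def chunk_text(text: str, max_chars: int = 2048) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: breaks at the last space inside each window
    when one exists, otherwise falls back to a hard cut.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    chunks: list[str] = []
    start = 0
    n = len(text)
    while start < n:
        end = min(start + max_chars, n)
        if end < n:
            # Prefer cutting at a space so words are not split mid-token.
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut
        chunk = text[start:end].strip()
        if chunk:  # skip windows that contain only whitespace
            chunks.append(chunk)
        start = end
    return chunks


if __name__ == "__main__":
    # Synthetic stand-in for a 200,000-character ruling.
    ruling = "word " * 40_000
    pieces = chunk_text(ruling)
    print(len(pieces), max(len(p) for p in pieces))
```

At this limit, a ruling at the upper end of the reported size range (200,000 characters) yields on the order of a hundred chunks. This is why the vector count can exceed the document count.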
This is an AI-generated summary. ShortSingh links to the original source for the complete article.