Vercel, Google, and Mistral ship major AI infrastructure updates in same week


Vercel's AI Gateway introduced firewall-style routing rules that let platform teams swap or block models at the credential level without changing application code, reducing a model migration to a single config update. Google released Nano Banana 2 Lite, capable of generating 1,000 images in four seconds at low cost, alongside Omni Flash, which enables natural-language video editing within the same API pipeline but is limited to 10-second clips with no audio support. A new suite of MIT-licensed agentic coding models, ranging from 9B to 397B parameters and supporting 256K context windows, was released with reinforcement learning training optimized for both solution quality and search scaffolding. The smallest, a 9B dense model, runs on a single 80GB GPU, making capable agentic coding accessible without multi-GPU infrastructure. Mistral also shipped two production-ready releases in the same cycle, including a text-to-speech offering. Taken together, the releases reflect a broader industry push toward tighter control over model selection, credentials, and tool integrations.
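To make the idea of firewall-style routing concrete, here is a minimal sketch of first-match rule evaluation keyed on a credential. It is not Vercel's API or config format: the `Rule` fields, the `resolve_model` function, and the credential and model names are all hypothetical, chosen only to illustrate how a swap or block can live in configuration rather than in application code.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    """One routing rule (hypothetical shape, not a real gateway schema)."""
    credential: str                # glob pattern over the caller's credential name
    model: str                     # glob pattern over the requested model id
    action: str                    # "allow", "block", or "rewrite"
    target: Optional[str] = None   # replacement model when action == "rewrite"


class ModelBlocked(Exception):
    """Raised when a rule forbids the requested model for this credential."""


def resolve_model(rules: List[Rule], credential: str, requested: str) -> str:
    """Return the model to call, applying the first matching rule."""
    for rule in rules:
        if fnmatchcase(credential, rule.credential) and fnmatchcase(requested, rule.model):
            if rule.action == "block":
                raise ModelBlocked(f"{requested!r} is blocked for {credential!r}")
            if rule.action == "rewrite":
                return rule.target or requested
            return requested
    # Assumption: no matching rule means the request passes through unchanged.
    return requested


# Illustrative rule set: migrate one team off an old model, block large models for another.
RULES = [
    Rule("team-billing-*", "old-model-v1", "rewrite", "new-model-v2"),
    Rule("team-intern-*", "large-*", "block"),
]
```

Under these assumptions, a migration amounts to editing `RULES`: a call such as `resolve_model(RULES, "team-billing-prod", "old-model-v1")` returns `"new-model-v2"` while the calling application keeps requesting the old name.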

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.


The Demoralization of the White-Collar Worker

Article URL: https://nooneshappy.com/article/the-demoralization-of-the-white-collar-worker/
Comments URL: https://news.ycombinator.com/item?id=48780469
Points: 3
Comments: 0


Programming · DEV Community

In regulated lending, LLMs handle only a fraction of the AI pipeline

A DEV Community post highlights a common misconception about AI's role in complex, regulated financial workflows. In regulated lending, large language models handle only the final prose-drafting step, while the heavier work — document ingestion, data extraction, validation, and financial calculations — must be performed by deterministic code. Financial calculations in particular cannot be delegated to LLMs, which risk silently producing rounding errors or inaccurate outputs. Canada's OSFI E-21 guideline reinforces this design constraint, requiring human ownership of risk decisions. The author argues that winning AI teams in banking are those who precisely identify the narrow slice of the problem where a language model is actually appropriate.
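As an illustration of why the calculation step belongs in deterministic code, the sketch below computes a fixed-rate monthly payment with Python's `decimal` module and explicit rounding. The function name, the rounding mode, and the sample loan are assumptions made for this example; the original post does not specify a formula or a library.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def monthly_payment(principal: Decimal, annual_rate: Decimal, months: int) -> Decimal:
    """Standard amortized payment, rounded to cents with an explicit rounding rule."""
    if months <= 0:
        raise ValueError("months must be positive")
    monthly_rate = annual_rate / Decimal(12)
    if monthly_rate == 0:
        # Zero-interest loan: repay the principal in equal installments.
        payment = principal / Decimal(months)
    else:
        growth = (Decimal(1) + monthly_rate) ** months
        payment = principal * monthly_rate * growth / (growth - Decimal(1))
    return payment.quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical loan: 200,000 over 360 months at 6% nominal annual interest.
payment = monthly_payment(Decimal("200000"), Decimal("0.06"), 360)
print(payment)  # 1199.10

# Binary floats cannot represent most decimal fractions exactly; Decimal can.
print(0.1 + 0.2 == 0.3)                                           # False
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))      # True
```

In a pipeline shaped like the one described, a figure such as `payment` would be computed and validated first, then handed to the language model as a fixed input for the prose draft rather than being recomputed by it.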


Programming · DEV Community

Ukraine Vectorizes 33.7M Court Decisions Using Voyage AI for Semantic Legal Search

A Ukrainian legal tech team is embedding the country's entire open-access court decision registry, EDRSR, into a vector database to enable semantic search for lawyers. The project uses Voyage AI's voyage-3.5 model to convert court rulings into 1024-dimensional vectors stored in a self-hosted Qdrant instance on AWS EC2. The database already holds over 44 million vectors across criminal, civil, commercial, and misdemeanor case types, with civil cases — the largest cohort at 33.7 million documents — currently 42% complete. Documents are chunked into segments of up to 2,048 characters to improve retrieval quality, since individual court rulings can run up to 200,000 characters. Once civil case processing is finished, the collection is expected to exceed 63 million vectors, making it roughly 100 times larger than a typical RAG deployment.
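The chunking step can be sketched as follows. Only the 2,048-character ceiling comes from the summary; breaking at the last space inside each window is an assumption made here for readability, and the project's actual splitting strategy is not described. The embedding and Qdrant upload steps are deliberately left out.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 2048) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Assumption: prefer to break at the last space inside the window so that
    words stay intact; fall back to a hard cut when the window has no space.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    start = 0
    length = len(text)
    while start < length:
        end = min(start + max_chars, length)
        if end < length:
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        start = end
    return chunks


# Synthetic stand-in for a long ruling: 200,000 characters of repeated words.
ruling = "рішення суду " * 15385
ruling = ruling[:200_000]
pieces = chunk_text(ruling)
print(len(pieces), max(len(p) for p in pieces))
```

At this ceiling, a ruling at the quoted 200,000-character maximum yields at least 98 chunks (200,000 / 2,048 rounded up), which helps explain why the vector count runs well ahead of the document count.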


Programming · DEV Community

SecondLayer Maps Cost and Design of 860B Legal AI Trained on 2TB Ukrainian Law

Ukrainian legal-tech firm SecondLayer has outlined a hypothetical project to train an 860-billion-parameter Mixture-of-Experts AI model on approximately 2 terabytes of Ukrainian and European legal data hosted on Google Cloud Platform. The corpus includes 96.2 million full-text Ukrainian court decisions, public registries, annotated legislation, Supreme Court rulings, and Spanish and EU legal texts. After deduplication and cleaning, the usable training corpus is estimated at 800–1,000 GB, yielding roughly 280–330 billion tokens, about 50 times smaller than DeepSeek V3's original 14.8 trillion-token dataset. The proposed architecture mirrors DeepSeek V3, which has 671 billion total parameters but activates only 37 billion per token, making high-volume inference more cost-efficient than with dense models. The exercise is presented as a technical breakdown of dataset composition, model architecture, compute costs, and the capabilities such a domain-specific legal model could deliver.
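The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the summary; the function names are made up for this example, and pairing the low and high ends of each range (800 GB with 280 billion tokens, 1,000 GB with 330 billion) is an assumption.

```python
DEEPSEEK_V3_TOKENS = 14_800_000_000_000   # 14.8 trillion tokens
CORPUS_TOKENS_LOW = 280_000_000_000       # 280 billion tokens
CORPUS_TOKENS_HIGH = 330_000_000_000      # 330 billion tokens
CORPUS_BYTES_LOW = 800 * 10**9            # 800 GB (decimal gigabytes)
CORPUS_BYTES_HIGH = 1_000 * 10**9         # 1,000 GB
TOTAL_PARAMS = 671_000_000_000            # DeepSeek V3 total parameters
ACTIVE_PARAMS = 37_000_000_000            # parameters active per token


def dataset_ratio(reference_tokens: int, corpus_tokens: int) -> float:
    """How many times larger the reference dataset is than the corpus."""
    return reference_tokens / corpus_tokens


def bytes_per_token(corpus_bytes: int, corpus_tokens: int) -> float:
    """Average bytes of cleaned text represented by one token."""
    return corpus_bytes / corpus_tokens


def active_fraction(active: int, total: int) -> float:
    """Share of parameters used for any single token in the MoE model."""
    return active / total


print(round(dataset_ratio(DEEPSEEK_V3_TOKENS, CORPUS_TOKENS_HIGH), 1))  # 44.8
print(round(dataset_ratio(DEEPSEEK_V3_TOKENS, CORPUS_TOKENS_LOW), 1))   # 52.9
print(round(bytes_per_token(CORPUS_BYTES_LOW, CORPUS_TOKENS_LOW), 2))   # 2.86
print(round(bytes_per_token(CORPUS_BYTES_HIGH, CORPUS_TOKENS_HIGH), 2)) # 3.03
print(round(active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS) * 100, 1))     # 5.5
```

The ratio of roughly 45 to 53 is consistent with the "about 50 times smaller" claim, the token estimate implies about three bytes of text per token, and the 37-of-671 split means about 5.5% of parameters are active for any given token.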
