Z.ai Launches GLM 5.2: Open-Source 744B Model With 1M-Token Context Window

On June 13, 2026, Z.ai released GLM 5.2, a 744-billion-parameter Mixture-of-Experts model with a 1-million-token context window and an MIT open-source license with no regional restrictions. The model currently ranks fourth among 124 models on BenchLM's provisional leaderboard and is the top-ranked open-weight model on three major long-horizon coding benchmarks, where it places alongside proprietary frontier models. A new architectural feature called IndexShare cuts per-token compute by a factor of 2.9 at long context lengths, while an improved multi-token-prediction layer raises speculative-decoding acceptance by around 20%. At roughly one-sixth the price of leading frontier models, GLM 5.2 is positioned as a cost-effective option, though experts warn that the large context window can drive up API costs significantly if developers send unnecessarily large prompts. Teams are advised to track input token usage per request and send only the minimum context required, rather than filling the full 1-million-token window on every call.
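The advice to track input tokens and send only the context a request needs could look roughly like the sketch below. It is illustrative only: the four-characters-per-token estimate is a crude assumption, and `estimate_tokens`, `select_context`, and `UsageLog` are hypothetical names, not part of any Z.ai SDK. A real integration would more likely rely on the token counts the provider reports for each request.

```python
from dataclasses import dataclass, field

# Assumption: roughly 4 characters per token. Real tokenizers differ,
# so treat this as a budgeting heuristic, not an exact count.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string (ceiling division)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def select_context(chunks: list[str], budget_tokens: int) -> tuple[list[str], int]:
    """Keep chunks in priority order, skipping any that would exceed the budget."""
    selected: list[str] = []
    used = 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget_tokens:
            continue  # too large for the remaining budget; try the next chunk
        selected.append(chunk)
        used += cost
    return selected, used


@dataclass
class UsageLog:
    """Hypothetical per-request log of estimated input tokens."""
    records: list[tuple[str, int]] = field(default_factory=list)

    def record(self, request_id: str, input_tokens: int) -> None:
        self.records.append((request_id, input_tokens))

    def total(self) -> int:
        return sum(tokens for _, tokens in self.records)


if __name__ == "__main__":
    log = UsageLog()
    chunks = ["a" * 400, "b" * 4000, "c" * 200]  # ordered by priority
    context, used = select_context(chunks, budget_tokens=200)
    log.record("req-1", used)
    print(len(context), used, log.total())
```

The budget is set per call, so a team can start with a small limit and raise it only for requests that demonstrably need more context.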

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.

M.Sc. Student Builds Affordable ATS Resume Scanner Starting at €3.99/Month

A master's student developed ClearScan, an ATS resume scanning tool, after finding existing options like Jobscan too expensive during his own job search. The platform allows users to paste a resume and job description to see how closely they match, with transparent scoring that explains the results. ClearScan launched today and has already attracted its first paying customers. A free tier offers two scans per month, while paid plans start at €3.99/month, deliberately priced with students in mind. The tool is now live at clearscan.fyi, with the developer actively seeking feedback from users familiar with ATS screening challenges.

Read more at DEV Community

Qualcomm buys Modular for $4B as Cursor acquires Continue and funding rounds surge

The commercial open-source software space saw a wave of consolidation this week, with Qualcomm agreeing to acquire AI infrastructure startup Modular for nearly $4 billion. Cursor, the AI-powered code editor, quietly acquired Continue, an open-source alternative to GitHub Copilot. Elastic also expanded its AI portfolio by acquiring site reliability engineering startup Deductive AI for up to $85 million. On the funding side, DeepSeek closed a $7 billion-plus round, Timefold secured a $13 million Series A for its scheduling optimization platform, and Moonshot AI is reportedly seeking a $30 billion valuation in new funding talks. Additionally, Sentient Foundation pledged $42 million to advance open-source AGI, while Vercel launched an open-source agentic framework called eve and Daytona announced a shift to closed source.

Read more at DEV Community

Dev fixes OpenAI Assistants API timeout errors by making limits configurable

A developer discovered that their production AI assistant was crashing with timeout errors during a live client demo because of a hardcoded 60-second polling limit. The OpenAI runs were not actually failing; they were simply taking longer as session history grew, so the app gave up before the runs finished. The fix moved the timeout value to an environment variable and extended it to 150 seconds, and also updated the polling loop to handle all five terminal run states. The developer noted that AI workload durations vary significantly with session length, which makes hardcoded limits unreliable in real-world use. The update was deployed successfully and eliminated the timeout errors in production.
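A minimal sketch of that kind of fix follows; it is not the developer's actual code. The environment variable name `RUN_POLL_TIMEOUT_SECONDS` and the helpers `load_timeout` and `wait_for_run` are hypothetical, and the status check is passed in as a callable so the example runs without the OpenAI SDK. The five terminal states listed are an assumption based on the run statuses OpenAI documents for the Assistants API.

```python
import os
import time

# Assumed terminal run states for the Assistants API; verify against current docs.
TERMINAL_STATES = {"completed", "failed", "cancelled", "expired", "incomplete"}


def load_timeout(default: float = 150.0) -> float:
    """Read the polling timeout from the environment, falling back to a default."""
    raw = os.environ.get("RUN_POLL_TIMEOUT_SECONDS")  # hypothetical variable name
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default  # ignore malformed values instead of crashing at startup
    return value if value > 0 else default


def wait_for_run(fetch_status, timeout=None, interval=1.0,
                 clock=time.monotonic, sleep=time.sleep) -> str:
    """Poll fetch_status() until it returns a terminal state or the timeout passes.

    Note: a non-terminal "requires_action" status is not handled here and
    would simply be polled until the timeout.
    """
    if timeout is None:
        timeout = load_timeout()
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status  # caller decides how to treat failed/expired/etc.
        if clock() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout} seconds")
        sleep(interval)


if __name__ == "__main__":
    # Simulated run that finishes on the third poll.
    statuses = iter(["queued", "in_progress", "completed"])
    print(wait_for_run(lambda: next(statuses), timeout=5, interval=0.01))
```

Returning the terminal status rather than raising on every non-success lets the caller distinguish a run that failed from one that merely expired or was cancelled.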

Read more at DEV Community

PromptOT Lets Teams Manage LLM Prompts as Versioned Blocks Without Code Deploys

A developer built PromptOT after an undocumented one-line edit inside a 200-line hardcoded prompt caused a support bot to promise incorrect refund timelines in production. The platform breaks monolithic prompt strings into typed, independently versioned blocks covering role, context, instructions, guardrails, and output format. Each block can be toggled, edited, and rolled back separately via a dashboard, without requiring a new code deployment. Apps retrieve the compiled prompt through a simple REST API call, with variables resolved at fetch time. PromptOT also ships an MCP server with 23 tools, allowing AI assistants like Claude to draft and save prompt changes directly in chat. A free tier supporting three projects and 1,000 API calls per month is available at promptot.com.
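The block-based approach can be illustrated with a small in-memory sketch. This is not PromptOT's API or data model: `Block`, `BLOCK_ORDER`, and `compile_prompt` are hypothetical names, and the variable syntax (`$name`, via Python's `string.Template`) is an assumption chosen for the example.

```python
from dataclasses import dataclass, field
from string import Template

# Assumed compile order, following the block types named in the summary.
BLOCK_ORDER = ("role", "context", "instructions", "guardrails", "output_format")


@dataclass
class Block:
    """One typed prompt block with its own version history (hypothetical model)."""
    kind: str
    versions: list[str] = field(default_factory=list)
    enabled: bool = True

    def edit(self, text: str) -> None:
        self.versions.append(text)  # each edit becomes a new version

    def rollback(self) -> None:
        if len(self.versions) > 1:
            self.versions.pop()  # drop the latest version, keep the first

    @property
    def current(self) -> str:
        return self.versions[-1] if self.versions else ""


def compile_prompt(blocks: list[Block], variables: dict[str, str]) -> str:
    """Join enabled blocks in a fixed order and resolve $variables at compile time."""
    by_kind = {block.kind: block for block in blocks}
    parts = []
    for kind in BLOCK_ORDER:
        block = by_kind.get(kind)
        if block and block.enabled and block.current:
            # substitute() raises KeyError if a variable is missing.
            parts.append(Template(block.current).substitute(variables))
    return "\n\n".join(parts)


if __name__ == "__main__":
    role = Block("role", ["You are a support agent for $company."])
    guardrails = Block("guardrails", ["Refunds take 5 business days."])
    guardrails.edit("Refunds take 30 business days.")  # the bad edit
    guardrails.rollback()                              # undo only this block
    print(compile_prompt([guardrails, role], {"company": "Acme"}))
```

Because each block carries its own history, rolling back the guardrails text leaves the role block untouched, which is the property the summary credits with avoiding full redeploys.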

Read more at DEV Community