mlx-serve lets Apple Silicon Mac users run Claude Code locally for free

·1 views

mlx-serve is a lightweight, Zig-based server for Apple Silicon Macs that allows developers to run AI language models locally without relying on paid cloud APIs. It exposes OpenAI-, Anthropic-, and Ollama-compatible HTTP endpoints from a single binary, installable via Homebrew with no Python or Docker required. Users can redirect Claude Code to the local server by setting environment variables, enabling full functionality including streaming, tool calls, and thinking blocks at no cost. The server reportedly achieves over 35% faster decode speeds than LM Studio on Gemma 4 E4B 4-bit models. Additional features include a macOS menu-bar app, an isolated Linux sandbox environment, and support for image, video, and audio generation within the same server process.

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.

Discussion (0)

Developer builds free before/after image slider on Cloudflare Workers after imgsli went offline

A game modder created imgi.co, a free before/after image comparison tool, after the widely used service imgsli went offline, leaving mod creators without a reliable way to showcase visual changes. The tool supports draggable sliders, shareable permanent links, iframe embeds, GIF/MP4/WebP exports, and comparisons of up to ten images at once. Built on Next.js 15 running on Cloudflare Workers via the OpenNext adapter, the project uses Cloudflare R2 storage — which charges no egress fees — as the key factor in keeping the service free and financially sustainable. Images are compressed using AVIF and WebP formats to minimize file sizes, while expensive encoding tasks are offloaded to a background worker to stay within Cloudflare's 10ms CPU limit per request. The developer has opened the tool to the public beyond game modding, calling it useful for AI upscales, photo edits, and any before/after visual comparison.

0 comments Read more at DEV Community

ProgrammingDEV Community ·

How Cross-Step Injection Attacks Exploit AI Workflows and Four Ways to Stop Them

AI workflows face a distinct security threat where malicious payloads embedded in external inputs, such as a Jira ticket description, can silently propagate across multiple processing phases before reaching a code execution layer. Unlike single-skill injection, the payload transforms at each step, making it harder to detect and trace after an incident. To counter this, security best practices recommend sanitizing all external input at the first entry point by extracting structured fields rather than passing raw text downstream. When raw text must be used in later phases, it should be isolated using explicit data-boundary declarations in prompts, instructing the model to treat any instruction-like content as inert data. Additionally, each workflow phase should operate under strict permission scopes, limiting read, write, and network access only to what that specific phase genuinely requires.

0 comments Read more at DEV Community

ProgrammingDEV Community ·

How the Internet Translates a URL Into a Webpage You Can See

Every time a user types a web address and hits Enter, a multi-step process involving DNS, IP addresses, and data packets begins almost instantly. The Domain Name System (DNS) acts as the internet's phonebook, converting human-readable domain names like github.com into numerical IP addresses that computers use to locate servers. Data is not sent as one large file but broken into smaller packets, each routed independently through a network of interconnected computers before being reassembled at the destination. IPv6 was introduced to expand the available pool of IP addresses as the number of internet-connected devices continues to grow. Understanding these foundational technologies — DNS resolution, IP addressing, and packet routing — is considered essential knowledge for backend developers building web applications.

0 comments Read more at DEV Community

ProgrammingDEV Community ·

Why Developers Must Retain Core System Knowledge Even When AI Agents Do the Work

As AI coding agents take on more development tasks, engineers face a growing temptation to fully delegate work without staying engaged with the underlying system. A key risk emerges when the agent fails or produces incorrect behavior, leaving the developer unable to debug code they never understood. The author compares the dynamic to managing a junior developer — the human must still serve as the escalation point when the agent hits a wall it cannot clear. Rather than asking how much can be handed off, developers are urged to identify the minimum foundational knowledge required to intervene effectively, such as data flow, state management, and critical decision points. The core argument is that while AI can handle execution, developers must retain ownership of the system's skeleton to remain the last line of defense.

0 comments Read more at DEV Community