How to fix Claude Code sessions broken by lone UTF-16 surrogates in transcripts
Claude Code sessions can become permanently unusable when a lone UTF-16 surrogate character gets written into the session's on-disk JSONL transcript file. This happens when a large, emoji-heavy tool output is truncated mid-character, leaving an orphaned surrogate half that the API's strict JSON parser rejects on every subsequent request. Because Claude Code replays the full session history to the API on each turn, the corrupted line poisons every future request until the file is manually repaired. The fix involves closing the session, stripping only the invalid surrogate code points (U+D800–U+DFFF) from the offending line using a Python script, and resuming the session — leaving all valid emoji and text intact. A byte-level pre-filter can speed up transcript scanning significantly, making automated checks on session start a practical option for content-heavy projects prone to repeat occurrences.
This is an AI-generated summary. ShortSingh links to the original source for the complete article.
Discussion (0)
Log in to join the discussion and vote.
Log in