A GitHub issue currently trending on Hacker News describes something many in the AI dev community will recognize: Claude Code, long the go-to tool for serious AI-assisted coding, is said to have become significantly worse after the February updates. Not just a little worse: according to the most vocal users, unusable for certain complex tasks.
What makes this thread worth following is that the complaints are not coming from beginners who misunderstand the tool. They come from people with track records in software engineering, pointing to very specific regressions: the model has allegedly become more cautious, more hesitant, and generally worse at maintaining context through long, complex workflows. That is exactly what makes a coding tool useful in practice.
This is not the first time we have seen this pattern. OpenAI took a real hit in 2023-2024 when users noticed that GPT-4 was being "lobotomized" over time, and it took a long while before the company acknowledged that RLHF tuning had made the model safer at the expense of capability. The question now is whether Anthropic has fallen into the same trap with its safety or cost optimizations.
The context makes this especially interesting. Benchmarks like SWE-bench Verified still show impressive numbers for the Claude models, and Claude Code (Opus 4.6) leads the more contamination-resistant SWE-rebench at 52.9 percent. But benchmarks and actual user experience are two very different things, as this thread illustrates clearly: you can score well on isolated problems and still be frustrating to work with over a full workday.
For those who use Claude Code professionally, the signal here is that it may be worth testing the alternatives again; competitors like Cursor and GitHub Copilot are not standing still. And for Anthropic: a community that can turn this quickly, with 667 points behind it on HN, is not something to ignore.
Remember: these are early signals from community sources, one GitHub issue and one HN thread, not peer-reviewed research. But when the volume and technical precision of the complaints are this high, it is worth following closely in the coming days.
