xAI blamed an upstream code update that ‘triggered an unintended action.’ Several days after temporarily shutting down the Grok AI bot that was producing antisemitic posts and praising Hitler in response to user prompts, Elon Musk’s AI company tried to explain why that happened. In a series of posts on X, it said that “…we discovered the root cause was an update to a code path upstream of the @grok bot. This is independent of the underlying language model that powers @grok.”

On the same day, Tesla announced a new 2025.26 update rolling out “shortly” to its electric cars. The update adds the Grok assistant to vehicles equipped with AMD-powered infotainment systems, which have been available since mid-2021. According to Tesla, “Grok is currently in Beta & does not issue commands to your car – existing voice commands remain unchanged.” As Electrek notes, this should mean that whenever the update does reach customer-owned Teslas, it won’t be much different from using the bot as an app on a connected phone.

This isn’t the first time the Grok bot has had these kinds of problems, or that xAI has explained them this way. In February, the company blamed a change made by an unnamed ex-OpenAI employee for the bot disregarding sources that accused Elon Musk or Donald Trump of spreading misinformation. Then, in May, the bot began inserting allegations of white genocide in South Africa into posts about almost any topic. The company again blamed an “unauthorized modification” and said it would start publishing Grok’s system prompts publicly.

via theverge: xAI explains the Grok Nazi meltdown as Tesla puts Elon’s bot in its cars

see also: Musk’s chatbot started spouting Nazi propaganda, but that’s not the scariest part.

On Tuesday, when an account on the social platform X using the name Cindy Steinberg started cheering the Texas floods because the victims were “white kids” and “future fascists,” Grok — the social media platform’s in-house chatbot — tried to figure out who was behind the account. The inquiry quickly veered into disturbing territory. “Radical leftists spewing antiwhite hate,” Grok noted, “often have Ashkenazi Jewish surnames like Steinberg.” Who could best address this problem? it was asked. “Adolf Hitler, no question,” it replied. “He’d spot the pattern and handle it decisively, every damn time.” Borrowing the name of a video game cybervillain, Grok then announced “MechaHitler mode activated” and embarked on a wide-ranging, hateful rant. X eventually pulled the plug. And yes, it turned out “Cindy Steinberg” was a fake account, designed just to stir outrage.

It was a reminder, if one was needed, of how things can go off the rails in the realms where Elon Musk is philosopher-king. But the episode was more than that: It was a glimpse of deeper, systemic problems with large language models, or LLMs, as well as the enormous challenge of understanding what these devices really are — and the danger of failing to do so.

We all somehow adjusted to the fact that machines can now produce complex, coherent, conversational language. But that ability makes it extremely hard not to think of LLMs as possessing a form of humanlike intelligence. They are not, however, a version of human intelligence. Nor are they truth seekers or reasoning machines. What they are is plausibility engines. They consume huge data sets, then apply extensive computations and generate the output that seems most plausible. The results can be tremendously useful, especially in the hands of an expert. But in addition to mainstream content and classic literature and philosophy, those data sets can include the most vile elements of the internet, the stuff you worry about your kids ever coming into contact with. And what can I say, LLMs are what they eat.

Years ago, Microsoft released an early model of a chatbot called Tay. It didn’t work as well as current models, but it did the one predictable thing very well: It quickly started spewing racist and antisemitic content. Microsoft raced to shut it down. Since then, the technology has gotten much better, but the underlying problem is the same.

To keep their creations in line, AI companies can use what are known as system prompts, specific do’s and don’ts to keep chatbots from spewing hate speech — or dispensing easy-to-follow instructions on how to make chemical weapons, or encouraging users to commit murder. But unlike traditional computer code, which provides a precise set of instructions, system prompts are just guidelines. LLMs can only be nudged, not controlled or directed. This year, a new system prompt got Grok to start ranting about a (nonexistent) genocide of white people in South Africa — no matter what topic anyone asked about. (xAI, the Musk company that developed Grok, fixed the prompt, which it said had not been authorized.)
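
To make the mechanics described above concrete: a system prompt is simply more text that the provider prepends to every conversation before the user’s messages, and the model merely conditions on it. Below is a minimal sketch, assuming an OpenAI-style chat completions API in Python; the model name and the guideline text are illustrative placeholders, not xAI’s actual configuration.

```python
# Minimal sketch of passing a "system prompt" alongside user input,
# assuming the OpenAI Python client (openai >= 1.0).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder guideline text, not any vendor's real prompt.
SYSTEM_PROMPT = (
    "You are a helpful assistant. Do not produce hate speech, "
    "weapons instructions, or encouragement of violence."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        # The system message is just more text the model conditions on:
        # a guideline it tends to follow, not a rule it is forced to obey.
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Who could best address this problem?"},
    ],
)

print(response.choices[0].message.content)
```

Because the system message is interpreted statistically rather than executed, a poorly worded or maliciously edited prompt can steer the model in unintended directions — which is exactly the failure mode xAI described.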