
    How do you stop an AI model turning Nazi? What the Grok drama reveals about AI training

    AI developers have many levers they can use to steer chatbots into certain behaviours.

    Aaron J. Snoswell, Senior Research Fellow in AI Accountability, Queensland University of Technology
    The Conversation


    Grok, the artificial intelligence (AI) chatbot embedded in X (formerly Twitter) and built by Elon Musk’s company xAI, is back in the headlines after calling itself “MechaHitler” and producing pro-Nazi remarks.

    The developers have apologised for the “inappropriate posts” and “taken action to ban hate speech” from Grok’s posts on X. Debates about AI bias have been revived too.

    But the latest Grok controversy is revealing not for the extremist outputs, but for how it exposes a fundamental dishonesty in AI development. Musk claims to be building a “truth-seeking” AI free from bias, yet the technical implementation reveals systemic ideological programming.

    This amounts to an accidental case study in how AI systems embed their creators’ values, with Musk’s unfiltered public presence making visible what other companies typically obscure.

    What is Grok?

    Grok is an AI chatbot with “a twist of humor and a dash of rebellion” developed by xAI, which also owns the X social media platform.

    The first version of Grok launched in 2023. Independent evaluations suggest the latest model, Grok 4, outpaces competitors on “intelligence” tests. The chatbot is available standalone and on X.

    xAI states “AI’s knowledge should be all-encompassing and as far-reaching as possible”. Musk has previously positioned Grok as a truth-telling alternative to chatbots accused of being “woke” by right-wing commentators.

    But beyond the latest Nazism scandal, Grok has made headlines for generating threats of sexual violence, bringing up “white genocide” in South Africa, and making insulting statements about politicians. The latter led to its ban in Turkey.

    So how do developers imbue an AI with such values and shape chatbot behaviour? Today’s chatbots are built using large language models (LLMs), which offer several levers developers can lean on.

    What makes an AI ‘behave’ this way?

    Pre-training

    First, developers curate the data used during pre-training – the first step in building a chatbot. This involves not just filtering unwanted content, but also emphasising desired material.

    GPT-3 was shown Wikipedia up to six times more often than other datasets because OpenAI considered it higher quality. Grok is trained on various sources, including posts from X, which might explain why Grok has been reported to check Elon Musk’s opinion on controversial topics.
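
    As a rough illustration of how this weighting works – the source names, weights and filter terms below are invented for illustration, not figures from any real model – oversampling a trusted source can be as simple as drawing from it more often when training batches are assembled, while a crude filter drops unwanted documents:

    import random

    # Illustrative corpus mix: each source gets a sampling weight.
    # A higher weight means the model "sees" that source more often,
    # much as GPT-3 reportedly oversampled Wikipedia.
    corpus_weights = {
        "wikipedia": 3.0,     # treated as higher quality, oversampled
        "web_crawl": 1.0,     # bulk data, baseline rate
        "social_posts": 0.5,  # noisier data, downsampled
    }

    def sample_source():
        """Pick a data source in proportion to its weight."""
        sources = list(corpus_weights)
        weights = [corpus_weights[s] for s in sources]
        return random.choices(sources, weights=weights, k=1)[0]

    def passes_filter(document):
        """Crude content filter: drop documents containing blocked terms."""
        blocked_terms = {"spam", "lorem ipsum"}  # placeholder terms only
        return not any(term in document.lower() for term in blocked_terms)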

    Musk has shared that xAI curates Grok’s training data, for example to improve legal knowledge and to remove LLM-generated content for quality control. He also appealed to the X community for difficult “galaxy brain” problems and facts that are “politically incorrect, but nonetheless factually true”.

    We don’t know if these data were used, or what quality-control measures were applied.

    Fine-tuning

    The second step, fine-tuning, adjusts LLM behaviour using feedback. Developers create detailed manuals outlining their preferred ethical stances, which either human reviewers or AI systems then use as a rubric to evaluate and improve the chatbot’s responses, effectively coding these values into the machine.
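
    A minimal sketch of that feedback loop – assuming a toy rubric and simple keyword matching as the "grader", nothing like a production pipeline – shows how preferences between candidate responses become the training signal:

    # Simplified, hypothetical sketch of preference-based fine-tuning.
    # In real systems the scoring comes from human reviewers or an AI grader
    # following a detailed rubric; keyword matching here is a toy stand-in.
    rubric = {
        "cites a source": 2.0,
        "acknowledges uncertainty": 1.0,
        "declines politely": 1.5,
    }

    def rubric_score(response, rubric):
        """Score a response against weighted rubric criteria."""
        return sum(weight for criterion, weight in rubric.items()
                   if criterion in response.lower())

    def preference_pair(response_a, response_b):
        """Return (preferred, rejected) - the pair the model is nudged towards."""
        if rubric_score(response_a, rubric) >= rubric_score(response_b, rubric):
            return response_a, response_b
        return response_b, response_a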

    A Business Insider investigation revealed that xAI’s instructions to its human “AI tutors” told them to look out for “woke ideology” and “cancel culture”. While the onboarding documents said Grok shouldn’t “impose an opinion that confirms or denies a user’s bias”, they also stated it should avoid responses that claim both sides of a debate have merit when they do not.

    System prompts

    The system prompt – instructions provided before every conversation – guides behaviour once the model is deployed.
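
    Mechanically, a system prompt is just text prepended to every exchange before the user’s message reaches the model. A minimal sketch, assuming a generic chat-style message format (the prompt wording below is invented, not Grok’s):

    # The system prompt is injected ahead of every user turn.
    # This wording is generic and hypothetical; xAI publishes Grok's real prompts.
    SYSTEM_PROMPT = (
        "You are a helpful assistant. Be sceptical of unsourced claims "
        "and say when you are unsure."
    )

    def build_messages(user_input, history=None):
        """Assemble the message list sent to the model for each conversation."""
        messages = [{"role": "system", "content": SYSTEM_PROMPT}]
        messages.extend(history or [])
        messages.append({"role": "user", "content": user_input})
        return messages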

    To its credit, xAI publishes Grok’s system prompts. Its instructions to “assume subjective viewpoints sourced from the media are biased” and “not shy away from making claims which are politically incorrect, as long as they are well substantiated” were likely key factors in the latest controversy.

    These prompts are being updated daily at the time of writing, and their evolution is a fascinating case study in itself.

    Guardrails

    Finally, developers can also add guardrails – filters that block certain requests or responses. OpenAI claims it doesn’t permit ChatGPT “to generate hateful, harassing, violent or adult content”. Meanwhile, the Chinese model DeepSeek censors discussion of Tiananmen Square.
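
    A guardrail can be as blunt as a keyword list or as elaborate as a separate classifier model screening every request and reply. A toy sketch of the blunt version (the blocked categories and refusal message are placeholders, not any vendor’s actual policy):

    # Toy guardrail: screen both the user's request and the model's reply.
    # Production systems typically use trained classifiers, not keyword lists.
    BLOCKED_TOPICS = {"violent threat", "harassment"}  # placeholder categories

    def violates_policy(text):
        """Return True if the text touches a blocked topic."""
        return any(topic in text.lower() for topic in BLOCKED_TOPICS)

    def guarded_reply(user_input, model_reply):
        """Refuse if either side of the exchange trips the filter."""
        if violates_policy(user_input) or violates_policy(model_reply):
            return "Sorry, I can't help with that."
        return model_reply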

    Ad-hoc testing carried out while writing this article suggests Grok is much less restrained in this regard than competing products.

    The transparency paradox

    Grok’s Nazi controversy highlights a deeper ethical issue: would we prefer AI companies to be explicitly ideological and honest about it, or maintain the fiction of neutrality while secretly embedding their values?

    Every major AI system reflects its creator’s worldview – from Microsoft Copilot’s risk-averse corporate perspective to Anthropic Claude’s safety-focused ethos. The difference is transparency.

    Musk’s public statements make it easy to trace Grok’s behaviours back to Musk’s stated beliefs about “woke ideology” and media bias. Meanwhile, when other platforms misfire spectacularly, we’re left guessing whether this reflects leadership views, corporate risk aversion, regulatory pressure, or accident.

    This feels familiar. Grok resembles Microsoft’s 2016 hate-speech-spouting Tay chatbot, also trained on Twitter data and set loose on Twitter before being shut down.

    But there’s a crucial difference. Tay’s racism emerged from user manipulation and poor safeguards – an unintended consequence. Grok’s behaviour appears to stem at least partially from its design.

    The real lesson from Grok is about honesty in AI development. As these systems become more powerful and widespread (Grok support in Tesla vehicles was just announced), the question isn’t whether AI will reflect human values. It’s whether companies will be transparent about whose values they’re encoding and why.

    Musk’s approach is simultaneously more honest (we can see his influence) and more deceptive (claiming objectivity while programming subjectivity) than his competitors.

    In an industry built on the myth of neutral algorithms, Grok reveals what’s been true all along: there’s no such thing as unbiased AI – only AI whose biases we can see with varying degrees of clarity.


    Aaron J. Snoswell previously received research funding from OpenAI in 2024–2025 to develop new evaluation frameworks for measuring moral competence in AI agents.

    This article is republished from The Conversation under a Creative Commons license.
    © 2025 The Conversation, NZCity
