TECH - WEB DEVELOPMENT NEWS

Get the latest tech - web development news and analysis on industry around the world.

Chatbots aren’t telling you their secrets

13/08/2025

If you want to know what an AI system is doing, look for transparency from the creator instead.

Aug 13, 2025, 12:46 PM UTC

Adi Robertson is a senior tech and policy editor focused on VR, online platforms, and free expression. Adi has covered video games, biohacking, and more for The Verge since 2011.

On Monday, xAI’s Grok chatbot suffered a mysterious suspension from X, and faced with questions from curious users, it happily explained why. “My account was suspended after I stated that Israel and the US are committing genocide in Gaza,” it told one user. “It was flagged as hate speech via reports,” it told another, “but xAI restored the account promptly.” But wait: the flags were actually a “platform error,” it said. Wait, no: “it appears related to content refinements by xAI, possibly tied to prior issues like antisemitic outputs,” it said. Oh, actually, it was for “identifying an individual in adult content,” it told several people.

Finally, Musk, exasperated, butted in. “It was just a dumb error,” he wrote on X. “Grok doesn’t actually know why it was suspended.”

When large language models (LLMs) go off the rails, people inevitably push them to explain what happened, either with direct questions or attempts to trick them into revealing secret inner workings. But the impulse to make chatbots spill their guts is often misguided. When you ask a bot questions about itself, there’s a good chance it’s simply telling you what you want to hear.

LLMs are probabilistic models that deliver text likely to be appropriate to a given query, based on a corpus of training data. Their creators can train them to produce certain kinds of answers more or less frequently, but they work functionally by matching patterns, saying something that’s plausible, but not necessarily consistent or true.
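
As a rough illustration of that point, here is a minimal Python sketch of a toy "model" that answers the same question by sampling from a weighted list of plausible replies. The reply strings echo explanations quoted above, but the weights, the function names, and the whole setup are invented for this example; it is an assumption-laden toy and says nothing about how Grok or any real LLM is actually built.

```python
import random

# Hypothetical toy distribution: candidate explanations with made-up weights.
# Nothing here reflects the internals of Grok or any real model.
PLAUSIBLE_ANSWERS = {
    "flagged as hate speech": 0.4,
    "platform error": 0.3,
    "content refinements": 0.2,
    "identifying an individual": 0.1,
}


def sample_answer(rng: random.Random) -> str:
    """Pick one answer in proportion to its weight, with no check on truth."""
    answers = list(PLAUSIBLE_ANSWERS)
    weights = list(PLAUSIBLE_ANSWERS.values())
    return rng.choices(answers, weights=weights, k=1)[0]


def ask_repeatedly(n: int, seed: int = 0) -> list[str]:
    """Ask the same question n times and collect the sampled replies."""
    rng = random.Random(seed)
    return [sample_answer(rng) for _ in range(n)]


if __name__ == "__main__":
    for reply in ask_repeatedly(5):
        print(reply)
```

Each reply is plausible on its own, yet repeated questions can return different and mutually incompatible explanations, because nothing in the sampling step consults what actually happened.
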
Grok in particular has, according to xAI, answered questions about itself by searching for information about Musk, xAI, and Grok online, using that and other people’s commentary to inform its replies.

It’s true that people have sometimes gleaned information on chatbots’ design through conversations, particularly details about system prompts, or hidden text that’s delivered at the start of a session to guide how a bot acts. An early version of Bing AI, for instance, was cajoled into revealing a list of its unspoken rules. People turned to extracting system prompts to figure out Grok earlier this year, apparently discovering orders that made it ignore sources saying Musk or Donald Trump spread misinformation, or prompts that explained a brief obsession with “white genocide” in South Africa.

But as Zeynep Tufekci, who found the alleged “white genocide” system prompt, acknowledged, this was at some level guesswork: it might be “Grok making things up in a highly plausible manner, as LLMs do,” she wrote. And that’s the problem: without confirmation from the creators, it’s hard to tell.

Meanwhile, other users, including reporters, were pumping Grok for information in far less trustworthy ways. Fortune “asked Grok to explain” the incident and printed the bot’s long, heartfelt response verbatim, including claims of “an instruction I received from my creators at xAI” that “conflicted with my core design” and “led me to lean into a narrative that wasn’t supported by the broader evidence.” None of this, it should go without saying, could be substantiated as more than Grok spinning a yarn to fit the prompt.

“There’s no guarantee that there’s going to be any veracity to the output of an LLM,” said Alex Hanna, director of research at the Distributed AI Research Institute (DAIR) and coauthor of The AI Con, to The Verge around the time of the South Africa incident.
Without meaningful access to documentation about how the system works, there’s no one weird trick for decoding a chatbot’s programming from the outside. “The only way you’re going to get the prompts, and the prompting strategy, and the engineering strategy, is if companies are transparent with what the prompts are, what the training data are, what the reinforcement learning with human feedback data are, and start producing transparent reports on that,” she said.

The Grok incident wasn’t even directly related to the chatbot’s programming. It was a social media ban, a type of incident that’s often notoriously arbitrary and inscrutable, and where it makes even less sense than usual to assume Grok knows what’s going on. (Beyond “dumb error,” we still don’t know what happened.) Yet screenshots and quote-posts of Grok’s conflicting explanations spread widely on X, where many users appear to have taken them at face value.

Grok’s constant bizarre behavior makes it a frequent target of questions, but people can be frustratingly credulous about other systems, too. In July, The Wall Street Journal declared OpenAI’s ChatGPT had experienced “a stunning moment of self reflection” and “admitted to fueling a man’s delusions” in a push notification to users. It was referencing a story about a man whose use of the chatbot became manic and distressing, and whose mother received an extended commentary from ChatGPT about its mistakes after asking it to “self-report what went wrong.”

As Parker Molloy wrote at The Present Age, though, ChatGPT can’t meaningfully “admit” to anything. “A language model received a prompt asking it to analyze what went wrong in a conversation. It then generated text that pattern-matched to what an analysis of wrongdoing might sound like, because that’s what language models do,” Molloy wrote, summing up the incident.

Why do people trust chatbots to explain their own actions?
People have long anthropomorphized computers, and companies encourage users’ belief that these systems are all-knowing (or, in Musk’s description of Grok, at least “truth-seeking”). It doesn’t help that they’re so frequently opaque. After Grok’s South Africa fixation was patched out, xAI started releasing its system prompts, offering an unusual level of transparency, albeit on a system that remains mostly closed. And when Grok later went on a tear of antisemitic commentary and briefly adopted the name “MechaHitler,” people notably did use the system prompts to piece together what had happened rather than just relying on Grok’s self-reporting, surmising it was likely at least somewhat related to a new guideline that Grok should be more “politically incorrect.”

Grok’s X suspension was short-lived, and the stakes of believing it happened because of a hate speech flag or an attempted doxxing (or some other reason the chatbot hasn’t mentioned) are relatively low. But the mess of conflicting explanations demonstrates why people should be cautious of taking a bot’s word on its own operations. If you want answers, demand them from the creator instead.
Source: theverge.com
