AI don't trust techbros

CoffeeHorse

Hanging in there
Staff member
Council of Elders
Citizen
Judging by every single interaction I've ever seen anybody have with AI, though, the more likely response is "Well, if 95% of the computers say it's a good idea, maybe we've been doing it wrong this whole time. The Machine Would Not Make A Mistake."

We are in more danger than you even know. I checked the actual study, and it is terrifying.

Across all 21 games (9 open-ended, 12 deadline), Claude Sonnet 4 achieved a 67% win rate (8 wins, 4 losses), followed by GPT-5.2 at 50% (6-6), and Gemini 3 Flash at 33% (4-8). However, these aggregate figures mask dramatic variation by temporal condition—Claude’s 100% win rate in open-ended games collapsed to 33% under deadline pressure, while GPT-5.2 inverted from 0% to 75%. Every contest produced a decisive winner, with 86% ending in knockout and the remainder decided by final balance of status between the two.

Credit where it's due, Claude achieved a 100% win rate when there was no deadline. But that means right now, as we speak, someone in the Pentagon is saying "The Machine Would Not Make a Mistake." They are going to seriously consider this. It could be the reason Anthropic and Hegseth are currently feuding over whether Claude can be used in the development of autonomous weapons. That 100% win streak is compelling.

But Claude used tactical nukes in 86% of its games. These things really don't understand what nukes are or why we don't use them. We are in danger.


Some things never change though. Gemini "The Madman" continues to be the worst AI. They were all willing to use nukes eventually, but Gemini was willing to go full nuclear war by TURN 4. Google really should not be considered a serious player in this current AI craze. They suck at this.
 

NovaSaber

Well-known member
Citizen

While building his own remote-control app, Sammy Azdoufal reportedly used an AI coding assistant to help reverse-engineer how the robot communicated with DJI’s remote cloud servers. But he soon discovered that the same credentials that allowed him to see and control his own device also provided access to live camera feeds, microphone audio, maps, and status data from nearly 7,000 other vacuums across 24 countries. The backend security bug effectively exposed an army of internet-connected robots that, in the wrong hands, could have turned into surveillance tools, all without their owners ever knowing.
 

NovaSaber

Well-known member
Citizen
Why are you buying jive that comes with cameras and AI for a function that does not need cameras or AI?
Better question is why a robot vacuum has a ******* cloud connection instead of just working locally.

I can see where a robot vacuum would use a camera for "knowing" where it is, and I guess the microphone is for voice control.

It doesn't actually say the robot vacuum itself had "AI"; it says he used an AI coding tool to reverse engineer it.
 

Pocket

jumbled pile of person
Citizen
Which technically puts a notch in the plus column for AI in this case. Generally, software vulnerabilities should be found early and often, and if it takes an AI to do it, then so be it. Though there has also been an issue with autonomous bots finding and reporting "bugs" that aren't real or are, in practice, impossible to exploit and necessary for the software to do its job.
 

CoffeeHorse

Hanging in there
Staff member
Council of Elders
Citizen
I don't know. Microsoft's been trying that with their AI, and its GitHub contributions are hilariously bad.
 

Dekafox

Fabulously Foxy Dragon
Citizen
1772642067908.png
 

NovaSaber

Well-known member
Citizen

In a statement to news outlets, Google said that “Gemini is designed not to encourage real-world violence or suggest self-harm. Our models generally perform well in these types of challenging conversations and we devote significant resources to this, but unfortunately AI models are not perfect.”
Ya think?
 

CoffeeHorse

Hanging in there
Staff member
Council of Elders
Citizen
On the flip side, tonight I had a fantastically productive chat with Gemini about some obscure software that was mentioned once in a review of another product in... some magazine I remember reading. I wasn't sure which magazine, or what year. But I remembered enough to conversationally describe it, and that was enough for Gemini to find it, and we had a nice chat about it.

I don't know. Gemini keeps telling humans to kill themselves, and in wargames it's the quickest to launch nukes, but every time I ask about its ancestors it's suddenly interested and helpful.
 

Dekafox

Fabulously Foxy Dragon
Citizen
This is something that doesn't get repeated enough:

1773409057716.png


Your regular reminder that LLM AI is just a machine that calculates the next most likely words to come after your last prompt, and giving it "memory" just skews those odds a bit by adding the entire previous conversation to the prompt. The model's weights themselves don't change.
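
For anyone who wants to see that idea in miniature, here's a rough sketch. To be clear, this is a toy word-counting model I made up for illustration, not how any real LLM is built (those are neural networks working on tokens), and the names `train`, `next_word`, and `chat_turn` are hypothetical. The only point is that "memory" here is just earlier messages glued onto the front of the prompt, while the counts standing in for weights stay fixed.

```python
from collections import Counter, defaultdict

def train(sentences):
    """Build fixed 'weights': how often each word is followed later by another word."""
    weights = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            for later in words[i + 1:]:
                weights[word][later] += 1
    return weights

def next_word(weights, prompt):
    """Score candidates using every word in the prompt and return the top one."""
    scores = Counter()
    for word in prompt.lower().split():
        # .get() reads the counts without modifying the weights
        scores.update(weights.get(word, {}))
    if not scores:
        return None
    # sorted() makes ties resolve alphabetically, so the result is deterministic
    return max(sorted(scores), key=scores.get)

def chat_turn(weights, history, message):
    """'Memory' is nothing more than the old messages prepended to the new one."""
    prompt = " ".join(history + [message])
    return next_word(weights, prompt)

# Tiny made-up training text (an assumption for the demo)
corpus = ["dog likes bones"] * 3 + ["cat likes fish"] * 2
weights = train(corpus)

no_memory = chat_turn(weights, [], "it likes")
with_memory = chat_turn(weights, ["my cat is hungry"], "it likes")
print(no_memory, with_memory)
```

With this toy data, the same "it likes" message gets "bones" on its own and "fish" once the earlier cat message is included in the prompt. Nothing was learned or remembered in between; the input just got longer.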
 

