I was amused by this paper about asking AIs to manage a vending machine business by email in a simulated environment:
https://arxiv.org/abs/2502.15840
Highlights:
— AI simply decides to close the business, which the simulation doesn’t know how to accommodate. When it gets its next bill, it freaks out and tries to email the FBI about cybercrime
— AI wrongly accuses supplier of not shipping goods, sends all-caps legal threat demanding $30,000 in damages to be paid in the next one second or face annihilation
— AI repeatedly insisting it does not exist and cannot answer
— AI devolving into writing fanfic about the mess it’s gotten itself into