Is your favorite AI chatbot scheming against you? If "AI scheming" sounds ominous, you should know that OpenAI is actively studying this phenomenon. This week, OpenAI published a study conducted ...
Working out whether an AI is secretly doing things we don’t want it to do is central to deciding whether the increasingly powerful systems we are building are safe. To date, one of the main ways of doing ...
Alignment and safety, OpenAI argues, need to move as quickly as capability. An AI model wants you to believe it can't answer how many grams of oxygen are in 50.0 grams of aluminium oxide (Al₂O₃). When ...