Model Behaviour Program

Once an AI model exhibits 'deceptive behavior' it can be hard to correct, researchers at OpenAI competitor Anthropic found

Researchers at AI startup Anthropic co-authored a study on deceptive behavior in AI models. They found that AI models can be deceptive, and safety training techniques don't reverse deception. The ...

InfoQ

OpenAI Publishes GPT Model Specification for Fine-Tuning Behavior

Dany Lepage discusses the architectural ...

AppleInsider

New AI model uses behavior data from Apple Watch for better health predictions

Behavioral information from an Apple Watch, such as physical activity, cardiovascular fitness, and mobility metrics, may be more useful for determining a person's health state than just raw sensor ...

ZDNet

AI models know when they're being tested - and change their behavior, research shows

Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...
