LLM Adversarial Inputs

Eval engineering: The missing piece of agentic AI governance

As artificial intelligence agents become more powerful, agentic AI governance becomes increasingly important – and yet, today ...

Monitoring LLM behavior: Drift, retries, and refusal patterns

The offline pipeline's primary objective is regression testing — identifying failures, drift, and latency before production. Deploying an enterprise LLM feature without a gating offline evaluation ...

Data Security Considerations For Building Enterprise AI Agents

Organizations need to internalize a simple principle: Calling an LLM API is a data transfer. You're trusting the provider ...

Harvard Business School

Certifying LLM Safety Against Adversarial Prompting

Kumar, Aounon, Chirag Agarwal, Suraj Srinivas, Aaron Jiaxun Li, Soheil Feizi, and Himabindu Lakkaraju. "Certifying LLM Safety Against Adversarial Prompting." Conference on Language Modeling (2024).

InfoWorld

Protecting LLM applications with Azure AI Content Safety

New tools for filtering malicious prompts, detecting ungrounded outputs, and evaluating the safety of models will make generative AI safer to use. Both extremely promising and extremely risky, ...

Popular Science

Researchers found a command that could ‘jailbreak’ chatbots like Bard and GPT

VentureBeat

Defending SOCs Under Siege: Battling Adversarial AI Attacks

With 77% of enterprises already victimized by adversarial AI attacks and ...

InfoWorld

How to test large language models

Companies investing in generative AI find that testing and quality assurance are two of the most critical areas for improvement. Here are four strategies for testing LLMs embedded in generative AI ...
