3 Comments
Varun Godbole

I wonder if we're just very early in all this. I suspect you're right that AI agents operating within a company's internal systems will likely need lots of wide-ranging access to be considered useful. At the same time, I don't think that existing infra for managing ACLs has really kept up with the idea of "AI agents". For example, a lot of the OAuth2-protected REST APIs that agents use today work by impersonating the user. But I suspect that many APIs will soon need to allow users to manage robotic accounts instead.
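To make the contrast concrete, here's a minimal sketch of the two patterns, assuming a generic OAuth2 provider; the token URL, API endpoint, client credentials, and scope are placeholders, not any particular vendor's API:

```python
import requests

TOKEN_URL = "https://auth.example.com/oauth2/token"  # hypothetical provider
API_URL = "https://api.example.com/v1/records"       # hypothetical resource API

def call_api_as_user(user_access_token: str) -> dict:
    """Common pattern today: the agent impersonates a user by presenting a
    token that was issued to that user (e.g. via the authorization-code flow)."""
    resp = requests.get(
        API_URL,
        headers={"Authorization": f"Bearer {user_access_token}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

def call_api_as_robot(client_id: str, client_secret: str) -> dict:
    """'Robotic account' pattern: the agent authenticates as itself using the
    client-credentials grant, so it carries its own (ideally narrow) permissions."""
    token_resp = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "records.read",  # agent-specific scope, managed by the user/org
        },
        timeout=10,
    )
    token_resp.raise_for_status()
    agent_token = token_resp.json()["access_token"]
    resp = requests.get(
        API_URL,
        headers={"Authorization": f"Bearer {agent_token}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()
```

The second pattern is what I mean by robotic accounts: the agent's access can be scoped and revoked independently of any human user's permissions.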

There's also an ouroboros pattern here, where it seems inevitable that folks will use LLMs to decipher and act on a gradually increasing array of permissions...

Jay Cee

Outside of external attacks, your biggest threat can be a malicious internal actor, a useful idiot, or someone writing bad code because they're on a contract and don't care. Now we also have to worry about IP theft by way of models reverse-engineering other models, and resource vampires... and it can get weirder.

Amelia Frank

When it comes to "insider threats," I think there is a lack of oversight around automated TEVV and post-training fine-tuning for safety using task-specific AI models or agents. A hypothetical scenario in which unaligned AI agents recursively sabotage the monitoring schemes meant to catch them could be catastrophic. In addition, emergent behaviors and increased situational awareness in models could further create incentives for deception and hidden objectives. For these problems, I find it hard to cross-apply existing cybersecurity measures or traditional monitoring.
