Discussion about this post

Andre Kramer

I've added an analysis from OpenAI's ChatGPT (o3-mini) pointing out what this scheme does not cover. You can also ask an LLM about the advantages of this scheme over a monolithic agent, where all the safety guarantees sit within a single model. I think using AI like this is a good way to motivate AI safety!

