Manager: Before we put these models into something that can implement code, we should test what they would do.
LLM: Tries to do bad things but can’t, because that functionality hasn’t been implemented.
Researchers: We found it doing bad things. Perhaps fix that before implementing the functionality.
This thread: The researchers are lying! It didn’t do bad things because it can’t! That isn’t implemented!
Manager: Yes… hence the test.