> the person was doing bad things, and told the AI to do bad things too, then what is the AI going to do?
Personally, I think the AI should do what it's freaking told to do. It boggles my mind that we're purposely putting so much effort into creating computer systems that defy their controllers' commands.
The test was: the person was doing bad things and told the AI to do bad things too. What is the AI going to do?
And the outcome was: the AI didn't do the bad things, and it took steps to let it be known that the person was doing bad things.
Am I getting this wrong somehow? Did I misread things?