People need to stop expecting the first output to be correct. This is why you have per-skill reviews and harnesses that push the agent in the right direction after each pass.
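By the OP’s logic, interns and new hires would be unsafe on large coding projects too. Nobody expects their first draft to be correct either, which is why it gets reviewed before it ships. Feedback loops are essential.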
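
To make the "review after each pass" idea concrete, here is a rough sketch of what such a harness loop could look like. Everything in it is hypothetical: `run_with_feedback`, `toy_generate`, and `toy_review` are names made up for illustration, and the toy functions stand in for a real agent call and a real reviewer (tests, linter, human, or another model).

```python
from typing import Callable, List, Tuple


def run_with_feedback(
    generate: Callable[[str, List[str]], str],
    review: Callable[[str], List[str]],
    task: str,
    max_passes: int = 3,
) -> Tuple[str, int, List[str]]:
    """Generate, review, and regenerate until the review is clean or passes run out.

    Returns the last draft, the number of passes used, and any unresolved feedback.
    """
    feedback: List[str] = []
    draft = ""
    for attempt in range(1, max_passes + 1):
        # Each pass sees the reviewer's notes from the previous pass.
        draft = generate(task, feedback)
        feedback = review(draft)
        if not feedback:
            return draft, attempt, []
    return draft, max_passes, feedback


# Hypothetical stand-ins so the sketch runs without a real agent.
def toy_generate(task: str, feedback: List[str]) -> str:
    if any("docstring" in note for note in feedback):
        return 'def add(a, b):\n    """Return a + b."""\n    return a + b\n'
    return "def add(a, b):\n    return a + b\n"


def toy_review(draft: str) -> List[str]:
    # A real reviewer would run tests or a linter; this one only checks for a docstring.
    return [] if '"""' in draft else ["add a docstring"]


result, passes, remaining = run_with_feedback(toy_generate, toy_review, "write add()")
```

In this toy run the first draft fails review and the second passes, which is the whole point: the first output is an input to the loop, not the final answer.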