Every OpenAI release elicits awe and anxiety as the technology's capabilities advance, and Sora is no exception: its strikingly realistic AI-generated video clips went viral while unsettling industries that rely on original footage. But the company is again being secretive in all the wrong places about AI that can be used to spread misinformation.

As usual, OpenAI won’t talk about the all-important ingredients that went into this new tool, even as it hands the tool to an array of testers ahead of a public release. Its approach should be the other way around. OpenAI needs to be more public about the data used to train Sora, and more secretive about the tool itself, given its potential to disrupt industries and, potentially, elections.

OpenAI Chief Executive Officer Sam Altman said that red-teaming of Sora would start on Thursday, the day the tool was announced and shared with beta testers. Red-teaming is when specialists test an AI model’s security by pretending to be bad actors who want to hack or misuse it. The goal is to make sure the same can’t happen in the real world. When I asked OpenAI how long it would take to run these tests on Sora, a spokeswoman said there was no set length. “We will take our time to assess critical areas for harms or risks,” she added.

The company spent about six months testing GPT-4, its most recent language model, before releasing it last year. If it takes the same amount of time to check Sora, that means it could become available to the public in August, a good three months before the US election. OpenAI should seriously consider waiting to release it until after voters go to the polls.

[…]

OpenAI is meanwhile being frustratingly secretive about the source of the information it used to create Sora. When I asked the company what datasets were used to train the model, a spokeswoman said the training data came “from content we’ve licensed, and publicly available content.” She didn’t elaborate further.