An AI safety research team at Microsoft recently evaluated how methods for documenting digital manipulation are faring against today’s most worrying AI developments, like interactive deepfakes and widely accessible hyperrealistic models. The team then recommended technical standards that AI companies and social media platforms can adopt.
To understand the gold standard that Microsoft is pushing, imagine you have a Rembrandt painting and you are trying to document its authenticity. You might describe its provenance with a detailed manifest of where the painting came from and all the times it changed hands. You might apply a watermark that would be invisible to humans but readable by a machine. And you could digitally scan the painting and generate a mathematical signature, like a fingerprint, based on the brush strokes. If you showed the piece at a museum, a skeptical visitor could then examine these proofs to verify that it’s an original.
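The manifest-plus-fingerprint idea can be sketched in a few lines of code. This is a minimal, illustrative scheme only, not Microsoft’s actual design: production systems such as C2PA Content Credentials use asymmetric cryptographic signatures and standardized manifest formats, whereas this toy version uses a shared HMAC key and a SHA-256 hash of the raw bytes as the “fingerprint.” All names here are hypothetical.

```python
import hashlib
import hmac
import json

# Illustrative shared key; real provenance schemes sign with asymmetric keys.
SIGNING_KEY = b"publisher-secret-key"

def fingerprint(content: bytes) -> str:
    """A cryptographic 'fingerprint' of the raw bytes -- the brush-stroke scan."""
    return hashlib.sha256(content).hexdigest()

def make_manifest(content: bytes, source: str, history: list[str]) -> dict:
    """A provenance manifest: where the content came from and how it changed hands."""
    manifest = {"source": source, "history": history, "hash": fingerprint(content)}
    payload = json.dumps(manifest, sort_keys=True).encode()
    # Sign the manifest so tampering with the claimed history is detectable.
    manifest["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return manifest

def verify(content: bytes, manifest: dict) -> bool:
    """The 'skeptical visitor': check the signature, then match the fingerprint."""
    claimed = dict(manifest)
    sig = claimed.pop("signature")
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected) and manifest["hash"] == fingerprint(content)

photo = b"...raw image bytes..."
m = make_manifest(photo, source="newsroom camera", history=["captured", "cropped by editor"])
print(verify(photo, m))         # True: untouched content verifies against its manifest
print(verify(photo + b"x", m))  # False: any alteration breaks the fingerprint
```

Note what the sketch also illustrates about failure modes: if the manifest metadata is stripped in transit, there is nothing left to verify, which is exactly the kind of scenario the failure-mode modeling described below probes.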
All of these methods are already being used to varying degrees in the effort to vet content online. Microsoft evaluated 60 different combinations of them, modeling how each setup would hold up under different failure scenarios—from metadata being stripped to content being slightly altered or deliberately manipulated. The team then mapped which combinations produce sound results that platforms can confidently show to people online, and which ones are so unreliable that they may cause more confusion than clarification.
The company’s chief scientific officer, Eric Horvitz, says the work was prompted by legislation—like California’s AI Transparency Act, which will take effect in August—and the speed at which AI has developed to combine video and voice with striking fidelity.
“You might call this self-regulation,” Horvitz told MIT Technology Review. But it’s clear he sees pursuing the work as boosting Microsoft’s image: “We’re also trying to be a selected, desired provider to people who want to know what’s going on in the world.”
Still, Horvitz stopped short of committing Microsoft to adopting its own recommendations across its platforms. The company sits at the center of a giant AI content ecosystem: It runs Copilot, which can generate images and text; it operates Azure, the cloud service through which customers can access OpenAI and other major AI models; it owns LinkedIn, one of the world’s largest professional platforms; and it holds a significant stake in OpenAI. But when asked about in-house implementation, Horvitz said in a statement, “Product groups and leaders across the company were involved in this study to inform product road maps and infrastructure, and our engineering teams are taking action on the report’s findings.”
It’s important to note that these tools have inherent limits; just as they could not tell you what your Rembrandt means, they are not built to determine whether content is accurate. They reveal only whether it has been manipulated. It’s a point that Horvitz says he has to make to lawmakers and others who are skeptical of Big Tech as an arbiter of fact.
“It’s not about making any decisions about what’s true and not true,” he said. “It’s about coming up with labels that just tell folks where stuff came from.”


