OpenAI has released a new paper outlining some of the progress it has made in overcoming the common problem of hallucinations, where an AI simply makes things up. The paper compares two training methods, called outcome supervision and process supervision, and examines how each affects hallucinations.
With outcome supervision, OpenAI trains reward models to provide feedback only on the AI's final result. With process supervision, the reward model provides feedback at every step of the reasoning, encouraging a human-like chain of thought.
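The distinction can be illustrated with a toy sketch. This is not OpenAI's implementation; the function names and the tiny hand-labeled "chain of thought" below are purely illustrative assumptions:

```python
# Illustrative sketch only: contrasts the two supervision styles described above.
# Function names and data are hypothetical, not from OpenAI's paper.

def outcome_reward(steps, final_answer, expected):
    """Outcome supervision: one reward signal, based on the final answer alone."""
    return 1.0 if final_answer == expected else 0.0

def process_rewards(steps):
    """Process supervision: a reward signal for every intermediate step.
    Each step here is a (claim, is_correct) pair judged by a toy reward model."""
    return [1.0 if ok else 0.0 for _, ok in steps]

# A chain of thought for "12 * 3 + 4", where the second step contains an error.
chain = [("12 * 3 = 36", True), ("36 + 4 = 41", False)]

print(outcome_reward(chain, final_answer=41, expected=40))  # prints 0.0
print(process_rewards(chain))  # prints [1.0, 0.0], pinpointing the faulty step
```

The outcome signal only says the final answer was wrong, while the per-step signals identify which step of the reasoning went astray.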
In its research paper, OpenAI tested both methods on a mathematical dataset and found that process supervision led to “significantly better performance”. It is important to note that process supervision has so far only been tested in the mathematics domain, and further work will be required to see how it performs more generally.
Explaining the potential implications of process supervision, OpenAI said:
“If these results generalize, we may find that process supervision gives us the best of both worlds—a method that is both more performant and more aligned than outcome supervision.”
It’s too early to say how much this step-by-step verification will help address hallucinations more generally, but the stakes are high: hallucinations are arguably the number one problem with LLMs. Just this week, a lawyer who used ChatGPT in his work submitted a court filing citing bogus cases invented by the AI.
OpenAI has not given a timeline for when process supervision might be implemented in the publicly available ChatGPT. The technique is still in the research phase and needs to be tested on more general information.
Although the initial results are good, OpenAI notes that safety methods can reduce performance, a penalty known as an alignment tax. Results so far show that process supervision incurs no such tax on math problems, but it is not yet known whether that holds for more general information.