- By quade
- 1 July 2024
OpenAI Unveils CriticGPT To Help Find AI Coding Errors
OpenAI has reportedly designed a new AI model based on GPT-4, called CriticGPT. As more and more people use AI for coding, the company has built a model specifically to help coders catch mistakes in AI-generated code.
The main thing this system offers is the ability to identify coding errors in the output of AI systems such as ChatGPT. The system is already in its trial stage and the results seem very promising, leaving developers hopeful that it will see widespread adoption.
The Need for Systems Like CriticGPT
The GPT-4 models that power ChatGPT are useful in part because they are trained with RLHF (Reinforcement Learning from Human Feedback). In this process, human trainers rate the model's responses, and those ratings are used to make the model more interactive and useful. So if it makes a mistake when generating code, you can explain what it did wrong and it can try to improve its next response.
However, that doesn’t mean that errors disappear entirely. While feedback can remove the most obvious issues, some of the more subtle ones remain, and these are harder for coders to spot and to explain.
This is where CriticGPT comes in, as it can scan AI-generated code and search for errors. In a study titled ‘LLM Critics Aid in Detecting LLM Errors,’ CriticGPT is shown to be highly effective at finding errors that might not be obvious. This is especially useful for catching AI hallucinations, which are harder to detect in walls of code.
Researchers tested CriticGPT on a dataset of code samples containing intentionally inserted bugs to see how useful it is. They found that the system was able to recognize and flag most of the mistakes. OpenAI reports that people who review ChatGPT code with help from CriticGPT outperform those without help about 60% of the time.
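To make the idea of testing against inserted bugs concrete, here is a minimal Python sketch of how such an evaluation could be scored. It is not the method used in the study: the `Sample` class, the function names, the file names, and all of the numbers are hypothetical and made up for illustration. It assumes each sample records the line numbers where bugs were planted, and that a critic returns the line numbers it flagged.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """A code sample plus the line numbers where bugs were deliberately inserted."""
    name: str
    inserted_bug_lines: set[int]


def detection_rate(samples, flagged):
    """Return the fraction of inserted bugs that the critic flagged.

    `flagged` maps a sample name to the set of line numbers the critic reported.
    """
    total = sum(len(s.inserted_bug_lines) for s in samples)
    if total == 0:
        return 0.0
    caught = sum(
        len(s.inserted_bug_lines & flagged.get(s.name, set())) for s in samples
    )
    return caught / total


def false_positive_count(samples, flagged):
    """Count flagged lines that do not correspond to any inserted bug."""
    return sum(
        len(flagged.get(s.name, set()) - s.inserted_bug_lines) for s in samples
    )


# Made-up data for illustration only; these are not results from the study.
samples = [
    Sample("sort.py", {4, 12}),
    Sample("parse.py", {7}),
    Sample("auth.py", {3}),
]
flagged = {
    "sort.py": {4, 12, 20},  # both bugs caught, plus one false positive
    "parse.py": set(),       # bug missed
    "auth.py": {3},          # bug caught
}

print(f"Detection rate: {detection_rate(samples, flagged):.0%}")
print(f"False positives: {false_positive_count(samples, flagged)}")
```

With this toy data, three of the four planted bugs are flagged and one flagged line is a false positive. A real evaluation would also need human judgment, since a critique can point at the right line and still explain the bug incorrectly.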
What Else Is To Come
During the testing of CriticGPT, researchers noted that the system provided more comprehensive critiques and raised fewer false positives than ChatGPT did without it.
The researchers reported:
“A second trainer preferred the critiques from the Human+CriticGPT team over those from an unassisted reviewer more than 60 percent of the time.”
-Report from LLM Critics Aid in Detecting LLM Errors.
However, some critics have pointed out issues with CriticGPT’s training, as most of it comes from ChatGPT responses. If this system is to truly become an effective AI code checker, it needs a broader dataset so it can handle more tasks.
This means training it on more extensive information and teaching it to handle different kinds of mistakes and scenarios. Doing that requires plenty of research, time, and effort. As it stands, the system might not be capable of handling this.
The other issue is that it does not fully solve the problem of hallucination, where a model presents incorrect information as if it were factual. This has remained a constant issue with ChatGPT, and even CriticGPT cannot fully address it.
How Can CriticGPT Benefit BPOs?
Even with its drawbacks, many IT BPO services may still benefit from using CriticGPT. The biggest advantage is a reduction in serious coding problems. With this system, you can reduce the number of mistakes in AI-generated code, which benefits services like these in two ways.
The first is faster turnaround time for projects, as these companies can rely more heavily on automated checks like this. The other is cost reduction, since not needing a full team to check code helps create cost-effective offshore teams.
This may speed up the transition to offshore IT services, as BPO services like our geniusOS teams are well placed to handle these issues.
Here at geniusOS, we understand that checking code quality is as important as creating code. That is why we pay close attention to systems like this to see how they can further improve our services. If this continues to be a success, then we will be sure to incorporate this in our services. If you want to see what we offer, you can reach out to us here.