OpenAI gets caught vibe graphing

During its big GPT-5 livestream on Thursday, OpenAI showed off a few charts that made the model seem quite impressive. Look closely, though, and some of the graphs were a little bit off.

In one, ironically showing how well GPT-5 does in “deception evals across models,” the scale is all over the place. For “coding deception,” for example, GPT-5 apparently gets a 50.0 percent deception rate, yet o3’s lower score of 47.4 percent is somehow drawn with a larger bar.

In another chart, one of GPT-5’s scores is lower than o3’s but is shown with a bigger bar. In the same chart, o3’s and GPT-4o’s scores are different but are shown with equally sized bars. That chart was bad enough that CEO Sam Altman commented on it, calling it a “mega chart screwup.” An OpenAI marketing staffer also apologized for the “unintentional chart crime.”

OpenAI didn’t immediately reply to a request for comment. And while it’s unclear whether OpenAI used GPT-5 to actually make the charts, it’s still not a great look for the company on its big launch day, especially when it’s touting the “significant advances in reducing hallucinations” with its new model.


By 12free
