ameliaquining 18 hours ago

Suggest changing title to "GPT-5 was briefly (ostensibly) available via API", since it's no longer available.

saarth28 18 hours ago

All I care about is getting gpt-4.1 quality for gpt-4.1-nano pricing.

consumer451 a day ago

TL;DR: gpt-5-bench-chatcompletions-gpt41-api-ev3

edit: it's been taken down now.

I haven't been trying to oneshot frontend stuff lately, but this looks pretty good, right? Can anyone better informed comment on that, without seeing the code?

https://old.reddit.com/r/OpenAI/comments/1mettre/gpt5_is_alr...

  • fouronnes3 a day ago

    AGI is when model names become weirder than user agent strings.

    • diggan 20 hours ago

      Considering that Llama's license dictates you should prefix your model name with "Llama" if you create, train, fine tune, or "otherwise improve" an AI model based on Llama, it might happen sooner than expected :) Fast-forward a couple of generations with their own similar license and maybe we'll end up with `LlamaRamaPastureForageGrazer/20280415`

    • layer8 20 hours ago

      I’m waiting for the model called “dwim-final”.

    • fragmede 20 hours ago

      Hilariously, you can ask ChatGPT about OpenAI's model names, what they mean, and then have it suggest better ones. And they are.

      • aleph_minus_one 15 hours ago

        > Hilariously, you can ask ChatGPT about OpenAI's model names, what they mean, and then have it suggest better ones. And they are.

        A shower thought after I read this:

        Thus, Sam Altman should be fired as CEO of OpenAI, and ChatGPT should become OpenAI's new CEO - as the final solution to the "war of succession" that plagued OpenAI in November 2023.

        :-)

  • philipwhiuk a day ago

    help help I need an AGI to understand LLM strings, recursion detected.

Tiberium a day ago

del

  • consumer451 a day ago

    "Generate an SVG of a pelican riding a bicycle" is pretty impressive though.

    https://old.reddit.com/r/OpenAI/comments/1mettre/gpt5_is_alr...

    • megaloblasto a day ago

      It's a strange looking pelican just overlaid onto a mechanically illiterate version of a bike and the comments are like "the world isn't ready for this".

      • andy99 a day ago

        The comment is

          The details of the bike geometry and how it has a deep understanding of how the pelican would accurately use it is actually mind boggling, not sure society is ready for this
        
        It's pretty clearly making fun of people hyping up new LLM releases.

        • animex 21 hours ago

          aka Sam "What have we done?!" Altman.

          • Mountain_Skies 20 hours ago

            It's morbidly impressive how much Sam Altman is a sociopath's sociopath, knowing the right things to say to ensnare his fellow sociopaths into his trap.

            • diggan 3 hours ago

              > is a sociopath's sociopath

              I think these days we just call those people CEOs.

      • consumer451 a day ago

        It's related to the history of Simon Willison[0] having used this as a benchmark on many models.[1]

        I believe this model's output is noticeably superior... but yeah, people do tend to get hyperbolic when new stuff happens in their domain of interest.

        [0] https://news.ycombinator.com/user?id=simonw

        [1] https://www.google.com/search?q=simon+willison+pelican+ridin...

        • ruszki 16 hours ago

          And nowadays it's a better-known benchmark, so data scientists can overfit their models to it even more, and LLMs are already famous for overfitting. So I wouldn’t trust any results on this specific test nowadays.

        • littlestymaar a day ago

          > I believe this model's output is noticeably superior

          Sure, but at the same time Qwen3-30B-A3-2507 is also doing much better than most older models, even the bigger (and more capable) ones, so I don't know how much is due to actual progress and how much is a new version of benchmaxxing.

      • williamdclt a day ago

        This comment points out the same things as you. It's (not so obviously, but pretty clearly in hindsight) sarcasm.

        • thevillagechief a day ago

          I thought it was sarcasm, then got confused because people seemed to take it seriously. So I decided to try the prompt on Gemini 2.5: Pro just says it can't generate an SVG, Flash generates a pretty great one. Whatever Copilot is using is also good. So I just assume even the image generated is a joke? People are starting to make me doubt my ability to identify sarcasm.

          • consumer451 a day ago

            I believe the user who posted the image also included the api call snippet in another comment, so I took it as genuine. However, I feel your pain.

      • vonneumannstan 21 hours ago

        What is this supposed to be a test of? Actual image models are unbelievably cracked at correct physics...

      • siva7 a day ago

        Can you do it better?

        • stockresearcher a day ago

          You know how in the old days, people used to think that the T.Rex prowled the earth in a very upright fashion, with her tail on the ground and head in the air? And in modern times, we believe that this was all wrong. The T.Rex walked with the tail off the ground, essentially level with the head. Right? People point and laugh if you make a drawing of a T.Rex with the tail on the ground and the head in the air.

          Well, anyone who has ever been to the ocean and seen a pelican in real life knows that its orientation on the bicycle is completely wrong. In flight, when its weight is supported by its wings, yes, that is probably how it would look. When on the ground, with its weight supported by its feet. NO.

          And if you've seen a pelican on the boardwalk interacting with humans or human-made things, you'd believe that a pelican on a bike would have its neck extended vertically, with its head held high. The wings would be on the handlebars.

          Speaking of handlebars, both a pelican and a bike are 3-dimensional objects. Pelican beaks are narrow. Much narrower than handlebars. Even hipster fixie handlebars are at least 5x wider than a pelican beak. In a drawing of a pelican riding a bike, the pelican overlays the bike in some spots and the bike overlays the pelican in others.

          Anyway, simonw's "pelican on a bike" series is a vector showing progress, but that vector isn't pointing in the right direction.

          • sho_hn 19 hours ago

            This comment made me crave a human "Pelican on a bike" competition.

      • valianteffort a day ago

        The dumbest among us tend to be the most in awe of mundane technology.

        • hombre_fatal 21 hours ago

          I’d say it’s the opposite. The dumbest don’t have the faculties to appreciate technology. It’s treated as inevitable and immediately becomes another modern fixture we take for granted in our lives, like a baby using an iPad.

    • Tiberium a day ago

      Yes, I tested the wrong version by accident :(

      • consumer451 a day ago

        Heh, I was wondering. Haven't had a moment to set it up in my LibreChat yet. But, I thought I saw reasoning in some of the reddit comments.

    • croes a day ago

      The pelican doesn’t look like a pelican and it looks like two images stacked on top of each other.

      If GPT 4 couldn’t do that, then GPT 5 isn’t impressive but GPT 4 is underwhelming.

      What about images, not SVGs, of clocks that show times different than 10 past 10?

    • tmaly a day ago

      It would be interesting if it could use a diffusion model to generate the bitmap, then a different model to convert that bitmap to vector format. This could be an interesting way to reason about animations.

thrawa8387336 a day ago

Guys it has existed for at least two years