• MangoCats@feddit.it
      1 day ago

      The “better models” have been interesting to watch progress over the past year. I’d say the free-to-use models today are better than the best that were available a year ago. The ones with bigger context windows use more resources and sometimes give better results, but often not. In LLMs, managing what is and is not in the context window seems to be the key to the kind of results you get, and it feels like they have been “learning” to self-manage their context windows quite a bit better over the past 12 months.

      • realitista@lemmus.org
        1 day ago

        I agree. Over time I have learned to be a lot more careful with the context window and to periodically start over to keep it small. This was one of the reasons I left free ChatGPT: it seemed to have a very small context window and was not graceful at all about going outside it. The Gemini free tier was a lot more graceful about this. I think the advantage of the paid tiers is simply that they will try to manage for longer and report to you how big your context window has gotten, so you have more time and you know when to start thinking about starting from scratch again.

        • MangoCats@feddit.it
          19 hours ago

          I haven’t tried lately, but several months ago I tried asking the chatbots directly: “What’s the size of your context window?” Gemini answered straight out: “32,767 tokens, and that’s not as good for developing complex software as a larger context window like Claude Sonnet’s 200,000 tokens.”

          • realitista@lemmus.org
            7 hours ago

            Gemini is 1 million now, but you should probably stop before then. And yes, it’s surprisingly honest about whether it’s the right model for your needs. It has recommended that I go with Claude for some of my projects.

            • MangoCats@feddit.it
              12 hours ago

              Back when Sonnet was 200K and Opus was 1M, there were a lot of complex programming projects where I actually got better overall results out of Sonnet. But go back to the 3.x days, and Sonnet got stuck in debug loops fairly often, where Opus would more often break out of the loop and find a working solution.