jaykrown@lemmy.world · 2 days ago

The entire business model of AI companies is that the product makes mistakes, so you have to keep using it to fix the mistakes it creates.

MangoCats@feddit.it · 1 day ago

The “better models” have been interesting to watch progress over the past year. I’d say the free-to-use models today are better than the best that were available a year ago. The ones with bigger context windows use more resources and can sometimes give better results, but often don’t. In LLMs, management of what is, and is not, in the context window seems to be the key to the kinds of results you get, and it feels like they have been “learning” to self-manage their context windows quite a bit better over the past 12 months.

realitista@lemmus.org · edit-2 1 day ago

I agree. Over time I have learned to be a lot more careful with the context window and to periodically start over to keep it small. This was one of the reasons I left the free ChatGPT: it seemed to have a very small context window and was not graceful at all about going outside it. Gemini’s free tier was a lot more graceful about this. I think the advantage of the paid tiers is simply that they will try to manage for longer and report to you how big your context window has gotten, so you have more time and you know when to start thinking about starting from scratch again.
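A minimal sketch of that "watch the size, then start over" habit, for anyone scripting against a model rather than using the chat UI. Everything here is an assumption for illustration: the `ContextBudget` class is hypothetical, the 4-characters-per-token estimate is a crude rule of thumb rather than a real tokenizer, and the 50% warning threshold is arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class ContextBudget:
    """Tracks a rough token count for one chat and flags when to start over."""

    limit_tokens: int            # hypothetical window size for the model in use
    warn_fraction: float = 0.5   # assumed threshold; tune to taste
    messages: list[str] = field(default_factory=list)

    @staticmethod
    def estimate_tokens(text: str) -> int:
        # Crude heuristic (about 4 characters per token), not a real tokenizer.
        return max(1, len(text) // 4)

    def add(self, message: str) -> None:
        self.messages.append(message)

    @property
    def used_tokens(self) -> int:
        return sum(self.estimate_tokens(m) for m in self.messages)

    def should_restart(self) -> bool:
        # True once the estimated usage reaches the warning threshold.
        return self.used_tokens >= self.limit_tokens * self.warn_fraction

    def restart(self, summary: str) -> None:
        # Begin a fresh conversation seeded only with a short summary.
        self.messages = [summary]


budget = ContextBudget(limit_tokens=1000)
budget.add("x" * 1600)            # estimated 400 tokens
print(budget.should_restart())    # False: 400 is below the 500-token threshold
budget.add("y" * 800)             # estimated 200 more, 600 total
print(budget.should_restart())    # True
budget.restart("Summary of the work so far.")
```

Carrying a short summary into the new conversation is one way to "start from scratch" without losing the thread; a real setup would use the provider's own token counts instead of the character estimate.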

MangoCats@feddit.it · 20 hours ago

I haven’t tried lately, but several months ago I tried asking the chatbots directly: “What’s the size of your context window?” Gemini answered straight out: “32,767 tokens, and that’s not as good for developing complex software as a larger context window like Claude Sonnet’s 200,000 tokens.”

realitista@lemmus.org · edit-2 8 hours ago

Gemini’s context window is 1 million tokens now. But you should probably stop before you get there. And yes, it’s surprisingly honest about whether it’s the right model for your needs. It has recommended that I go with Claude for some of my projects.

MangoCats@feddit.it · 13 hours ago

Back when Sonnet was 200K and Opus was 1M, there were a lot of complex programming projects where I actually got better overall results out of Sonnet. But go back to the 3.x days, and Sonnet got stuck in debug loops fairly often, where Opus would more reliably break out of the loop and find a working solution.