I'm looking for an article showing that LLMs don't know how they work internally

lgsp@feddit.it@feddit.it · edit-2 2 months ago

I'm looking for an article showing that LLMs don't know how they work internally

WolfLink@sh.itjust.works · 2 months ago

The environmental toll doesn’t have to be that bad. You can get decent results from single high-end gaming GPU.

glizzyguzzler@lemmy.blahaj.zone · 2 months ago

You can, but the stuff that’s really useful (very competent code completion) needs gigantic context lengths that even rich peeps with $2k GPUs can’t do. And that’s ignoring the training power and hardware costs to get the models.

Techbros chasing VC funding are pushing LLMs to the physical limit of what humanity can provide power and hardware-wise. Way less hype and letting them come to market organically in 5/10 years would give the LLMs a lot more power efficiency at the current context and depth limits. But that ain’t this timeline, we just got VC money looking to buy nuclear plants and fascists trying to subdue the US for the techbro oligarchs womp womp