OpenRouter shows that multiple smaller models working together surpass frontier performance

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 12 days ago

LittleFellaNamedBoof [any]@hexbear.net · 11 days ago

I’ve actually got a theory about what is going to be the standard in like 10 years. Basically I think everything will be local, and the way it will be structured is you will have a single “Supervisor” model that manages a plethora of small specialist models. It will load and unload them from system memory as needed and act as the go-between for the user and the model cluster. That way these can run on really low amounts of memory. Say you have 10 models that each need 10GB of RAM. You’d need 100GB to keep all of them loaded the way things run now. But if you had another model that could reserve a single 10GB pool, assign it to whichever of the 10 is needed, and unload and reload them on demand, you’d only need that 10GB, while keeping the same level of performance.
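To make the memory arithmetic concrete, here is a minimal sketch of the load/unload idea, not anything from the OpenRouter post. Everything in it is an assumption for illustration: the `Supervisor` class, the `expert_N` names, the least-recently-used eviction rule, and the fact that "loading" is just bookkeeping rather than reading real weights.

```python
from collections import OrderedDict


class Supervisor:
    """Toy scheduler that keeps specialist models inside a fixed memory budget."""

    def __init__(self, budget_gb, sizes_gb):
        self.budget_gb = budget_gb
        self.sizes_gb = sizes_gb      # hypothetical: model name -> footprint in GB
        self.loaded = OrderedDict()   # resident models, least recently used first
        self.loads = 0                # number of (re)loads performed
        self.peak_gb = 0              # highest memory use observed

    def used_gb(self):
        return sum(self.loaded.values())

    def acquire(self, name):
        """Ensure `name` is resident, evicting least recently used models if needed."""
        size = self.sizes_gb[name]
        if size > self.budget_gb:
            raise MemoryError(f"{name} needs {size} GB but the budget is {self.budget_gb} GB")
        if name in self.loaded:
            self.loaded.move_to_end(name)    # already resident: mark as recently used
            return
        while self.used_gb() + size > self.budget_gb:
            self.loaded.popitem(last=False)  # unload the least recently used model
        self.loaded[name] = size             # stand-in for reading weights from disk
        self.loads += 1
        self.peak_gb = max(self.peak_gb, self.used_gb())

    def handle(self, task, specialist):
        """Route a task to a specialist; a real supervisor would choose it itself."""
        self.acquire(specialist)
        return f"[{specialist}] {task}"


# The example from the comment: 10 specialists at 10 GB each, one 10 GB pool.
sup = Supervisor(budget_gb=10, sizes_gb={f"expert_{i}": 10 for i in range(10)})
for i in range(10):
    sup.handle("some task", f"expert_{i}")
print(sup.peak_gb, sup.loads)
```

Peak usage stays at the size of the pool, but every switch between specialists costs a reload, and the supervisor model needs some memory of its own. So in a sketch like this the saving is in RAM, and the price is paid in latency whenever the workload hops between experts.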

I think the reason the US doesn’t have a chance in the AI race is because philosophically they just don’t function like this. They’re all about power. That’s why they build datacenters. More compute = more better AI in their mind. But efficiency is the name of the game in anything. If you can do the same thing with 10x fewer resources you’ll win every time. It’s the same reason Iran just beat the US. Sustainable, efficient, and structural thinking.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 11 days ago

Yeah that’s exactly what I’m expecting as well. The real difference in philosophy is that American companies treat the model as the product, while Chinese companies see models as infrastructure you build products on top of. You amortize the cost of deploying it at scale by sharing knowledge and iterating quickly to bring the cost down.

Models themselves are general purpose tools, so it’s not where the money is going to be long term. There’s a reason everybody isn’t rolling their own operating systems for example. It makes a lot more sense to treat models as shared infrastructure everyone contributes to. We’ll likely converge on a handful of common architectures because there’s really not much difference between them at the end of the day. Everybody curating their own model is a huge duplication of effort with no clear benefit.

If you treat the model as the product, then it makes sense to keep it closed. You have some secret sauce that nobody else has, and you sell it. But the reality is that nobody has a magic formula that’s significantly better than what other people can figure out. You might get an advantage for a few months tops, and then other models start catching up.

And this creates involution: a race to the bottom where nobody makes any money. On the other hand, if you treat models as infrastructure, and everybody contributes to the same pool of knowledge, then you amortize the cost of making a better model. The money comes from actual products that can genuinely differentiate themselves. Companies are going to seek niches they can dominate where they do a specific thing really well. That’s a much more realistic path towards long term sustainability.

And continuing to work in the open with the rest of the world means getting the benefit of having a global community of researchers helping move this tech forward. It’s not just altruism or clout. American companies working on closed models have to foot the bill for all the research, and they’re limited to the brainpower within the company while they’re competing with Chinese companies which have a much bigger research community contributing to developing their models.

If the model itself is not the product, then American companies find themselves in a situation where they’re spending a ton of resources on something that’s not their core business.

LittleFellaNamedBoof [any]@hexbear.net · 11 days ago

Yeah the only American company that I think is going to come out of this AI hype cycle unscathed is Apple. They’re the only one not burning their cash reserves like crazy and will be poised to take advantage once the data center rollout proves to have been a bad bet.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 11 days ago

Also, Apple seems to be the only US company to have realized that this tech will likely move to edge devices in a few years and they’ve been designing stuff for running models locally. As local models become the norm, they could see a huge boost in sales if other manufacturers can’t catch up.

Kereru [he/him]@hexbear.net · 11 days ago

Isn’t this just mixture-of-experts architecture?

GiorgioBoymoder [she/her]@hexbear.net · 11 days ago

damn the efficiency gains just keep coming don’t they?

jackmaoist [none/use name]@hexbear.net · 11 days ago

Smaller specialised models that can run locally should be the future. Claude’s pricing is insane, and it’ll be unfeasible once they actually decide to turn a profit.

Surpassing Frontier Performance with Fusion — OpenRouter Blog