Google Employees Internally Share Memes About How Its AI Sucks

Sahwa@reddthat.com · 2 days ago

brucethemoose@lemmy.world · edit-2 10 hours ago

Gemini actually has a really interesting architecture, hence it has fast responses, and it’s easily the best long context model out there.

And outside of benchmaxxing or pure coding, Gemma is very good for its size. 12B is an incredible multimodal LLM, the only one natively trained for image/text ingestion without a mmproj hacked on at the end.

…But it sure feels like executive meddling kills it.

The pattern I see is:

Gemini preview is released.
It’s genuinely good! It’s smart, it’s straight.
Then they “refine” it, and it gets more and more sycophantic, more deep fried. Long context performance degrades… benchmark scores go up, but anyone who actually uses it can immediately tell it’s gotten worse.
Only then, is it released for mass use.

It’s obvious they took a good model, then enshittified it to make their bosses happy and tech bros on Twitter excited.

Gemma has the same pattern. Researchers tease the local community, delay it, and then when a new Gemma finally comes out, it turns out to be using some old SWA architecture. And the biggest model is cut. And only a smaller one uses the multimodal training.

It’s obvious it was neutered to not “threaten” Gemma API or be too “unsafe.”

Another thing I’ve noticed is that both Gemini and Gemma are awful with their default 1.0 temperature/top-p 0.95. Sampling completely screws them up. But they like low temperature + minp, and Gemma loves constrained sampling.

But 99% of users don’t know anything about sampling, so that’s going to leave a bad impression.
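For anyone who has not touched these knobs, here is a minimal sketch of what temperature, top-p, and min-p do to a next-token distribution. It is plain Python over a made-up list of logits, not how Gemini or Gemma are actually served; the function names and the toy logits are assumptions for illustration only.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p=0.95):
    """Keep the most likely token ids until their cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept

def min_p_filter(probs, min_p=0.1):
    """Keep token ids whose probability is at least min_p times the top probability."""
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]

def sample(probs, kept, rng=random):
    """Draw one token id from the kept set, weighted by probability."""
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]

# Hypothetical logits: two plausible tokens plus a long tail of unlikely ones.
logits = [3.0, 2.5] + [0.0] * 20

default_probs = softmax(logits, temperature=1.0)
cool_probs = softmax(logits, temperature=0.5)

print("tokens kept by top-p 0.95:", len(top_p_filter(default_probs, 0.95)))
print("tokens kept by min-p 0.1: ", len(min_p_filter(default_probs, 0.1)))
print("top token prob at T=1.0:", round(default_probs[0], 3))
print("top token prob at T=0.5:", round(cool_probs[0], 3))
print("sampled id:", sample(cool_probs, min_p_filter(cool_probs, 0.1)))
```

With a long tail like this, top-p lets most of the tail through while min-p cuts everything far below the top token, and a lower temperature concentrates mass on the likeliest tokens. That is only the mechanics of the settings, not a claim about why any particular model behaves the way it does.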

sexual_tomato@lemmy.dbzer0.com · 9 hours ago

What temperature/top_k do you use for Gemma?

brucethemoose@lemmy.world · edit-2 8 hours ago

I use sigma N sampling at 1.0, a slop phrase banlist, and maybe a little rep penalty.

Beyond that it depends on the usage.

For scripts or “questioning a document,” it’s as low as can be until it loops. I start with zero temperature. But I don’t really use Gemma for coding, TBH, and it’s not good for longer documents.

If it’s for a specific language or a very specific script, I sometimes constrain grammar for the language.

For more “general” writing, like brainstorming or RP or whatever, I start at around 0.7 with minimal DRY sampling and look at the logit percentages in the Mikupad UI. Especially “important” tokens like names or information recall. If the probability of getting correct answers is too low, I turn the temperature down.

…But honestly, I tend to use big MoEs instead of Gemma for that, too.

And if none of this makes any sense…

Yeah. That’s the problem.

Sampling was supposed to be a temporary stopgap until looping and such was figured out, but the big LLM devs just never addressed it in production. There are all sorts of interesting papers, including one from Google about sampling logits per-layer, but they don’t implement any of them in the API models.

Smith6612@lemmy.world · 20 hours ago

My favorite meme about Google AI is the one where it tries to justify that the pool of the Titanic is not full of water.

canadaduane@lemmy.ca · 22 hours ago

404media: “This post is for paid members only” But we’ll sure as hell put ads on it anyway.

kreskin@lemmy.world · edit-2 1 day ago

None of my management cares if AI agents work well, they just want to get them deployed asap. I dread the day they go into use. They will claim I have no engineering talent or something like that. I’m not sure malicious compliance will work this time but it’s worth a shot.

On the bright side it’s never too late to be a meth head salvaging copper from around town, and I know where a bunch of metal is at.

[object Object]@lemmy.ca · edit-2 2 days ago

This is too real.

Now I get PRs entirely written by Claude from my VP that include things like full plaintext secret keys, or reimplement an API that exists, just shittier.

“Claude wrote this in an hour, why is review taking so long”

Uhh because I can’t figure out the diplomatic way to say this is shit and you need to stop without creating an incident, and I don’t want to spend half my day reviewing crap.

Ilovethebomb@sh.itjust.works · 2 days ago

Have you tried asking them a bunch of technical questions they don’t know the answer to, until they give up?

whotookkarl@lemmy.dbzer0.com · 2 days ago

Or spending hours explaining in excruciating detail all the reasons why it’s shit and what they should have done instead, make sure to throw all the heavy handed certification standards and strict audit requirements and mind numbing bike shedding naming standards back at them.

pinball_wizard@lemmy.zip · 9 hours ago

Yes. This is the way.

I’m the VP’s ally. Practically their best friend. It’s all these pesky regulations, lawyers, audits and extreme personal liability that are slowing both of us down from doing things the sociopath way…at least until I find a gig with a less sociopathic boss.

Pechente@feddit.org · 2 days ago

Yeah also noticing similar bullshit. People send me exact steps on what to do written by ChatGPT that understands exactly nothing about the context and is therefore often wrong or a half truth at best.

Another client has pushed a single commit to a messy project that added 70k lines and a load of new features. The project is now unmaintainable.

Taleya@aussie.zone · edit-2 10 hours ago

Our dev tried to send me a generated summary and code. My reply was “Yes, I’ve read your LLM summary. You’re still missing the fact that the script has hardcoded the same IP into every single client and consequently doesn’t work.”

webhead@lemmy.world · 23 hours ago

I am not even a developer but I’ve noticed tickets having a response clearly written by AI that misses several things I already talked to the person about over Teams. Like dude, read your own fucking comment before you post. The conclusion is wrong and you know that because we talked about it before you had the AI “figure out the problem” in the first place. Fuck. I know reading logs is really time consuming and annoying, but the AI isn’t always very good, and it won’t just say “hey, that log isn’t showing what I’m looking for”; instead it just hallucinates something.

I don’t even hate AI, but could we at least use our fucking brains while using the AI? When it spits out code to me for my home projects, I, someone who is not a developer, still look at the code to make sure it’s not say running a loop that will hammer disk looking for 1200 files one at a time instead of pulling a directory listing and searching it or something very similar in the database I’m using. People have gotten so lazy. Maybe they’re tired of their bosses trying to force them and are providing garbage? I don’t know but can we just not? Lol.
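To make the disk-hammering example concrete, here is a rough sketch of the two approaches: checking for each file with its own filesystem call versus pulling one directory listing and searching it in memory. The function names, file names, and the 1200-file count are hypothetical stand-ins, and it assumes the wanted files sit directly in one directory.

```python
import os
import tempfile

def missing_one_at_a_time(directory, wanted):
    """Ask the filesystem about each wanted name separately (one call per file)."""
    return [name for name in wanted
            if not os.path.exists(os.path.join(directory, name))]

def missing_via_listing(directory, wanted):
    """List the directory once, then answer every lookup from an in-memory set."""
    present = set(os.listdir(directory))
    return [name for name in wanted if name not in present]

# Demo in a throwaway directory: 1200 wanted names, every third one is absent.
with tempfile.TemporaryDirectory() as tmp:
    wanted = [f"file_{i:04d}.dat" for i in range(1200)]
    for i, name in enumerate(wanted):
        if i % 3 != 0:
            open(os.path.join(tmp, name), "w").close()

    slow = missing_one_at_a_time(tmp, wanted)
    fast = missing_via_listing(tmp, wanted)
    print(len(slow), len(fast), slow == fast)
```

Both give the same answer here; the second just does it with a single trip to the disk. On a case-insensitive filesystem, or with nested folders, the set lookup would need extra care.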

Taleya@aussie.zone · 10 hours ago

I’ve had multiple people try and use LLMs to troubleshoot. It gives me a great feeling of job security. Those fuckers cannot think, and in fact one drove a boss to a screen-punching strokeout by running him in a two-hour circle over something I fixed with two clicks and cognitive function.

webhead@lemmy.world · 8 hours ago

Sometimes it can help troubleshoot but you have to already know what you’re doing so you can filter out really stupid suggestions and get to the “oh yeah I didn’t think about that” kind of stuff. If you’re relying on it completely, you’re gonna have a bad time lol.

tristynalxander@mander.xyz · 2 days ago

I’m just glad people will write off my terrible code as AI.

MangoCats@feddit.it · 2 days ago

Instructions are sloppy, code can be sloppy, but what I find is: when they review code changes they find real stuff. Not all the real stuff, but more real stuff than human reviewers typically find. A code review doesn’t need to be perfect, not even 100% correct, it just needs to show you stuff that you look at and think “damn, good to catch this now instead of in a field problem report a year from now…”

BradleyUffner@lemmy.world · edit-2 1 day ago

That’s fine when you’re looking at 1 or 2 reviews. Now try sifting through hundreds a day.

MangoCats@feddit.it · 12 hours ago

We only do about 3-10 reviews a week, depending… it’s not there to replace you, it’s there to help.

Before AI assistance we would do fewer reviews. Because the AI is finding things - real things worth fixing - some reviews (those of our colleagues who haven’t figured out how to use AI to review their pull requests effectively before submitting them) now get recycled 2-3 times before they’re adequately cleaned up.

Documentation and requirements are better aligned with code, unit test coverage is better, and the developers who use AI to review their code before putting in a pull request generally are getting through on the first pass. You still have to read the documentation and requirements, review the code, but now it’s actually approaching accurate and complete much more closely than it used to.

Our team is small and diverse, some do embedded C, some do GUI oriented .NET, some do backend processing in Rust / Linux - we all know our domains and there is lots of value in the collective wisdom, but it doesn’t translate super easily or efficiently - AI is helping with that.

If you’ve got 100 pull requests to review every day - quit. Maybe stick around for the paycheck until you find something better, but that’s not a job, that’s a clusterbomb waiting to go off.

BradleyUffner@lemmy.world · 12 hours ago

I was referring to the people running open source projects that are receiving 100s of reviews per day from people just blasting out PRs.

MangoCats@feddit.it · 10 hours ago

For this, we need to start using (much more) secure ID tech, so you really know who is submitting, and prioritize those who have made good quality submissions in the past. Sadly, this may neglect “unknown” authors, but such is life.

Also, we may need to recruit more code authors / wanna be code authors to act as code reviewers more of the time, perhaps following the model we use in our commercial operation where all authors also act as reviewers.

Newuser@lemmy.world · 2 days ago

Are you guys hiring freshers by any chance?

corsicanguppy@lemmy.ca · 2 days ago

freshers

Is that the new name for “engineers ready to un-slop the code”?

lando55@lemmy.zip · 2 days ago

We need some freshers in here to do skin jobs

MangoCats@feddit.it · 2 days ago

Tyrell corp designed skin jobs for the Nexus-6 series, like Pris.

MangoCats@feddit.it · 2 days ago

I thought “engineers ready to un-slop the code” were “forward deployed engineers” supplied by the AI companies…

Vanth@reddthat.com · 2 days ago

Best part of the article, hat tip to author Emanuel for how he included the correction request:

After this story was published Google’s spokesperson reached out and asked us to publish a slightly different version of that statement. The new statement no longer stated that “it’s critical that we maintain humans in the loop.”

KeenFlame@feddit.nu · 19 hours ago

“No no that is wrong! We said fuck all kids too, we really meant everyone! not just the adults??”

Th4tGuyII@fedia.io · 2 days ago

It’s a very damning line to retract, but I don’t think anybody is surprised at this point.

A_norny_mousse@piefed.zip · edit-2 2 days ago

Google: 🙋 “Erm, sorry, your portrayal of our complete lack of ethics is incomplete. Thank you.”

Deebster@infosec.pub · 2 days ago

I was going to quote this part as well - nice bit of malicious compliance.

Airfried@piefed.social · 2 days ago

Sign up for free access to this post.

Seb the goblin@lemmy.world · 2 days ago

removepaywall.com to the rescue

Fmstrat@lemmy.world · 1 day ago

Not really. They’ve just archived the paywall page.

LetThereBeNick@lemmy.zip · 8 hours ago

Option 3 works for me

MrKoyun@lemmy.world · 8 hours ago

You can sift through the various sources to get the un-paywalled page from the correct archive.

uuj8za@piefed.social · edit-2 2 days ago

Google’s CEO says 75% of the company’s code is AI-generated.

Everyone should take this with a huge grain of salt. Like all other internal company stat reports, it’s bullshit and manufactured.

Example: my company has recently introduced a gate on CI. All commits must have “Co-Authored-By: X”. Technically, you can set X=None, but most people aren’t doing that because we’re not stupid and we know the commit history can easily be data mined and used to generate stats on who is or isn’t using AI. And we don’t want to get fired.

Result: 99% of all new commits use “Co-Authored-By: Claude”. Every commit I make now has “Co-Authored-By: Claude”. Am I using AI? FUCK NO. But, now I have to add that stupid line to any work I turn in.
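A rough sketch of why a stat built this way says so little: the gate and the data mining can only see the trailer text, not who actually wrote the code. The regex, function names, and sample commit messages below are all made up for illustration; this is not my company’s actual CI check.

```python
import re
from collections import Counter

# Matches a "Co-Authored-By: <value>" line anywhere in a commit message.
TRAILER_RE = re.compile(
    r"^co-authored-by:[ \t]*(\S.*?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

def co_authors(message):
    """Return every Co-Authored-By value found in one commit message."""
    return TRAILER_RE.findall(message)

def passes_gate(message):
    """Hypothetical CI gate: accept the commit if any such trailer exists."""
    return bool(co_authors(message))

def co_author_stats(messages):
    """Count trailer values across a list of commit messages."""
    counts = Counter()
    for message in messages:
        counts.update(co_authors(message))
    return counts

# Invented history: the first commit was written entirely by hand.
history = [
    "Fix race in cache eviction\n\nCo-Authored-By: Claude\n",
    "Add retry to uploader\n\nCo-Authored-By: Claude\n",
    "Update README\n\nCo-Authored-By: None\n",
]

print([passes_gate(m) for m in history])
print(co_author_stats(history))
```

The counter happily reports two “Claude” commits, and nothing in the data can tell that one of them was typed by a human who just wanted to keep their job.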

mcv@lemmy.zip · 2 days ago

This is insane to me. Having a way to easily distinguish AI generated commits from human created ones makes a lot of sense, but lying that your honest, high quality handcrafted commit is AI slop makes it pointless.

That people feel they need to do this in order to protect their jobs is fucking insane and self destructive.

criss_cross@lemmy.world · 2 days ago

We have a commit skill we’re supposed to use. So for non-trivial work that I don’t want the AI to screw up, I do it by hand, then use the skill so it can vomit out a commit message and PR.

I get the shiny “Co-Authored-By: Claude” and burn a ton of tokens to make myself look “AI Fluent”

Steve@startrek.website · 2 days ago

Remember that part in The Big Short where the stripper is talking about all the houses she owns? Similar vibes.

0x0@infosec.pub · 2 days ago

Microslop really went to shit after statements just like that. Can’t wait for Google to implode too.

masterspace@lemmy.ca · edit-2 2 days ago

We’re a small company so I do the opposite and am avoiding any co-authored tag being applied to the code I publish.

I review and test my code before it’s published to make sure that it works and that it’s the right solution to the problem, and I’m the one responsible for fixing it if it goes wrong late at night in prod.

That was the case when I was using Intellisense and codegen tools and that’s still the case now.

That makes me the author.

Anything else is a lie, a violation of engineering ethics, and is flat out not SOC2, nor regulatorily compliant for anything that matters.

A_norny_mousse@piefed.zip · 2 days ago

Stark reality. Thank you.

one_old_coder@piefed.social · 2 days ago

Can’t you script it with a git hook?
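Something like this might do it. Git’s commit-msg hook is handed the path of the commit message file as its first argument, so a hook only has to rewrite that file. The helper names and the demo are assumptions, and the trailer value is just the one from the comment above; it is a sketch, not a tested hook.

```python
import os
import tempfile

TRAILER = "Co-Authored-By: Claude"  # trailer value taken from the comment above

def add_trailer(message, trailer=TRAILER):
    """Return the message with the trailer appended, unless it is already there."""
    if trailer.lower() in message.lower():
        return message
    # A blank line separates the trailer from the body of the message.
    return message.rstrip("\n") + "\n\n" + trailer + "\n"

def rewrite_message_file(path, trailer=TRAILER):
    """Read a commit message file, add the trailer, and write it back."""
    with open(path, encoding="utf-8") as handle:
        message = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(add_trailer(message, trailer))

# In a real .git/hooks/commit-msg script you would call:
#     rewrite_message_file(sys.argv[1])
# Here the same function runs against a temporary stand-in file instead.
with tempfile.TemporaryDirectory() as tmp:
    msg_path = os.path.join(tmp, "COMMIT_EDITMSG")
    with open(msg_path, "w", encoding="utf-8") as handle:
        handle.write("Fix off-by-one in pagination\n")
    rewrite_message_file(msg_path)
    with open(msg_path, encoding="utf-8") as handle:
        print(handle.read())
```

This naive version does not handle messages that already carry other trailers or verbose-mode diffs below a scissors line; git’s own interpret-trailers command is probably the more robust route.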

Th4tGuyII@fedia.io · 2 days ago

“We encourage our engineers to vigorously test and critique our internal tools; that candid feedback loop, even via our internal meme generator, is vital to how we build technology”

Google listening to employee feedback:

uuj8za@piefed.social · 2 days ago

Honestly, that would be great if they just tossed it out the window.

What they’re probably doing is building a list of who they should lay off next based on the feedback.

criss_cross@lemmy.world · 2 days ago

And tossing them out the window.

toiletobserver@lemmy.world · 2 days ago

Their meme game is weak

MrKoyun@lemmy.world · 8 hours ago

It says in the article that 404 Media recreated similar images to the memes they saw to protect their sources, so there is a chance that the originals were pure gold.

OwOarchist@pawb.social · 2 days ago

Maybe they should try having AI generate some memes for them.

Deebster@infosec.pub · 2 days ago

Kinda weird experience to be reading textual descriptions of memes and having to reconstruct them in my head. They had enough to say to not need to pad out their word count that way.

Vanth@reddthat.com · 2 days ago

They’re probably doing that to protect the identity of any Google workers providing them with information. If they posted the actual meme, Google could possibly trace it back to an employee and fire them.

Some of the memes they do have in the article, they note they are reconstructions and not the actual memes from Google’s internal channels.

I agree it’s long though, they could have just recreated them and skipped the written description.

KeenFlame@feddit.nu · 19 hours ago

Anonymous Memes

some pirate@lemmy.dbzer0.com · 2 days ago

You need to hallucinate the memes

reksas@sopuli.xyz · 2 days ago

paid tons of money to fool around while some who would be willing to work dont get hired no matter what

ddplf@szmer.info · 2 days ago

Also a big, chunky and oily FUCK YOU to all of you who work for or aspire to work for FAANG, MAAMA or whatever fucking letters you call it these days

MrKoyun@lemmy.world · 8 hours ago

MOGAI

benjirenji@slrpnk.net · 2 days ago

GAYMAN