cross-posted from: https://feddit.org/post/28915273

[…]

That marketing may have outstripped reality. Early reports from Mythos preview users including AWS and Mozilla indicate that while the model is very good and very fast at finding vulnerabilities, and requires less hands-on guidance from security engineers - making it a welcome time-saver for the human teams - it has yet to eclipse human security researchers.

“So far we’ve found no category or complexity of vulnerability that humans can find that this model can’t,” Mozilla CTO Bobby Holley said, after revealing that Mythos found 271 vulnerabilities in Firefox 150. Then he added: “We also haven’t seen any bugs that couldn’t have been found by an elite human researcher.” In other words, it’s like adding an automated security researcher to your team. Not a zero-day machine that’s too dangerous for the world.

  • ashughes@feddit.uk
    2 days ago

    How much better is Mythos than Opus 4.6 or 4.7, or Sonnet for that matter?

    Opus 4.6 resulted in 22 fixes in Firefox 148, compared to 271 fixes with Mythos in Firefox 150.

    source

      • frongt@lemmy.zip
        1 day ago

        Firefox is a massive program, so yeah it’s gonna have a lot of bugs. Even a simple HTML rendering browser is a complex program.

          • Nalivai@lemmy.world
            1 day ago

            What do you do with your browsers to make them crash? Mine haven’t done that in at least a decade.

            • MangoCats@feddit.it
              1 day ago

              More often than crashing outright, I hit situations where the browser just isn’t working: it won’t load pages, won’t execute button clicks, or similar, and the only thing (on Windows) that will fix it is a reboot. On Linux, closing and restarting the browser will usually get it going again. Yeah, BSODs are rare lately (though not entirely gone), but malfunctions still abound.

              • Nalivai@lemmy.world
                23 hours ago

                Interesting. So far, all my experiences with stuff like that turned out to be faulty hardware.

                • MangoCats@feddit.it
                  22 hours ago

                  My last confirmed faulty-hardware crash triggered by user operation (as opposed to an outright failure to boot, or a random crash “for no particular reason” other than a program trying to access a failing SSD or similar) came in the late 90s, with a GPU card that would pull down the system bus voltage in response to certain CAD operations. It was repeatable: do this rotation, watch the CPU do a hard reboot every time. Stay away from the GPU-heavy operations and there were no problems.

                  These days the browser is the OS for over half of what happens on my work machines. And they’re almost, but not quite, 100% reliable, until they’re not. Working out those rare problems takes a long time, and with “progress” it feels like they’ve reached a kind of equilibrium where the rate of new problem introduction is about the same as the rate of known problem fixes.