
  • From your own linked paper:

    To design a neural long-term memory module, we need a model that can encode the abstraction of the past history into its parameters. An example of this can be LLMs that are shown to be memorizing their training data [98, 96, 61]. Therefore, a simple idea is to train a neural network and expect it to memorize its training data. Memorization, however, has almost always been known as an undesirable phenomena in neural networks as it limits the model generalization [7], causes privacy concerns [98], and so results in poor performance at test time. Moreover, the memorization of the training data might not be helpful at test time, in which the data might be out-of-distribution. We argue that, we need an online meta-model that learns how to memorize/forget the data at test time. In this setup, the model is learning a function that is capable of memorization, but it is not overfitting to the training data, resulting in a better generalization at test time.

    Literally what I just said. This is specifically addressing the problem I mentioned, and it goes on with exacting specificity about why it does not exist in production tools for the general public (it’ll never make money, and it’s slow, honestly). In fact, there is a minor argument later on that developing a separate supporting system negates even referring to the outcome as an LLM, and the referenced papers linked at the bottom dig even deeper into the exact limitations I mentioned of models used this way.


  • It most certainly did not…because it can’t.

    You find me a model that can take multiple disparate pieces of information and combine them into a new idea not fed with a pre-selected pattern, and I’ll eat my hat. The very basis of how these models operate is in complete opposition to the idea that they can spontaneously have a new and novel idea. New…that’s what novel means.

    I can pointlessly link you to papers or blogs from researchers explaining it, or you could just ask one of these things yourself, but you’re not going to listen, which is on you for intentionally deciding to remain ignorant of how they function.

    Here’s Terrence Kim describing how they set it up using GRPO: https://www.terrencekim.net/2025/10/scaling-llms-for-next-generation-single.html

    And then another researcher describing what actually took place: https://joshuaberkowitz.us/blog/news-1/googles-cell2sentence-c2s-scale-27b-ai-is-accelerating-cancer-therapy-discovery-1498

    So you can obviously see…not novel ideation. They fed it a bunch of training data, and it correctly used pattern alignment to say “If it works this way otherwise, it should work this way with this example.”

    Sure, it’s not something humans had gotten to yet, but that’s the entire point of the tool. Good for the progress, certainly, but that’s its job. It didn’t come up with some new idea about anything, because it works from the data it’s given and the logic boundaries of the tasks it’s set to run. It’s not doing anything super special here, just doing it very efficiently.



  • 🤦🤦🤦 No…it really isn’t:

    Teams at Yale are now exploring the mechanism uncovered here and testing additional AI-generated predictions in other immune contexts.

    Not only is there no validation, they have only begun even looking at it.

    Again: LLMs can’t make novel ideas. This is PR, and because you’re unfamiliar with how any of it works, you assume MAGIC.

    Like every other bullshit PR release of its kind, this is simply a model being fed a ton of data and running through millions upon millions of iterative segments, testing outcomes of various combinations of things that would take humans years to do. It’s not that it is intelligent or making “discoveries”, it’s just moving really fast.

    You feed it 10² combinations of amino acids, and it’s eventually going to find new chains needed for protein folding. The thing you’re missing there is:

    1. All the logic programmed by humans
    2. The data collected and sanitized by humans
    3. The task groups set by humans
    4. The output validated by humans

    It’s a tool for moving fast through data, a.k.a. A REALLY FAST SORTING MECHANISM

    Nothing, at any stage of development, is novel output or validated by any models, because…they can’t do that.
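    To make the “really fast sorting mechanism” point concrete, here’s a minimal sketch of that kind of brute-force search. This is a hypothetical illustration only: the amino-acid list and the scoring rule are made-up stand-ins for the human-curated data and human-written logic described above.

```python
from itertools import product

# Hypothetical illustration: data collected and sanitized by humans.
AMINO_ACIDS = ["A", "R", "N", "D"]

def score(chain):
    # Logic programmed by humans: an arbitrary stand-in scoring rule.
    return sum(ord(c) for c in chain) % 7

def best_chain(length):
    # The "discovery" step: exhaustively try every combination, very fast.
    # No new idea appears; the search space and the scoring are both fixed
    # in advance by the people who set up the task.
    return max(product(AMINO_ACIDS, repeat=length), key=score)

print("".join(best_chain(3)))
```

    The machine’s only contribution is speed: it walks the combinations faster than a person could, inside boundaries a person defined.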


  • I sure do. Knowledge, and being in the space for a decade.

    Here’s a fun one: go ask your LLM why it can’t create novel ideas, it’ll tell you right away 🤣🤣🤣🤣

    LLMs have ZERO intentional logic that allows them to even comprehend an idea, let alone craft a new one and create relationships between others.

    I can already tell from your tone that you’re mostly driven by bullshit PR hype from people like Sam Altman, and are an “AI” fanboy, so I won’t waste my time arguing with you. You’re in love with human-made logic loops and datasets, bruh. There is not now, nor was there ever, a way for any of it to become some supreme being of ideas and knowledge as you’ve been pitched. It’s super fast sorting from static data. That’s it.

    You’re drunk on Kool-Aid, kiddo.









    Just based on experience in the community and professional experience, I can solidly say that your take on FOSS not being successful is just wrong. I don’t mean that like you’re stupid or I’m shooting you down; you just wouldn’t realize how huge the contributions are unless you know where to look.

    Here’s a big example: look at how many companies hire engineers writing Python, Ruby, Rust, Go, Node…whatever. ALL OPEN SOURCE LANGUAGES. You bootstrap a project in any of these, and you’re already looped into the FOSS community. 100% of the companies I have personally worked with and for write everything based on FOSS software, and I can tell you hands down as a fact: I have never met a single person writing in closed-source IDEs or languages, because very few exist.

    If you want to see where all the community stuff happens, find any project on GitHub and look at the “Issues” section for closed tickets with PRs attached. You’ll see just how many people write quick little fixes to nags or bugs, not just on their own behalf, but on behalf of the companies paying them. That’s sort of the beauty of the FOSS community in general: if you want to build on community projects, you’ll be giving back in one form or another simply because, as my last comment said, NOBODY wants to maintain a private fork. Submodules exist for a reason, and even then people don’t want to mess with that; they’d rather just commit fixes and give back. Companies are paying engineers for their time, and engineers committing PR fixes is de facto those companies putting back into the community.

    To your Oracle point, I think the biggest thing there may have been Java. That one is tricky. Java existed long before it was ever open-sourced by Sun Microsystems, and was available for everyone sometime in the early '00s (not bothering to look that up). Even though it was created by an engineer at Sun, it was always out there and available for use, it just wasn’t “officially” licensed as Open Source for contributions until later. Sun still technically owned the trademarks and all of that, though, and Oracle acquired them at some point, bringing the trademarks under their ownership. There were a number of immediate forks, but I think the OpenJDK crew was further out in front and sort of won that battle. To this day I don’t know a single Java project using Oracle’s official SDK and tools for that language aside from Oracle devs, which is a pretty small community in comparison, but you’re right in that it was essentially a corporate takeover of a FOSS project. How successful it was in bringing people to bear that engagement, I think, is up for discussion, but I’m sure the community would rightly say “Fuck, Oracle” and not engage with their tooling.


  • There’s a few different things getting wrapped in here together, so let me break down my take:

    1. Licensing - if you intend to only use FOSS software, it wouldn’t matter if a corporate/proprietary version of something exists or not. If you intend to release something and make it free, you would need to include only license-compatible libraries. I don’t see why Microsoft having a proprietary version of something that is better would be a problem, because that’s not the focus of your goal of releasing something for free. Similarly if you start a company and bootstrap a product off of open libraries, you will steer towards projects that are license-compatible. Whether there is a better version is irrelevant.

    2. Scope of license - Your comments seem to focus on larger product-complete projects. You mentioned Paint.net as an example. So say Adobe forks GIMP, and drops a bunch of proprietary Photoshop libraries into it to make it beefier or whatever. Similar to the above, people who intend to only use FOSS software still wouldn’t adopt it.

    3. Death by license - there have been some cases where FOSS project maintainers get picked up by corporate sponsors and sort of “acquired”. This is on the maintainers to make that choice of course, not the community, and contributing members of that community have every right to be pissed about that. Those contributing members also have the right to immediately fork that project and release their own as a competitive product. Redis vs Valkey, and Terraform vs OpenTofu, are examples. Some people flock away, some people don’t, but in most cases it’s a guaranteed way to turn the community against you, and towards a fork of said project. Happens a lot.
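    The license filtering from point 1 can be sketched in a few lines. This is a hypothetical, heavily oversimplified compatibility table for illustration; real license-compatibility analysis is far more nuanced and you should check the actual license texts.

```python
# Oversimplified, illustrative compatibility table (NOT legal advice):
# a project picks a license, then filters candidate libraries down to
# the ones it can bundle.
COMPATIBLE_WITH = {
    "GPL-3.0": {"MIT", "BSD-3-Clause", "Apache-2.0", "GPL-3.0", "LGPL-3.0"},
    "MIT": {"MIT", "BSD-3-Clause"},
}

def usable_deps(project_license, candidates):
    # Keep only the (name, license) pairs compatible with the project.
    ok = COMPATIBLE_WITH[project_license]
    return [name for name, lic in candidates if lic in ok]

deps = [("libfoo", "MIT"), ("libbar", "Proprietary"), ("libbaz", "Apache-2.0")]
print(usable_deps("GPL-3.0", deps))  # the proprietary library drops out
```

    Whether some better proprietary alternative exists never enters the check; it simply isn’t in the candidate set.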

    I think what you’re not seeing here is that these companies buying out projects really don’t intend to put a lot of money back into them after they get their bags of money. Whether or not people continue to use the originals is less important than the forks being available and supported. If companies believe in the project, they kick in PRs to keep things rolling along, because they need that particular part of their stack. I myself am a maintainer on multiple public projects, and also work with companies that contribute to dozens of different public projects because the products they make revolve around them: everything from ffmpeg to the torch ecosystem. You find a bug you can fix, you submit a PR. That’s what keeps this ecosystem going.

    Smaller scale startups to mid-sized companies contribute all the time to public projects, though it may not be apparent. Larger corporations do as well, but it’s more of a legal thing than an obligation to the community. Rewriting entire batches of libraries isn’t feasible for these larger companies unless there is a monetary reward on the backend, because paying dev teams millions of dollars to rewrite something like, I don’t know, memcache doesn’t make sense unless they can sell it, and keeping an internal fork of an open project downstream is a huge mess that no engineer wants to be saddled with.

    Once a public project or library is adopted, it’s very unlikely to be taken over by corporate interests, and it’s been that way for almost fifty years now (if we’re going back to Bell and Xerox Labs). Don’t see that changing anytime soon based on the above, and being in the space and seeing it all work in action. Though there are scant cases, there’s no trend of this becoming more prevalent at the moment. The biggest threat I see to this model is the dumbing down of engineers by “AI” and loss of will and independent thought to keep producing new and novel code out in the world.



  • What does security have to do with open-source projects succumbing to “corporate takeover”, which isn’t even possible?

    If the code is of such a restrictive license that you aren’t able to fork and re-release it with changes, then it isn’t open-source to begin with.

    To your last point about removing “old features”, this is done all the time, and this is why things use semantic versioning. Nobody wants to be forced to maintain old code in perpetuity when they can just drop large portions of it, and then release new versions with deprecated backends when needed.
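    To make that concrete, here’s a minimal sketch of the warn-then-remove cycle under semantic versioning. The version number and function names are hypothetical, just to show the shape of the pattern.

```python
import warnings

# Hypothetical library at version 1.4.0 (MAJOR.MINOR.PATCH). The old
# feature still works in this minor release but emits a deprecation
# warning; the 2.0.0 major release is where it actually gets dropped.
__version__ = "1.4.0"

def new_feature():
    return "result"

def old_feature():
    # Deprecated shim kept for backwards compatibility until 2.0.0.
    warnings.warn(
        "old_feature() is deprecated; use new_feature() instead "
        "(removal planned for 2.0.0)",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_feature()
```

    Callers get a whole major-version cycle of warnings to migrate, and the maintainers get to delete the old code instead of carrying it forever.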