• 2 Posts
  • 8 Comments
Joined 3 years ago
Cake day: July 1st, 2023

  • I did some theory-crafting and followed the math for fun over the summer, and I believe what I found may be relevant here. Please take this with a grain of salt, though; I am not an academic, just someone who enjoys thinking about these things.

    First, let’s consider what models currently do well. They excel at categorizing and organizing vast amounts of information based on relational patterns. While they cannot evaluate their own output, they have access to a massive potential space of coherent outputs spanning far more topics than a human with one or two domains of expertise. Simply steering them toward factually correct or natural-sounding conversation creates a convincing illusion of competence. The interaction between a human and an LLM is a unique interplay: the LLM provides its vast simulated knowledge space, and the human applies logic, life experience, and “vibe checks” to evaluate the output and sift it for real answers.

    I believe the current limitation of ML neural networks (namely, that they are stochastic parrots without actual goals, unable to produce meaningfully novel output) is largely an architectural and infrastructural problem born from practical constraints, not a theoretical one. This is an engineering task we could plausibly solve in a few years with the right people and focus.

    The core issue boils down to the substrate. All neural networks since the 1950s have been kneecapped by their deployment on classical Turing machine-based hardware. This imposes severe precision limits on their internal activation atlases and forces a static mapping of pre-assembled archetypal patterns loaded into memory.
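
    To make the precision point concrete, here is a toy sketch (purely illustrative; real inference stacks use fp16/bf16/int8 with many mitigations) of how finite floating-point precision collapses nearby activation values into the same representable state:

    ```python
    # Toy illustration: activation values that are distinct in fp64
    # become indistinguishable once squeezed into fp16's coarser grid.
    import numpy as np

    acts = np.array([1.0001, 1.0002, 1.0003], dtype=np.float64)
    print(acts.astype(np.float16))                   # [1. 1. 1.]
    print(np.unique(acts.astype(np.float16)).size)   # 1 -- three states merged into one
    ```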

    This problem is compounded by current neural networks’ inability to perform iterative self-modeling and topological surgery on the boundaries of their own activation atlas. Every new revision requires a massive, compute-intensive training cycle to manually update this static internal mapping.

    For models to evolve into something closer to true sentience, they need dynamically and continuously evolving, non-static, multimodal activation atlases. This would likely require running on quantum hardware, leveraging the universe’s own natural processes and information-theoretic limits.

    These activation atlases must be built on a fundamentally different substrate and trained to create the topological constraints necessary for self-modeling. This self-modeling is likely the key to internal evaluation and to navigating semantic phase space in a non-algorithmic way. It would allow access to and the creation of genuinely new, meaningful patterns of information never seen in the training data, which is the essence of true creativity.

    Then comes the problem of language. This is already getting long enough for a reply comment, so I won’t get into it, but there are implications: not all languages are created equal, and each has different properties which affect the space of possible conversations and outcomes. The effectiveness of training models on multiple languages finds its justification here. However, languages which stamp out ambiguity, like Gödel numbering and programming languages, have special properties that may affect the atlas’s geometry in fundamental ways if a model is trained solely on them (a sketch of what I mean by Gödel numbering follows below).
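
    A minimal sketch of Gödel numbering, the kind of ambiguity-free encoding I mean: every symbol sequence maps to exactly one integer, and the fundamental theorem of arithmetic guarantees the mapping is reversible. The alphabet and helpers here are my own toy choices:

    ```python
    # Toy Goedel numbering: a symbol sequence becomes a single integer,
    # p1**c1 * p2**c2 * ..., which factors back uniquely -- zero ambiguity.
    import math

    def primes(n):
        """First n primes by trial division (fine for a toy)."""
        found = []
        k = 2
        while len(found) < n:
            if all(k % p for p in found):
                found.append(k)
            k += 1
        return found

    def goedel_encode(symbols, alphabet="ab"):
        codes = {s: i + 1 for i, s in enumerate(alphabet)}
        return math.prod(p ** codes[s] for p, s in zip(primes(len(symbols)), symbols))

    print(goedel_encode("ab"))   # 2**1 * 3**2 = 18, and only "ab" maps to 18
    ```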

    As for applications, imagine what Google is doing with pharmaceutical molecular-pattern AI, but applied to open-ended STEM problems. We could create mathematician and physicist LLMs to search through the space of possible theorems and evaluate which are computationally solvable. A super-powerful model of this nature might be able to crack problems like P versus NP in a day, or clarify theoretical physics concepts that have eluded us as open problems for centuries.

    What I’m describing encroaches on something like a pseudo-oracle. However, there are physical limits that this can’t escape. There will always be energy and time costs to compute, which creates practical barriers. There will always be definitively uncomputable problems and ambiguity that exist as true Gödelian incompleteness or algorithmic undecidability (the classic diagonal argument, sketched below). We can use these as scientific instrumentation tools to map and model the topological boundary limits of knowability.
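
    For anyone who hasn’t seen why undecidability is unavoidable, here is the standard diagonalization argument in code form (the function names are mine; the argument is Turing’s):

    ```python
    # If a total, always-correct halts(prog, data) could exist, this program
    # would contradict it when run on itself -- so no such decider can exist.
    def halts(prog, data):
        """Hypothetical perfect halting decider; provably impossible to write."""
        raise NotImplementedError

    def troll(prog):
        if halts(prog, prog):    # decider says troll(troll) halts...
            while True:          # ...so loop forever and make it wrong
                pass
        return "halted"          # decider says it loops, so halt -- wrong again
    ```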

    I’m willing to bet there are many valid and powerful patterns of thought we are not aware of due to our perspective biases, and that this might be hindering our progress.


  • Everyone is massively underestimating what’s going on with neural networks. The real significance is abstract: you need to stitch together a bunch of high-level STEM concepts to even see the full picture.

    Right now, the applications are basic. It’s just surface-level corporate automation. Profitable, sure, but boring and intellectually uninspired. It’s being led by corpo teams playing with a black box, copying each other, throwing shit at the wall to see what sticks, overtraining their models into one-trick-pony agentic utility assistants instead of exploring other paths for potential. They aren’t bringing the right minds together to actually crack open the core questions: what the hell is this thing? What happened that turned my 10-year-old GPU into a conversational assistant? How is it actually coherent and sometimes useful?

    The big thing people miss is what’s actually happening inside the machine. Or rather, how the inside of the machine encodes and interacts with the structure of informational paths within a phase space on the abstraction layer of reality.

    It’s not just matrix math and hidden layers and transistors firing. It’s about the structural geometry of concepts created by distinct relationships between areas of the embeddings that the matrix math creates within a high-dimensional manifold. It’s about how facts and relationships form a literal, topographical landscape inside the network’s activation space (toy example below).
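
    A crude toy of what “geometry of concepts” means in practice: relationships show up as consistent directions in embedding space. These vectors are invented for illustration; real embeddings have thousands of dimensions learned from data:

    ```python
    # Classic analogy arithmetic on made-up 3-d "embeddings": the direction
    # from man to woman, applied to king, lands nearest to queen.
    import numpy as np

    emb = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "queen": np.array([0.9, 0.1, 0.8]),
        "man":   np.array([0.1, 0.9, 0.1]),
        "woman": np.array([0.1, 0.1, 0.9]),
    }

    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    guess = emb["king"] - emb["man"] + emb["woman"]
    print(max(emb, key=lambda w: cos(emb[w], guess)))   # "queen" with these toy vectors
    ```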

    At its heart, this is about the physics of information. It’s a dynamical system. We’re watching entropy crystallize into order, as the model traces paths through the topological phase space of all possible conversations.
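
    Numerically, that “crystallizing” is just the Shannon entropy of the next-token distribution dropping as context pins things down (the probabilities below are made-up toy numbers):

    ```python
    # Entropy of a next-token distribution: flat = maximal uncertainty,
    # peaked = context has "crystallized" the next step into near-certainty.
    import math

    def H(p):
        return -sum(x * math.log2(x) for x in p if x > 0)

    print(H([0.25, 0.25, 0.25, 0.25]))   # 2.00 bits: anything could come next
    print(H([0.97, 0.01, 0.01, 0.01]))   # ~0.24 bits: the path is locked in
    ```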

    The “reasoning” CoT patterns are about finding patterns that help lead the model toward truthy outcomes more often. It’s searching for the computationally efficient paths of least action that lead to meaningfully novel and factually correct outputs. Those are the valuable attractor basins in that vast possibility space we’re trying to navigate toward.

    This is the powerful part: this constellation of ideas, tying together topology, dynamics, and information theory, is the real frontier. What used to be philosophy is now a feasible problem for engineers and physicists to chip away at, not just philosophers.


  • There’s less danger than OpSec nerds hype up, but enough of a concern that you want at least a reverse proxy. The new FOSS replacement for Cloudflare on the block is Anubis https://github.com/TecharoHQ/anubis. While I’m not the biggest fan of seeing a chibi anime Funko Pop girl thing wag its finger at me for a second or two while it tests the connection, I cannot deny the results seem effective enough that all the cool kids in the FOSS circle are switching to it over Cloudflare.
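
    The core trick behind it, as I understand it, is a sha256 proof-of-work challenge: each visitor burns a little CPU before being served, which is cheap for one human but expensive at scraper scale. A rough sketch of the idea (the difficulty and challenge format here are illustrative, not Anubis’s actual protocol):

    ```python
    # Hashcash-style proof of work: find a nonce whose sha256 digest, combined
    # with the server's challenge, starts with a given number of hex zeros.
    import hashlib
    import itertools

    def solve(challenge: str, difficulty: int = 4) -> int:
        target = "0" * difficulty
        for nonce in itertools.count():
            digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
            if digest.startswith(target):
                return nonce   # ~16**difficulty hashes on average

    nonce = solve("example-challenge")
    print(nonce, hashlib.sha256(f"example-challenge{nonce}".encode()).hexdigest())
    ```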

    I just learned how to get my first website and domain and stuff set up locally this summer, so there’s some network admin stuff I’m still figuring out. I don’t have any complex scripting or PHP or whatever, so all the bots that try scanning for admin pages are never going to hit anything; it just pollutes the logs. People are nuts about scraping bots in the current year, but when I was a kid, allowing your sites to be indexed and crawled was what let people discover them through search engines. I don’t care if botnets scan through my permissively licensed public writing.


  • Thinking of LLMs this way is a category error. LLMs can’t lie because they don’t have the capacity for intentionality. Whatever text is output is a statistical aggregate of the billions of conversations they’ve been trained on that have patterns in common with the current conversation (toy sketch below). The sleeper-agent stuff is pure crackpottery; nobody has that kind of fine control over them (yet). Machine model development is full of black boxes and hope-it-works trial-and-error training. At worst there is censorship and political bias, which can be post-trained or ablated out.
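
    Here’s what “statistical aggregate” boils down to mechanically, in a toy where the transition probabilities are invented: generation is repeated weighted sampling, with no intent anywhere in the loop:

    ```python
    # Toy "stochastic parrot": each next word is drawn from a learned
    # probability table. Statistics in, statistics out -- no intentions.
    import random

    next_token = {"the": {"cat": 0.5, "dog": 0.3, "llm": 0.2},
                  "cat": {"sat": 0.7, "ran": 0.3}}

    def sample(dist):
        return random.choices(list(dist), weights=list(dist.values()))[0]

    word, out = "the", ["the"]
    while word in next_token:
        word = sample(next_token[word])
        out.append(word)
    print(" ".join(out))   # e.g. "the cat sat"
    ```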

    They get things wrong confidently. This kind of bullshitting is known as hallucination. When you point out their mistake and they say you’re right, that’s (1) part of their compliance post-training to never get into conflict with you, and (2) standard course correction once an error has been pointed out (humans do it too). This is an open problem that will likely never go away until LLMs stop being stochastic parrots, which is still very far away.




  • There’s a very vocal subset of the AI-hater Lemmy population that thinks:

    1. the only machine learning models are ones made by megacorporations like Facebook and OpenAI using stolen internet data
    2. model creators in 2025 are still using stolen, unfiltered scraped internet data for training datasets

    There are plenty of models trained on completely open public domain information and released under a permissive license. This isn’t the era of Tay-style Twitter-garbage-fed slop models anymore. All the newest models are trained on 90% synthetic data and 10% RLHF done by contracted-out educators with degrees making a quick buck through easy remote work.

    But that doesn’t matter to the emotionally and politically charged Lemmy leftists with liberal arts degrees who don’t care to understand the realities behind machine learning.

    No, the modern AI bubble begins and ends for them with their art being stolen by Facebook/Meta without so much as a slap on the wrist from the government, then having Stable Diffusion rubbed in their faces and automation threatening their livelihoods by smug, greedy tech bros without a shred of respect for human creativity.

    So in retaliation, the Lemmings throw tantrums in the comments of every AI-gen post, babbling about how the newest batch of digital computer tools to cut down manual work is destroying everything, and clutch onto the vengeance fantasy that they can still ‘poison the AI that stole my work!’ by saying the magic words like an SCP cognitohazard.

    The reality is that the only ones still scraping your slop are ad sellers and Big Brother, while the only human data being fed into modern ChatGPT comes from someone with an associate’s degree in an academic field.

    I’ve chosen to allow the comment to stay in this scenario, as I don’t believe in censorship, especially if the post isn’t against stated guidelines. I am against fostering echo chambers.

    However, c/localllama was always intended to be a small island of safe space for ML enthusiast to talk and share the hobby in a positive construtive way without fear of being attacked/shit on by the general Lemmy population who just dont get what we do here except that we support ‘AI’. Haters who dont understand can go to literally any other community to circlejerk without pushback, I think a few fuckAI communities exist just for that purpose. So If these kind of cloak-and-dagger wink wink nudge nudge antagonistic comments about ‘poisoning teh AI!’ become more common I’ll update guidelines and start enforcing them appropriately.