

Open-source AI is the data, not the weights. Weights alone aren’t useless, but they’re not a platform for truly understanding and modifying a network. No one wants to open their data though, both because it’s what gives them a competitive edge and because it would be an admission of widespread copyright violations.


Using the best models to generate synthetic data is mostly pointless. You’re just going to recreate all of their biases and failures in a degraded copy. The whole point of open source software is being able to analyze the source code to learn how it works, understand its weaknesses, and ideally remove them.
Open weights don’t let you do that, and what research they enable is mostly just tinkering around the edges. If someone trains a network and it keeps saying racist stuff, you can’t figure out why it’s racist or rebuild it without the racism from the weights alone. Having just the weights is like having a binary. Maybe it’s nice to have a gratis app to use, but it’s not really open.