
Yeah, I agree with most of what you said. I don’t have massive issues with companies tracking and recording data. By default they should only be allowed to use that data themselves (which can get a bit murky when the company in question is part of a conglomerate), and you should have to explicitly allow the sharing of data with third parties, separate from the standard T&Cs.
GDPR tried to solve this, but it kind of made the options available to the user a mess and overwhelming. It mostly just requires that the user says they agree, without much regulation of what can actually be done with the data (there are some limitations, but they’re not very well enforced). And that’s not even thinking about how the banners and pop-ups are obtrusive as fuck.
I’m not smart enough to know what the actual solution should be, other than that it needs to be better than it is now.





Sadly I think it’s more that there isn’t really a standard way to buy books and other media in bulk at the scale AI training usually requires. So the companies realise they can save both time and money by just pirating, after weighing up the risk of a fine. It’s just a bonus that they usually get away with it and that the fines would likely be cheaper than a legit transaction. But I do think it’s the bulk data packaging that makes piracy look more attractive to them from the get-go.
Heck, even video game publishers often source the ROMs for their official re-releases from pirated copies, because pirates are better at preserving data and keeping it in a nice friendly format. It’s easier to search for it on the web and download it than it is to go into their own archives and rip it themselves, if they even still have original copies, ’cause they sure as hell didn’t keep their source code.