moth main, no llms, all human

  • 2 Posts
  • 93 Comments
Joined 5 months ago
Cake day: February 26th, 2025





  • You don’t need to close the source to make X authoritarian. The existing inequality in (computational) power can’t be overcome as it is without regulation. That’s why it’s so sexy for Google and the like, and the only reason for them to close the source is to fuck with each other. The means of training models and dictating their usage are in the hands of those who can afford them.

    If one day portrayals of tacos get shadowbanned and you want to generate one, you won’t find a big model that can draw it unless you waste days training your own taco model.









  • the beginning of the end for Putin. It is only a matter of time.

    At his age, he has every biological means to fuck around and never find out. I believe the legacy talk captivates him, but the fact that he himself will never face any pushback makes many of his decisions way easier than if he were 40 and had another 40+ years of rule ahead. Every additional minute of him fucking up foreign countries with weaponised influence, not to mention direct warfare, is a minute too many.

    He could’ve tried to revive the country, and he got an easy start in the 00s, but he chose to gamble it away instead.




  • One thing they didn’t focus on is how it explicitly makes sexual minorities feel unsafe. When Pornhub enabled age verification in my country via social media authorization, I freaked out and stopped using it because I didn’t want my real identity connected to my watch history in any way. It wasn’t disclosed on entry whether the age verification broker can collect info on what I’m interested in, and it is a major security risk if one’s state becomes authoritarian and gets a handy database of people watching anything but vanilla.






  • Each legally licensable piece of source material could provide a hash, maybe?

    We could generate a hash sum for every piece, but I don’t see how it would help. The only application I can think of is verifying that a database of many works hasn’t been modified between stages A and B. But with the hash of a single piece, we can’t tell whether it was included in the dataset, prosecute cases of its misuse, etc. For licensing it wouldn’t hurt to have one, I guess, but I don’t know how it would be applied to prove anything. Actually, I think I do now*.

    In terms of outputs, it’s really difficult to judge how transformative a model is without transparency of dataset.

    True. That’s why I assume everything in the dataset is involved in every creation.

    It is probably a level of fight only accessible to the likes of Disney with their bottomless pockets, but if they do their lawsuit thing frequently enough (correctly assuming the likeness of Mickey is in every graphical dataset), there’s hope that LLM owners and dataset brokers would become more transparent about the data they obtain and use, thus helping everyone.

    One tool I could see being created is (here’s the asterisk*) a standard look-up webpage where you can search a closed commercial dataset (or many of them at once) by hash or by providing a file**. Hashes suck ass because they change whenever the file is even slightly modified. But if it’s a known copy that has circulated the web for a while, the hash can serve as a unique identifier for that one thing.

    Asterisk two** - I imagine if something like that happens, it’d be a captcha-, ad- and JS-ridden nightmare. If there were a bill covering this whole thing, the look-up site should be included too, with a requirement to provide an API for that resource and limits on how awful it can be.
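
To make the look-up idea a bit more concrete, here is a rough Python sketch of searching a dataset by hash or by file. Everything in it is an assumption for illustration: SHA-256 as the hash, the `DatasetIndex` class, and the dataset names are all made up, and no real dataset broker is known to expose anything like this.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the raw file bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


class DatasetIndex:
    """Hypothetical index mapping file hashes to the datasets containing them."""

    def __init__(self):
        self._by_hash = {}  # hex digest -> list of dataset names

    def add(self, dataset_name: str, data: bytes) -> None:
        """Register one file as being part of the named dataset."""
        self._by_hash.setdefault(sha256_hex(data), []).append(dataset_name)

    def lookup_hash(self, digest: str) -> list:
        """Search by a hash the user already has."""
        return list(self._by_hash.get(digest.lower(), []))

    def lookup_file(self, data: bytes) -> list:
        """Search by providing the file itself."""
        return self.lookup_hash(sha256_hex(data))


if __name__ == "__main__":
    index = DatasetIndex()
    original = b"bytes of a well-known copy of some artwork"
    index.add("hypothetical-dataset-A", original)

    # The exact copy that circulated the web is found.
    print(index.lookup_file(original))

    # A single changed byte gives a different hash, so the look-up misses.
    print(index.lookup_file(original + b"!"))
```

The second look-up is the weak spot mentioned above: an exact hash only identifies one specific copy of a file, so any re-encode or crop would slip through.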