• verassol
    40
    1 year ago

    StackOverflow: *grabs money on monetizing massive amounts of user-contributed content without consulting or compensating the users in any way*

    Users: *try to delete it all to prevent it*

    StackOverflow: *your contributions belong to the community, you can’t do that*

    Pretty fucked-up laws. A lot of lawsuits going on right now against AI companies for similar issues. In this case, StackOverflow is entitled to be compensated for its partnership, and because the answers are all CC BY-SA 3.0, no one can complain. Now, that SA? Whatever.

    • @9point6@lemmy.world
      15
      1 year ago

      That SA part needs to be tested in court against the AI models themselves

      A lot of this shittiness would probably go away if there was a risk that ingesting certain content would mean you need to release the actual model to the public.

      • verassol
        4
        edit-2
        1 year ago

        Yeah, their assumption though is you don’t? Neither attribution nor sharealike, not even full-on all-rights-reserved copyright is being respected. Anything public goes and if questions are asked it’s “fair use”. If the user retains CC BY-SA over their content, why is giving a bunch of money to StackOverflow entitling OpenAI to use it all under whatever terms they settled on? Boggles me.

        Now, say, Reddit Terms of Service state clearly that by submitting content you are giving them the right to “a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness (…) in all media formats and channels now known or later developed anywhere in the world.” Speaks volumes on why alternatives (like Lemmy) to these platforms matter.

  • @darkphotonstudio@beehaw.org
    37
    1 year ago

    I think people would have fewer issues with AI training if it was non-profit and for the common good. And there are open source AI projects, many in fact. But yeah, these deals by companies like this are sleazy.

    • @mnemonicmonkeys@sh.itjust.works
      English
      6
      1 year ago

      One time I went on there to figure out an issue in Arduino. The answer one guy gave was “I don’t know how to do this in Arduino, here’s how you do this in Java”. Not only did the mods prevent any other answers from being posted, I tried the guy’s suggestion in Java and it didn’t even work.

  • davel [he/him]
    English
    29
    1 year ago

    Good luck with the deleting. It often just means `UPDATE comments SET is_deleted = 1 WHERE ID = 666;`.
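
To illustrate the soft-delete point above, here is a minimal sketch using Python's built-in sqlite3 module. The `comments` table, its columns, and the helper names are hypothetical and only mirror the query quoted in the comment; no real site's schema is implied.

```python
import sqlite3

# Hypothetical schema for illustration only; not any real site's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, body TEXT, is_deleted INTEGER DEFAULT 0)"
)
conn.execute("INSERT INTO comments (id, body) VALUES (666, 'How to fix the bug')")


def soft_delete(conn, comment_id):
    """Flag the row as deleted; the text itself is left untouched."""
    conn.execute("UPDATE comments SET is_deleted = 1 WHERE id = ?", (comment_id,))


def visible_comments(conn):
    """What a normal page view would show: only rows not flagged as deleted."""
    return conn.execute(
        "SELECT id, body FROM comments WHERE is_deleted = 0"
    ).fetchall()


def stored_body(conn, comment_id):
    """What anyone with database access can still read, flag or no flag."""
    row = conn.execute(
        "SELECT body FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    return row[0] if row else None


soft_delete(conn, 666)
```

After the call, the comment is gone from the visible listing but its text is still sitting in the table, so a flag-based delete does not keep the content away from whoever holds the database.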

    • plz1
      English
      5
      1 year ago

      They are not deleting, they are editing. So the platform would have to undo those edits rather than just flipping the visibility flag.
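
A rough sketch of why edits can be undone as well, assuming the platform stores every revision of a post. Stack Overflow does show a revision history on posts, but the `revisions` table and the functions below are made up for illustration and say nothing about how any real site implements it.

```python
import sqlite3

# Hypothetical revision store: every edit appends a row instead of overwriting.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revisions ("
    "post_id INTEGER, rev INTEGER, body TEXT, PRIMARY KEY (post_id, rev))"
)


def save_revision(conn, post_id, body):
    """Append a new revision and return its number."""
    (last,) = conn.execute(
        "SELECT COALESCE(MAX(rev), 0) FROM revisions WHERE post_id = ?", (post_id,)
    ).fetchone()
    conn.execute("INSERT INTO revisions VALUES (?, ?, ?)", (post_id, last + 1, body))
    return last + 1


def current_body(conn, post_id):
    """The text shown to readers: the highest-numbered revision."""
    row = conn.execute(
        "SELECT body FROM revisions WHERE post_id = ? ORDER BY rev DESC LIMIT 1",
        (post_id,),
    ).fetchone()
    return row[0] if row else None


def rollback(conn, post_id, to_rev):
    """Undo an edit by re-saving an earlier revision as the newest one."""
    (body,) = conn.execute(
        "SELECT body FROM revisions WHERE post_id = ? AND rev = ?", (post_id, to_rev)
    ).fetchone()
    return save_revision(conn, post_id, body)


save_revision(conn, 1, "Original answer")
save_revision(conn, 1, "[removed in protest]")
rollback(conn, 1, 1)
```

Under that assumption, blanking a post costs the platform one rollback per post: more work than flipping a visibility flag, but the original text was never lost.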

    • wuphysics87
      12
      1 year ago

      Those answers were given in good faith under the presumption that they would be read and used by another person. Not used to train something to remove the interactions which motivated the answer in the first place.

      • @jsomae@lemmy.ml
        2
        1 year ago

        Can you elaborate on what you mean by “remove the interactions which motivated the answer in the first place”? I’m not sure I follow.

          • The internet had a social contract. The reason people put effort into brain dumping good posts is because the internet was a global collaborative knowledge base for everybody.

            Of course there were always capitalists who sought to privatize and profit from resources. The source materials were generally part of the big giant digital continuum of knowledge. For the parts that weren’t, there were anarchists who sought to free that knowledge for anyone who wanted to access it.

            AI is bringing about the end of all this as platforms are locking down everything. Old boards and forums had already been shuttering for years as social media was centralizing everything around a few platforms. Now those few platforms are being swallowed up by AI where the collective knowledge of humanity is being put behind paywalls. People no longer want to work directly for the profit of private companies.

            Capitalists can only see dollar signs. They care not for the geological-epoch-scale forces of nature required to form petroleum. All that matters is whether it can all be sold and how quickly. Nor do they care for the environmental damage they cause. In the same way, the AI data miners do not care for the digital ecological disaster they are causing.

            Moreover, it’s a thought-terminating cliché when someone says, “<thing> existed before so why’s it suddenly a problem?”. It seems to be yet another one out of the bag of rhetorical tricks that wipes the slate of discourse clean, as if all the arguments against it suddenly need to be explained again and none of them had any validity. Not only that, but the OPs are often seemingly disingenuously naive. It provides the OP with a blank slate to continually “just ask questions”, where every response is “but why?”, which forces their interlocutors to keep on elaborating in excruciating detail to the point where they give up trying to explain minutiae. Thus the OP can conclude by default that they were correct and it’s not a problem after all, because they declare nobody has provided them with answers to their satisfaction.

  • @baseless_discourse@mander.xyz
    16
    edit-2
    1 year ago

    This is a violation of GDPR, no?

    EDIT: user-created content is not directly protected under GDPR; only personally identifiable data is protected under GDPR.

    • lemmyreader (OP)
      English
      12
      1 year ago

      Dunno. GDPR is a Europe-only thing, and isn’t it only related to how your private data (like name, IP address, phone number) is handled?

      • @AccountMaker@slrpnk.net
        7
        1 year ago

        Right, I think it only covers personal information: companies can only collect what they need to run their service, users can request to see their data etc. I don’t think it applies to comments and posts.

        • @flux@lemmy.ml
          3
          1 year ago

          Would that kind of provision allow me to have my code removed from a git repository history, if that git repository is hosted by a company?

          • @baseless_discourse@mander.xyz
            1
            edit-2
            1 year ago

            I am not a lawyer, but I believe in general, yes.

            Git is not even that convoluted, as all the history is stored in the .git folder within the repo. Unless there is some convoluted structure built on top, they would only need to move the repo folder to a trash disk, waiting to be formatted.

            That being said, GDPR is somewhat poorly enforced at the moment, unfortunately. I don’t know if you can sue the company and expect some result within a couple of years.

          • I am not an expert or a lawyer, but I believe users actually hold the right to completely erase personal data:

            The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay

            https://gdpr.eu/right-to-be-forgotten/

            Note the word “erasure” as opposed to “anonymize”.

            • @WldFyre@lemm.ee
              4
              1 year ago

              I don’t think that addresses my point. Is my opinion on the new Star Wars movies that I post online, or some lines of code I suggest, “personal data”? I thought personal data had a specific definition under GDPR.

              • @nefonous@lemmy.world
                4
                1 year ago

                You’re totally right, the content of your posts is not considered personal data (because it isn’t). It’s more about profiling data that can be connected back to your actual person.

              • I think you are right, user-generated content doesn’t seem to be protected. This is surprising to me, as users should hold the right to their content, which in my mind should enjoy stronger protection than personal data.

              • Spaenny
                2
                1 year ago

                Technically, they could retain posts from users if they are irreversibly anonymized. However, ensuring with 100% certainty that none of your posts ever contained any personal data that could lead to the identification of you as an individual is challenging. The safest option is therefore to also delete your posts.

  • HexesofVexes
    7
    1 year ago

    I mean, here is a thought: if an AI tool uses Creative Commons data, then its derivatives fall under Creative Commons. I.e. stop charging for AI tools and people will stop complaining.

  • Sibbo
    5
    1 year ago

    Does GDPR apply to stackoverflow? Since my data there probably does not identify me as a person?

  • This shit scares me. It will become so easy to rewrite history from here. Just delete anything you don’t like and have an AI rewrite it into whatever you want. Entire threads rewritten; a company can go back and have your entire post history changed in ways that might be legally compromising.