• @IchNichtenLichten@lemmy.world
        English
        17
        1 year ago

        You’ll get your refund eventually but first it will try and gaslight you that Air Canada is a woke mind virus before calling you an asshole and then stalking you.

        • @pdxfed@lemmy.world
          English
          2
          1 year ago

          “instead of the $3.50 refund, I’m also authorized to offer you some June 2025 $350 GME calls.”

      • SonnyVabitch
        English
        1
        1 year ago

        I just want to mark the occasion when my previous comment is on 69 points. Noice.

    • FaceDeer
      9
      1 year ago

      Negative examples are often just as useful for training an AI as positive ones. And it all depends on what you want to use the AI for. A moderator bot, for example, needs familiarity with the whole range of user responses it might see.

      • @aidan@lemmy.world
        English
        4
        edited
        1 year ago

        That actually gives me a fun idea for a Lemmy instance: an automated review process that bans posts/comments that are too similar in style to Reddit posts/comments.

    • Lvxferre [he/him]
      English
      5
      1 year ago

      An LLM that behaves like a typical Redditor? // What possible use is that?

      • [You] “Chatbot, please tell me which pokemon types are strong against Fairy.”
      • [Le Lebbit Moronbot] “I’m not sure if I understand, you calling me a chatbot? I’m so confused lol”
      • [You] “Moronbot, please tell me which pokemon types are strong against Fairy.”
      • [LLM] “Actually, you should be spelling it “Pokémon” lol”
      • [You] “Moronbot, which types are strong against Fairy?”
      • [LLM] “I assume you talking about fairies. Fairies are from mythology lmao”
      • [You] “Did people really waste water and electricity for this trash?”
      • [LLM] “Waaah, you’re toxic!!111one”
  • @garibaldi_biscuit@lemmy.world
    English
    98
    1 year ago

    This is what the 3rd party access to API was really all about.

    When API access was allowed, all Reddit content was effectively free. They needed to ban 3rd party apps so they could sell the accumulated content. I expect using content to train AI also factors into it.

  • Tiger Jerusalem
    English
    85
    edited
    1 year ago

    Reddit is a trove of user built content under the guise of community. What Spez did was to say “thanks for all the free work, suckers!”, put a price sticker on it, and laughed all the way to the bank.

    And this is why I’m not active on any Internet community anymore. Nevermind, I guess I just can’t help myself…

      • Tiger Jerusalem
        English
        6
        1 year ago

        Active as in “creating meaningful contributions and contributing to the overall knowledge base”. I still shit post from time to time.

        • pewter
          English
          6
          1 year ago

          This is going to be a really weird thing to argue, but I just casually read through a bunch of your comments and they seem like meaningful contributions.

          • Tiger Jerusalem
            English
            4
            1 year ago

            Well, I guess I can’t help myself… I’ll shitpost more from now on 😅

      • @xorollo@lemmy.world
        English
        4
        1 year ago

        Somebody asked ChatGPT to appear to be a normal internet user to populate the comments section to manufacture content for normal Internet users to respond to so that they can continue building up their training models.

      • Scratch
        English
        10
        1 year ago

        What are the odds that they kept it in a backup?

        • @Crack0n7uesday@lemmy.world
          English
          6
          1 year ago

          Some 4chan users created a backup bot that auto saves every few hours, so if reddit didn’t do it already, 4chan has been doing it for a while. The bot was originally made for 4chan but repurposed for other websites, reddit included.

        • @Dozzi92@lemmy.world
          English
          5
          1 year ago

          Yeah, it’s all too late. Shit, PRISM was 2007, so there’s a copy of everything somewhere. Obviously different ends.

        • RBG
          English
          1
          1 year ago

          Depends. If they were smart they backed up all content that had a certain number of upvotes and/or a certain number of paragraphs and/or responses. Just to weed out all the 2-3 word comments that no one interacted with. If OP wrote mostly those then Reddit doesn’t give a shit about them deleting those.

    • FaceDeer
      -93
      1 year ago

      I do. It’s frankly selfish. Having an AI get training on my old comments costs me nothing and it results in the development of useful AI tools. Trying to sabotage that is petty and pointless. It’s not like you could somehow collect the fraction of a pittance that you think you’re owed retroactively. I never commented on Reddit thinking “awesome, I’m going to make bank on the content I’m generating here.”

      People complain about the capitalist mindset of the world and then they do this. Sigh.

      • @Nurse_Robot@lemmy.world
        English
        69
        1 year ago

        Defending giant corporations profiting off of uncompensated individuals, while criticizing anyone who doesn’t want to provide free labor to said corporations, is a disgusting take. Are you a CEO?

        • FaceDeer
          -34
          1 year ago

          The more accessible training data there is, the easier it is for new AI projects to enter the field and the less dominant those “giant corporations” become.

          The free labour was already freely given. If someone doesn’t want to have shitposted on Reddit for free then maybe they shouldn’t have shitposted on Reddit for free.

          • @Nurse_Robot@lemmy.world
            English
            27
            1 year ago

            “if you didn’t want me to steal your intellectual property, you shouldn’t have thought of it in the first place”

            • @QuaternionsRock@lemmy.world
              English
              7
              edited
              1 year ago

              No, you shouldn’t have posted it to Reddit, in which you were required to give them a perpetual license to use your IP in any way they see fit.

              For the record, I’m here because Reddit pissed me off when they axed the free API, and I’m pissed at myself for not expecting it. That’s what I get for accepting their terms and conditions, I guess.

              Edit: I also don’t accept the idea that using my content for training data is “fair use” when it is used to train proprietary models, especially ones in which the end user is allowed to prompt it to plagiarize or otherwise imitate my content.

            • @Fungah@lemmy.world
              English
              4
              1 year ago

              So, for an example of what the other user was talking about, I’m just some guy and for my first foray into programming / machine learning (I kind of just threw myself into the deep end) I modified stylegan 3 and trained it on about 500g of reddit porn that I scraped off reddit.

              Now, I stopped the training after about a week (it was going to take about a solid month on my rtx 2080 ti) when I found out stable diffusion existed but I learned a LOT from that experience.

              I couldn’t do that now. Arguably none of that was how any of that should be done but whatever.

            • FaceDeer
              -21
              1 year ago

              I’m not sure what you mean here. Nothing’s being stolen. Even if you think there needs to be permission for training an AI off of data, Reddit has that permission.

              • @Nurse_Robot@lemmy.world
                English
                12
                1 year ago

                I assume you’re more of a moron than a troll, which is disappointing. Regardless, you’re not worth my time, as I don’t think any argument could convince you to have an open mind and be willing to change. Good luck out there!

      • @TORFdot0@lemmy.world
        English
        21
        1 year ago

        I had an 11 year old account that I deleted all my old comments and posts from because of the API debacle. Does that make me selfish that I felt like Reddit wasn’t holding up its end of the unwritten agreement?

        Reddit doesn’t deserve my content any more than I deserve access from the third party API.

        • FaceDeer
          -18
          1 year ago

          If you did it over the API debacle then you’re not one of the people I’m talking about here. This is about people deleting their content to prevent it from being used to train AIs.

          • @Voyajer@lemmy.world
            English
            17
            edited
            1 year ago

            Do you not remember the real reason why the API debacle happened in the first place was to prepare for this moment? It was always about easy access to training data, third party apps got caught in the crossfire.

            • FaceDeer
              -18
              edited
              1 year ago

              That’s ignoring an awful lot of other considerations. Obviously Reddit hasn’t explained itself in a trustworthy way, but a common belief at the time was that it was to force people to use the official Reddit mobile app so they could be subject to advertising.

      • Zellith
        15
        1 year ago

        Selfish? Perhaps you forget why people deleted their content in the first place.

      • @Voyajer@lemmy.world
        English
        13
        1 year ago

        It’s their comment to do with as they see fit. I can’t get mad at them for wanting to erase their presence on a site they don’t use anymore.

        • FaceDeer
          -21
          1 year ago

          And I’m free to judge them however I wish for their actions and intent.

      • @Hackerman_uwu@lemmy.world
        English
        2
        1 year ago

        What about people who just think “A.I.” is dog shit and chat bots are a dumb obsession steering the industry in the wrong direction due to hype and money?

        • FaceDeer
          -4
          1 year ago

          What about them? I don’t see why they’d care what AI companies are doing in that case. They’d assume they were just wasting money on this stuff.

      • @gedaliyah@lemmy.world
        English
        2
        1 year ago

        For me it’s a privacy matter. Going through old posts (whether human or machine learning) cannot be used for anything good.

  • NutWrench
    English
    32
    1 year ago

    Reddit is all bots, porn, ads and political shit posts. Good luck getting any useful training content out of that.

    • @ladicius@lemmy.world
      English
      16
      1 year ago

      Maybe that’s the point? Training the AI to produce the blabbering bullshit that’s preferred in social media?

    • @PoliticalAgitator@lemmy.world
      English
      8
      1 year ago

      They don’t care if the AI produced is useful, they just want to milk as much money from their content as they can.

      The API changes were almost certainly just the groundwork for this and I called it at the time. The ridiculous pricing model for API access is because it’s aimed at the hottest tech companies, not third party app developers.

      The enshittification continues because it’s what neoliberalism demands. They’ll sell your content and the data they have about you and still show you ads, because that’s the most profitable. Ethics and product quality don’t even enter into it.

      • @Ilgaz@lemm.ee
        English
        1
        1 year ago

        Liberal market gives end users choice. If they don’t choose, they get the consequences.

        This is more like people choosing Trump-like types and then complaining. Alternatives exist, choose one.

        • @PoliticalAgitator@lemmy.world
          English
          1
          edited
          1 year ago

          “The free market can fix it” is just another neoliberal lie, pushed precisely because it doesn’t work. Rather than holding corporations accountable, it blames the population instead.

          The reality is that boycotting businesses isn’t always an option and when it is, it’s usually a luxury. Very few products are domestically and/or ethically produced and when they are, they’re extremely expensive, especially for people being fucked out of every cent by their bosses, landlords and utilities.

          It’s why the most hated companies in the world continue to bring in record profits.

          Regulations are the real answer, which is why neoliberals oppose them.

          • @Ilgaz@lemm.ee
            English
            -3
            edited
            1 year ago

            I really don’t care about people who behave like they are living in North Korea or who want a North Korean World to live in.

            Even Digg people could say “No, F you” to Digg superstar owners. It is just a damn URL to type.

    • Queen HawlSera
      English
      1
      1 year ago

      I wish it would die, because honestly some of the porn was great and Lemmy seems to be the one place on the net that doesn’t specifically ban porn, yet has none of it anyway.

      I miss bodyswap and part tf captions…

  • ozoned
    English
    30
    1 year ago

    “Reddit has given access to YOUR conversations and posts to AI companies.” FTFY

    These were created by people, for people, and I will ALWAYS disagree that this data is Reddit’s or any other platform’s.

    Don’t forget your direct messages aren’t end to end encrypted on Reddit, so now AI will be trained on your craziest “private” conversations

  • etrotta
    27
    1 year ago

    Out of all things to hate Reddit for, giving data to AI isn’t something fediverse users can really criticize it for, though making money from it perhaps.
    Remember: All data in federated platforms is available for free and likely already being compiled into datasets. Don’t be surprised if this post and its comments end up in GPT5 or 6 training data.

    • FaceDeer
      5
      1 year ago

      After all the hue and cry I have seen over stuff like Threads and Bluesky federation I don’t imagine most people using the Fediverse have a particularly coherent philosophy on the matter.

    • @BrianTheeBiscuiteer@lemmy.world
      English
      5
      1 year ago

      If they already, essentially, cut off API access then it’s not a big leap to limit access on the web to logged in users only and rate limit or ban accounts that behave like scrapers.

    • @ColeSloth@discuss.tchncs.de
      English
      2
      1 year ago

      No. I can. Reddit was bought out, uses volunteers to control all the subs but forcefully removes you from the sub you created and were supposed to have control over if you didn’t play by their ever-changing rules, ruined/eliminated third party apks by demanding WAY over ad revenue profits to have access to the API with very short notice, and shadow banned anyone and everyone in a position to do anything about any of it. It’s a corporation that gutted an entire platform in order to push agendas they want and milk as much money out of it as possible. Hell, it’s the entire reason all of lemmy gets more than 30 posts a day. So many people switched to lemmy over the past year. They ruined a website I enjoyed and I’d rather them not make more money from the thousands of posts I made from over a decade of being there.

  • @Bobmighty@lemmy.world
    English
    25
    1 year ago

    With Reddit’s severe bot problem, it’ll be like training on unfiltered sewage. Garbage in, garbage out.

  • SVcrossDO
    English
    23
    1 year ago

    Damn it. I haven’t deleted my account due to how many people I’ve supported and helped, though I stopped using it a while ago. It seems I’ll have to.

    • FaceDeer
      1
      1 year ago

      I’m kind of puzzled by this mindset. You were pleased with supporting and helping people before, but now supporting and helping is bad?

      • SVcrossDO
        English
        1
        1 year ago

        I’m happy that everyone has the support, but not that some specific AI can monetize that same support. I left on my Reddit account ways to contact me (including Lemmy). I helped others so good vibes could reach them, not for making the rich richer.

        • FaceDeer
          1
          1 year ago

          Fortunately there are a lot of open source models these days too.

  • @Yokozuna@lemmy.world
    English
    23
    1 year ago

    Good thing I scrubbed all of my posts and comments that I could. Fuck that site, straight up and down.

        • @meat_popsicle@sh.itjust.works
          English
          10
          1 year ago

          Thanks to federation, the copies of the eggs are. You can’t stop one instance from selling data sourced from federated content until it’s too late.

      • @drathvedro@lemm.ee
        English
        7
        1 year ago

        You can’t put a price tag on it. Nothing is stopping anyone from scraping all of the data for free.

      • @MostlyGibberish@lemm.ee
        English
        6
        1 year ago

        The only thing stopping them is the fact that anyone who wants the data can just utilize the federation protocol to take any data they want, and there’s not a lot anyone can do about it. You can’t sell something that’s trivial to get for free.

        If the question you’re really asking is “what’s stopping content on Lemmy/Mastodon/etc from being used to train an LLM?” the answer is, nothing.

      • @Toneswirly@lemmy.world
        English
        5
        1 year ago

        mass user exodus to one of the many other identical Instances. Also, data brokers prolly aren’t interested in going after each Instance because no one instance has enough data to make it worthwhile. Yet again, the fediverse proves its resistance to enshittification.

      • @Ilgaz@lemm.ee
        English
        1
        1 year ago

        I wished they had evil lawyers looking after such stuff and sold strictly opt in data to AI corps. Free for FOSS though.

    • Fake4000
      English
      36
      1 year ago

      You signed it all away the moment you scrolled down that EULA 😂

      • admiralteal
        28
        edited
        1 year ago

        Can’t wait for the day a major court declares EULAs universally nonbinding outside of the most common-sense terms. Even though I doubt it will ever happen.

        “We can store and display your content and use stuff you publicly post as examples in advertisements for our platform” is pretty common sense.

        “We can use the things you post to do complex data analytics to package and sell your identity to advertisers” is fucking sus.

        “We can use the things you post to train ANN generative systems to build next-generation technologies to impersonate you and your peers” is simply nuts.

        The idea that displaying an EULA with an “agree” button is informed consent is just preposterous. Even lawyers don’t read them.

        • @Shdwdrgn@mander.xyz
          English
          4
          1 year ago

          Seems like it would never stand up in court. Prove that -I- agreed to anything. To do that, you first have to prove that nobody has ever created an account under my name, and more importantly, prove that Reddit accounts have never been hacked and that the person who clicked the button was even in my household. And if they keep that extensive of records to where they can follow every action taken by every user on the platform, it also implies that they are tracking my personal actions even before I agreed to anything.

          On the other hand, do they actually have a EULA? It’s been almost 14 years since I created my account, and there certainly wasn’t anything about selling my data for AI training when I signed up. If they change the terms of service, they are responsible for notifying everyone, otherwise they can’t claim that anyone agreed to these changes.

          I’m sure their lawyers could weasel their way through it somehow, but it still seems to come down to them claiming they changed the agreement without notification but the users should still be legally bound by the new terms?

    • FaceDeer
      -8
      1 year ago

      The classic “screw everyone else, I want mine.”

      What fraction of a penny do you think you’re owed?