Data poisoning: how artists are sabotaging AI to take revenge on image generators

As AI developers indiscriminately suck up online content to train their models, artists are seeking ways to fight back.

  • @gaiussabinus@lemmy.world
    48 · 1 year ago

    This system runs on the assumptions that A) massive generalized scraping is still required, B) the metadata of the original image is maintained, and C) no transformation is applied to the poisoned picture prior to training (Stable Diffusion trains at 512×512). Nowhere in the linked paper do they say they conditioned the poisoned data to conform to the training set. This appears to be a case of fighting the last war.

  • Blaster M
    40 · 1 year ago

    Takes image, applies anti-aliasing and a resize.

    Oh, look at that: defeated by the completely normal process of preparing an image for training.
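
    As a rough illustration of that point (a minimal sketch using Pillow; the 512×512 target and the LANCZOS filter are assumptions for illustration, not what any particular training pipeline actually uses):

```python
from PIL import Image

def prepare_for_training(path: str, size: int = 512) -> Image.Image:
    """Typical training prep: decode the file and resize with a smoothing filter.

    Resampling filters such as LANCZOS average neighbouring pixels, so a
    perturbation keyed to exact pixel values at the original resolution
    may not survive this step intact.
    """
    img = Image.open(path).convert("RGB")
    return img.resize((size, size), Image.LANCZOS)
```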

  • @qooqie@lemmy.world
    25 · 1 year ago

    Unfortunately for them, there are a lot of jobs dedicated to cleaning data, so I’m not sure this would even be effective. Plus there’s an overwhelming amount of data that isn’t “poisoned”, so the poisoned data would just get drowned out if it were never caught.

  • @Potatos_are_not_friends@lemmy.world
    21 · 1 year ago

    Imagine if writers did the same thing by writing gibberish.

    At some point, it becomes pretty easy to devalue that content and create other systems to filter it.

    • @books@lemmy.world
      2 · 1 year ago

      I mean, isn’t that eventually going to happen? Isn’t AI eventually going to get trained on AI-generated datasets, so that small issues start to propagate exponentially?

      I just assume we have a clean pre-AI dataset and a messy, gross post-AI dataset. If it keeps learning from the latter, it will just get worse and worse, no?

      • @General_Effort@lemmy.world
        3 · 1 year ago

        Not really. It’s like with humans: without the occasional reality check, it gets weird, but what people choose to upload is a reality check.

        The pre-AI web was far from pristine, no matter how you define that. AI may improve matters by raising the average quality.

    • @kromem@lemmy.world
      7 · 1 year ago

      Shhhhh.

      Let them keep doing the modern equivalent of “I do not consent for my MySpace profile to be used for anything” disclaimers.

      It keeps them busy on meaningless crap that isn’t actually doing anything but makes them feel better.

  • KᑌᔕᕼIᗩ
    16 · 1 year ago

    Artists and writers should be entitled to compensation when their works are used to train these models, just as any other commercial use would require. But, you know, strict, brutal free-market capitalism for us, not for the megacorps who get a pass because “AI”.

  • @kromem@lemmy.world
    11 · edited · 1 year ago

    This doesn’t actually work. The ingestion pipeline doesn’t even need to do anything special to avoid it.

    Let’s say you draw cartoon pictures of cats.

    And your friend draws pointillist images of cats.

    If you and your friend don’t coordinate, it’s possible you’ll bias your cat images to look like dogs in the data but your friend will bias their images to look like horses.

    Now each of your biasing efforts become noise and not signal.

    Then you need to consider whether you are also biasing the ‘cartoon’ and ‘pointillism’ attributes, and so need to coordinate with the majority of other people making cartoon or pointillist images.

    When you consider the number of different attributes that would need to be biased for a given image, and the compounding number of coordinations that would need to happen at scale to be effective, this is a nonsense initiative. It was an interesting research paper under lab conditions, but it’s the equivalent of a mouse-model or in-vitro cancer cure being taken up by naturopaths as if it’s going to work in humans.
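
    The cancellation claim is easy to check numerically. A toy sketch (pure Python; the unit-strength “bias” each artist applies is a made-up stand-in for a real poisoning scheme):

```python
import random

random.seed(0)

def net_bias(n_artists: int, coordinated: bool) -> float:
    """Toy model of poisoning one attribute (e.g. 'cat').

    Each artist pushes the attribute by unit strength. Coordinated artists
    all push the same way; uncoordinated artists each pick a direction
    independently, so their pushes largely cancel when averaged.
    """
    shifts = []
    for _ in range(n_artists):
        direction = 1.0 if coordinated else random.choice([-1.0, 1.0])
        shifts.append(direction)
    return abs(sum(shifts)) / n_artists  # net bias seen by the trainer

print(net_bias(10_000, coordinated=True))   # exactly 1.0: full-strength signal
print(net_bias(10_000, coordinated=False))  # close to 0: shifts cancel into noise
```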

  • @RagingRobot@lemmy.world
    9 · 1 year ago

    So it sounds like they’re altering the image data in a way that leaves the image looking the same while the underlying data is different. Couldn’t the AI companies just take screenshots of the images to get around this?

  • @Kedly@lemm.ee
    -4 · 1 year ago

    Man, whenever I start getting tired of the number of tankies on Lemmy, the Linux users and the decent AI takes rejuvenate me. The rest of the internet has jumped full throttle onto the AI hate train.

    • @BURN@lemmy.world
      1 · 1 year ago

      The “AI hate train” is people who dislike being replaced by machines, which forces us further into the capitalist machine rather than enabling anyone to have a better life.

      • @fruitycoder@sh.itjust.works
        7 · 1 year ago

        No disagreement, but it’s like hating water because the capitalist machine used to run water mills. It’s a tool; what we hate is the system and the players working to entrench themselves and it. Should we be concerned about the people affected? Yes, of course, and we always should have been, even before it was the “creative class” and white-collar workers at risk. We should have been concerned when it was blue-collar workers being automated away or replaced by workers in areas with repressive regimes. We should have been concerned when it was service workers being increasingly turned into replaceable cogs.

        We should do something, but people are tilting at windmills instead of at the systems that oppress people. We should be pushing for these things to be public goods (open source like Stability is aiming for; distributed and small models like Petals.dev and TinyML). We should be pushing for unions to prevent the further separation of workers from the fruits of their labor (look at the Writers Guild’s demands during their strike). We should be trying to deal only with worker and community cooperatives, so that innovations benefit workers and the community instead of being used against them. And much more! It’s a lot, but it’s why I get mad about people wasting their time being mad that AI tools exist and raging against them instead of actually doing things to address the root issues.

      • @General_Effort@lemmy.world
        2 · 1 year ago

        Not saying there aren’t people like that, but this ain’t it. This tool specifically targets open source. The intention is to ruin things that aren’t owned and controlled by someone. A big part of AI hate is hyper-capitalist like that, though they know better than to say it openly.

        People hoping for a payout get more done than people just being worried or frustrated. So it’s hardly a surprise that they get most of the attention.

      • @Kedly@lemm.ee
        1 · 1 year ago

        Thing is, it’s capitalism that’s our enemy, not the tech that’s freeing us up from labour. It’s not the tech that’s the problem, it’s our society. And it fucking sucks that I’m just as poor as the rest of you, but because I finally have a tool that lets me satisfactorily access my creativity, I’m being villainized by the art community, even though the tech I’m using is open source and no capitalist is profiting off of me.

  • @Sabin10@lemmy.world
    -9 · 1 year ago

    Data poisoning isn’t limited to just AI stuff and you should be doing it at every opportunity.

  • Dr. Moose
    -14 · 1 year ago

    Just don’t put your art out in public if you don’t want someone or something to learn from it. The clinging to relevance and the pompous self-importance are so cringe. So replacing blue-collar work is OK, but some shitty drawings somehow have higher ethical value?

    • @Red_October@lemmy.world
      10 · 1 year ago

      The idea that you would actually object to replacing labor with automation, but think replacing art with automation is fine, is genuinely baffling.

      • Dr. Moose
        -6 · 1 year ago

        Except the “art” AI is replacing is labor. This snobby, ridiculous bullshit that some corporate drawings are somehow more important than other work is super cringe.

      • @Ilovethebomb@lemm.ee
        8 · 1 year ago

        Yeah, no. There’s a difference between posting your work for someone to enjoy, and posting it to be used in a commercial enterprise with no recompense to you.

        • Dr. Moose
          -3 · 1 year ago

          How are you going to stop that? lol, it’s ridiculous. Would you stop a corporate suit from viewing your painting because they might learn how to make a similar one? It makes absolutely zero sense, and I can’t believe the delulus online are failing to comprehend such a simple concept as “computers being able to learn”.

          • Cyber Yuki
            2 · 1 year ago

            Ah yes, just because lockpickers can enter a house suddenly everyone’s allowed to break and enter. 🙄

          • @BURN@lemmy.world
            -6 · 1 year ago

            Computers can’t learn. I’m really tired of seeing this idea paraded around.

            You’re clearly showing your ignorance here. Computers do not learn, they create statistical models based on input data.

            A human seeing a piece of art and being inspired isn’t comparable to a machine reducing that art to 1s and 0s and then adjusting weights in a table somewhere. It does not “understand” the concept, nor did it “learn” about a new piece of art.
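
            For what it’s worth, the “adjusting weights” being described boils down to something like this toy least-squares update (a sketch of the mechanical idea only, not any real model’s training loop):

```python
# Toy version of "adjusting weights based on input data": fit w in y = w*x
# by nudging w against the gradient of the squared error on each sample.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x

w = 0.0
lr = 0.05
for _ in range(200):
    for x, y in data:
        error = w * x - y
        w -= lr * 2 * error * x  # gradient of (w*x - y)^2 with respect to w

print(round(w, 3))  # → 2.0: the weight has converged to the data's slope
```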

            Enforcement is simple. Any output from a model trained on material the trainers don’t hold copyright for is a violation of the copyright of every artist whose art was used illegally to train the model. Only if the copyright holders of all the training data are compensated and have opted in to being used for training would the output of the model be usable.

            • @cm0002@lemmy.world
              -2 · 1 year ago

              they create statistical models based on input data.

              Any output from a model trained on material that they don’t have copyright for is a violation of copyright

              There’s no copyright violation. You said it yourself: any output is just the result of a statistical model, and it would be a fair-use derivative work of the original art (if it falls under copyright at all).

              • @BURN@lemmy.world
                2 · 1 year ago

                Considering most models can spit out training data, that’s not a true statement. Training data may not be explicitly saved, but it can be retrieved from these models.

                Existing copyright law can’t be applied here because it doesn’t cover something like this.

                It 100% should be a copyright infringement for every image generated using the stolen work of others.

                • @cm0002@lemmy.world
                  2 · 1 year ago

                  You can get it to spit out something very close, maybe even exact, depending on how much of your art was used in the training (because that would make your style influence the weights and the model more).

                  But that’s no different from me tracing your art, or taking samples of your art to someone else and paying them to make an exact copy; in that case, that specific output is a copyright violation. Just because it can do that doesn’t mean every output is suddenly a copyright violation.

            • Dr. Moose
              -3 · 1 year ago

              It’s literally in the name. Machine learning. Ignorance is not an excuse.

              • @BURN@lemmy.world
                -3 · edited · 1 year ago

                That’s just one of the dumbest things I’ve heard.

                Naming has nothing to do with how the tech actually works. Ignorance isn’t an excuse. Neither is stupidity

      • Flying Squid
        4 · 1 year ago

        Are you actually suggesting that if I post a drawing of a dog, Disney should be allowed to use it in a movie and not compensate me?

        • @Delta_V@midwest.social
          0 · 1 year ago

          Everyone should be assumed to be able to look at it, learn from it, and add your style to their artistic toolbox. That’s an intrinsic property of all art: when you put it on display, don’t be surprised or outraged when people, or AIs, look at it.

          • @BURN@lemmy.world
            2 · 1 year ago

            AI does not learn from and transform something the way a human does. I have no problem with human artists taking inspiration; I do have a problem with art being reduced to soulless generation that requires stealing real artists’ work to create something that isn’t original.

            • ASeriesOfPoorChoices
              1 · 1 year ago

              1. you don’t know how humans learn and transform something

              2. regardless, it does learn and transform something

            • @Delta_V@midwest.social
              0 · edited · 1 year ago

              AI does not learn and transform something like a human does.

              But they do learn. How human-like that learning is isn’t relevant. A parrot learns to talk differently than a human does too, but African greys can still hold a conversation. Likewise, when an AI learns how to make art by studying what others have made, it may not do it in exactly the same way a human does, but the products of the process are its own creations just as much as the creations of human artists who parrot other human artists’ styles and techniques.

        • @cm0002@lemmy.world
          -3 · 1 year ago

          Ofc not; that’s way different, way beyond public use.

          If I browse to your Instagram, look at some of your art, record some numbers about it, observe your style, and then leave, that’s perfectly fine, right? If I then took my numbers and observations from your art and everybody else’s that I looked at and merged them together to make my own style, that would also be fine, right? Well, that’s AI; that’s all it does, at a simple level.

          • Flying Squid
            3 · 1 year ago

            But they are still profiting off of it. Dall-E doesn’t make images out of the kindness of OpenAI’s heart. They’re a for-profit company. That really doesn’t make it different from Disney, does it?

            • @cm0002@lemmy.world
              0 · 1 year ago

              Sure, Dall-E has a profit motive, but then what about all the open source models that are trained on the same or similar data and artworks?

              • Flying Squid
                3 · 1 year ago

                You’ve strayed very far from:

                if you post publicly, expect it to be used publicly

                What is the difference between Dall-E scraping the art and an open source model doing it other than Dall-E making money at it? It’s still using it publicly.

                • @cm0002@lemmy.world
                  0 · 1 year ago

                  I didn’t really stray far. You brought up that Dall-E has a profit motive, and I acknowledged that yes, that’s true, but there are also open source models that don’t.