AI companies have all kinds of arguments against paying for copyrighted content

L4sBot · 2 years ago

AI companies have all kinds of arguments against paying for copyrighted content

@tabular@lemmy.world · 2 years ago

Then feel free to give your copyrighted AI code a free software license :3

Sibbo · 2 years ago

This. If the model and its parameters are open source and under an unrestricted license, they can scrape anything they want in my opinion. But if they make money with someone’s years of work writing a book, then please give that author some money as well.

@abhibeckert@lemmy.world · edit-2 2 years ago

“But if they make money with someone’s years of work writing a book, then please give that author some money as well.”

Why? I’ve read many books on programming, and now I work as a programmer. The authors of those books don’t get a percentage of my income just because they spent years writing the book. I’ve also read (and written) plenty of open source code over the years, and learned from that code. That doesn’t mean I have to give money to all the people who contributed to those projects.

@OrangeCorvus@lemmy.world · 2 years ago

But you bought the books

@Davin@lemmy.world · 2 years ago

Like with most things, consent and intent matter. I went out on Halloween when I was a kid and got free candy, so why is it bad if I break in and steal other people’s candy?

@topinambour_rex@lemmy.world · 2 years ago

You say you are a programmer, not someone writing books about programming. Your argument doesn’t work.

Kühlschrank · 2 years ago

I will never be totally happy with this situation until they’re required to offer a free version of all the models that were created with unlicensed content.

@RanchOnPancakes@lemmy.world · 2 years ago

Well I mean…so do I.

@FireTower@lemmy.world · 2 years ago

Stock image companies probably have the strongest copyright claim here, IMO. An AI trained on their images without paying for a licence could act as a market replacement for their service.

@Mnemnosyne@sh.itjust.works · 2 years ago

The way I see it, if training on copyrighted content is forbidden, then that should apply universally.

Since all people mix together ideas they’ve learned from their own input to create new things, just like AI does, then all people-produced content should also be inherently uncopyrightable, unless produced by a person who has never been exposed to copyrighted content.

Oh, also all copyrighted content should lose its copyright. The only copyrighted content should be the original cave paintings by the first cavemen to develop art, since all art since then uses its influence.

And if this sounds ridiculous, then it’s no less so than arguments that AI shouldn’t be allowed to learn.

@theluddite@lemmy.ml · 2 years ago

Copyright is broken, but that’s not an argument to let these companies do whatever they want. They’re functionally arguing that copyright should remain broken but also they should be exempt. That’s the worst of both worlds.

@Koof_on_the_Roof@lemmy.world · 2 years ago

Yes, it seems they want copyright when it suits them and not when it doesn’t.

@abhibeckert@lemmy.world · edit-2 2 years ago

Who said anything about “do whatever they want”? They should obviously comply with the law.

When a human reads a comment here on Lemmy and learns something they didn’t know before - copyright law doesn’t stop them from using that knowledge. The same rule should apply to AI.

In my opinion if you don’t want AI to learn from your work, then you shouldn’t allow humans to learn from it either. That’s fine - everyone has the right to keep their work private if they choose to do so… but if you make it publicly available, then you don’t get to control who learns from it.

You can control who makes exact replicas of it, and if AI is doing that then sure - charge the company with copyright infringement - but generally that’s not how these systems work. They generally don’t produce exact copies except for highly structured content where there isn’t much creative flexibility (and those tend to not be protected under copyright by the way - they would be protected by patents).

@theluddite@lemmy.ml · 2 years ago

Computers aren’t people. AI “learning” is a metaphorical usage of that word. Human learning is a complex mystery we’ve barely begun to understand, whereas we know exactly what these computer systems are doing; though we use the word “learning” for both, it is a fundamentally different process. Conflating the two is fine for normal conversation, but for technical questions like this, it’s silly.

It’s perfectly consistent to decide that computers “learning” breaks the rules but human learning doesn’t, because they’re different things. Computer “learning” is a new thing, and it’s a lot more like creating replicas than human learning is. I think we should treat it as such.

@BURN@lemmy.world · 2 years ago

I’m so fed up trying to explain this to people. People think LLMs are real GAI and are treating them as such.

Computers do not learn like humans. They cannot, and should not, be regulated in the same way.

@theluddite@lemmy.ml · 2 years ago

Yes 100%. Once you drop the false equivalence, the argument boils down to X does Y and therefore Z should be able to do Y, which is obviously not true, because sometimes we need different rules for different things.

HelloThere · edit-2 2 years ago

“Since all people mix together ideas they’ve learned from their own input to create new things, just like AI does, then all people-produced content should also be inherently uncopyrightable, unless produced by a person who has never been exposed to copyrighted content.”

While copyright and IP law at present is massively broken, this is a very poor interpretation of the core argument at play.

Let me break it down:

Yes, all human-created art takes significant influence, purposefully and accidentally, from work which has come before it.
To have been influenced by that piece, legally, the human will have had to pay the copyright holder to go to the cinema, buy the Blu-ray, see the performance, go to the gallery, etc. Works out of copyright obviously don’t apply here.
To be trained in a discipline, the human likely pays for teaching by others, and those others have also paid copyright holders to view the media that influenced them as well.
Even though the vast majority of art is influenced by all other art, humans are capable of novel invention, i.e. things which have not come before, but GenAI fundamentally isn’t.

Separately, but related, see the arguments the Pirate Parties used to make about personal piracy being OK, which were fundamentally down to an argument of scale:

A teenager pirating some films to watch cos they are interested in cinema, and being inspired to go to film school, is very limited in scope. Even if they pirate hundreds of films, it can’t be argued that it’s hundreds of lost sales, because the person may never have bought them anyway.
A GenAI company consuming literally all artistic output of humanity, with no payment to the artists whatsoever, “learning” to create “new” art without paying for teaching, and spitting out whatever is asked of it, is massive copyright infringement on the consumption side and an existential threat to the arts on the generation side.

That’s the reason people are complaining, cos they aren’t being paid today, and they won’t be paid tomorrow.

@echo64@lemmy.world · 2 years ago

AI legally can’t create its own copyrightable content. Indeed, it cannot learn. It can only produce models that we tune on datasets, those datasets being copyrighted content. I’m a little tired of the anthropomorphizing of AIs. They are statistical models, not children.

No sir, I didn’t copy this book. I trained ten thousand ants to eat cereal, but only after running through an inkwell and then through a maze that I got them to move through in a way that deposits the ink where I need it to be in order to copy this book.

@abhibeckert@lemmy.world · edit-2 2 years ago

The AI isn’t being accused of copyright infringement. Nothing is being anthropomorphized.

Whether you write a copy of a book with a pen, or type it on a keyboard, or photograph every page, or scan it with a machine learning model is completely irrelevant. The question is: did you (the human using the pen/keyboard/camera/AI model) break the law?

I’d argue no, but other people disagree. It’ll be interesting to see where the courts side on it. And, perhaps more importantly, whether new legislation is written to change copyright law.

@Mnemnosyne@sh.itjust.works · 2 years ago

“It can only produce models that we tune on datasets. Those datasets being copyrighted content.”

That’s called learning. You learn by taking in information, then you use that information to produce something new.

@echo64@lemmy.world · 2 years ago

It isn’t. Statistical models do not learn. That’s just how we anthropomorphize them. They bias.

@Bgugi@lemmy.world · 2 years ago

You could say the same about humans.

@echo64@lemmy.world · edit-2 2 years ago

No, you literally cannot. Maybe if you were a techbro who doesn’t really understand how the underlying systems work, but has seen sci-fi and wants to use that to describe the current state of technology.

But you’re still wrong if you try.

@Bgugi@lemmy.world · 2 years ago

Yes, you literally can. At the very deepest level, neural networks work in essentially the same way actual neurons do. All “learning,” artificial or not, is adjusting the interconnections and firing rates between nodes, “biasing” them toward desired outputs.

Humans are a lot more complicated in terms of size and architecture. Our processing has many more layers of abstraction and processing (understanding, emotion, and who knows what else). But fundamentally the same process is occurring: inputs + rewards = biases. Inputs + biases = outputs.

@echo64@lemmy.world · 2 years ago

“At the very deepest level, neural networks work in essentially the same way actual neurons do.”

They do not. Neural networks were inspired by neurons, but it’s a wild oversimplification of both neural networks and neurons to state that they work the same way. This is the kind of thing the sci-fi-watching tech bros will say, but it’s incorrect.

@Mahlzeit@feddit.de · 2 years ago

This thread is interesting reading. Normally, people here complain about capitalism left and right. But when an actual policy choice comes up, the opinions become firmly pro-capitalist. I wonder how that works.

@ThatWeirdGuy1001@lemmy.world · 2 years ago

Everyone’s always up for changing things until it comes to making the actual sacrifices necessary to enact the changes.

@Mahlzeit@feddit.de · 2 years ago

That’s the thing. I don’t see how there is sacrifice involved in this. I would guess that the average user here has personally more to lose than to gain from expanded copyrights.

@Mahlzeit@feddit.de · 2 years ago

So this has been going around in my head for the last couple days. Why are opinions here, on this topic, so decidedly right-wing?

I’ll have to pick this apart.

Copyrights are a form of property. Where such intellectual property is used to make money, it is intangible capital. License payments are capital income. Property is distributed very unequally. Most of it is owned by rich people. Those who demand license payments here are literally demanding that more money should go to rich capitalists.

People who create copyrighted materials for their employers do not own the copyrights thereto. They are just like factory workers who do not own the product either. The people who worked on animations in the pre-CGI era were basically factory workers. When these jobs disappeared due to computers, where was the hand-wringing?

A brush-wielding artist has as much to do with the copyright industry as a pitchfork-wielding farmer with the agro-industry.

This isn’t even normal capitalism but the absolute worst kind. The copyrighted material was uploaded to the net for many reasons, including making a profit. Some people used this publicly available resource to train AIs. The owners contributed no labor. They were affected so little that they mostly seem to have been unaware that anything was going on.

The sole argument for paying seems to be “muh property rights!”. I am not seeing any consideration of the good of society, public benefit, the general welfare, or anything of the sort. Those who say the trained AI models should be released for free seem to imply that the AI companies should not be able to profit, because they looked at someone else’s property.

This is far more capitalistic than even US capitalism.

Consider patents. To get a patent, a new invention has to be registered, which involves publishing how it works in enough detail so that others can copy it. Then the government will enforce a monopoly on that invention for 20 years. During that time, the inventor can demand license fees. But also, other people can learn from it and maybe find other solutions. After those 20 years, the knowledge becomes public domain. This is often framed as a social contract: temporary monopoly in exchange for advancing knowledge. Scientific discoveries don’t get anything at all.

Compared to how copyright is treated by so many here, actual US capitalism looks almost like socialism!

US copyright used to work exactly like patents, with the same duration. Today, copyrights last until 70 years after the death of the creator. It’s just FUBAR. The US Constitution, far-left manifesto that it is, still limits copyright to the purpose of promoting intellectual output (to put it in modern terms). It is supposed to help society and not to enable capitalist grift.

People like to blame corrupt politicians or lobbyists for what is going wrong in the US but perhaps US politics is delivering exactly what people want. They may not like the necessary and predictable outcome of their choices, but it’s still what they want.

Americans left and right curse those evil corporations. Of course, Americans side with the individuals when some faceless corporation tries to bully money out of them. Well, a union is just such a corporation. Look up the definition of corporation if you don’t believe me.

@sugarfree@lemmy.world · 2 years ago

The billion-dollar companies will win, and we’ll be better off for it. AI models need training; the idea that the open internet shouldn’t be used to train them is asinine. AI is the future.