Quite frequently I come across scanned books that are viewable for free online. For example, a publisher put them there (such as preview chapters), or a library did (old books from its collection that are in the public domain). Since I like hoarding data, and the online viewers used to present the books are often not very practical, I frequently try to download the books one way or another. This requires toying with the “inspect element” tool and various other methods of getting the images/PDF. Now, all that I access is what is, well, accessible; I don’t hack into the servers or anything. But the material is meant to be hidden from the normal user. Does that act of hiding the material, no matter how primitive and easily circumvented, mean that I’m not allowed to access it at all?

I suppose ripping a public domain book is no big deal, but would books under copyright fare differently?

Mainly I’m asking out of curiosity; I don’t expect the police to come visit me for ripping a 16th century dictionary.

Note: I live in the EU, but I’d be curious to hear how this is treated elsewhere too.

Edit: I also remembered a funny trick I noticed on one site - it allows viewing PDFs on their website, but not downloading, unless you pay for the PDF. But when you load the page, even without paying, the PDF is already downloaded onto your computer and can be found in the browser cache. Is it legal to simply save the file that is already on your computer?

  • @simple@lemm.ee
    48
    8 months ago

    AFAIK web scraping (the act of grabbing and downloading any data you see available on the internet) isn’t illegal, and I would assume downloading PDFs provided to you online would fall under that. Since it is copyrighted it would probably be illegal to share it, though.

    • @nvermind@lemm.ee
      18
      edit-2
      8 months ago

      This. In a case involving LinkedIn (hiQ Labs v. LinkedIn), US courts ruled that it’s legal to scrape publicly available data. The company doing the scraping was selling that data to corporate customers, but ultimately what you can do might depend on the information you’re accessing and under what permissions. (Not a lawyer)

      • @papalonian@lemmy.world
        1
        8 months ago

        If you scraped a pirate site and stored a bunch of links to copyrighted content you’d probably be fine; actually using those links to download or share copyrighted content is what’s illegal. It’d be like buying the stuff to make a bomb or drugs, but then not making any bombs or drugs.

        That being said, while not necessarily illegal, I wouldn’t want authorities to find my bomb and drug ingredients, or my scraped piracy links, as I’d probably have some 'splainin to do.

        (Not a lawyer)

          • @papalonian@lemmy.world
            1
            8 months ago

            Who said that you can’t scrape content from the other place?

            If you scraped a pirate site and stored a bunch of links to copyrighted content you’d probably be fine,

            If you’re referring to the last line, I said I wouldn’t want authorities to find it because I don’t want to have to explain it. I’m 99% sure nobody would store links to a bunch of pirated content just for fun; they have probably accessed said pirated content. Now you have to explain to the authorities why you have links to pirated content without implicating yourself in copyright infringement.

            Like I said, probably fine, I just wouldn’t want the hassle if I somehow got caught.

              • @papalonian@lemmy.world
                1
                8 months ago

                Sorry man, I’m not exactly sure what you’re asking.

                If you are able to load the content on your computer without infringing copyright laws, you’re allowed to circumvent whatever the website has in place to store whatever data you would like from whatever website you would like, regardless of the nature of the site, so long as the content is legal (is not CP) and again not being presented in a way that infringes aforementioned copyright laws.

                If you’re asking why the copyright laws exist, I can’t really help you with that one.

  • @Vipsu@lemmy.world
    English
    39
    8 months ago

    According to big tech, it’s OK if you’re training a large language model with it.

    • @lugal@lemmy.world
      14
      8 months ago

      You’re confusing the law that applies to the ruling class with the one that applies to common people.

      • @Mango@lemmy.world
        2
        8 months ago

        There’s a law for the ruling class? I always figured they gotta just cut their political buddies in.

      • @SlothMama@lemmy.world
        4
        8 months ago

        Unironically yes. You would not know who Spiderman was without viewing a copyrighted work demonstrating what he looks like, and now you understand why generative AI fundamentally has to ingest copyrighted works.

  • slazer2au
    English
    26
    8 months ago

    As with everything with the law, it depends.

    In Australia, distribution is the illegal part, seeding/sharing is where they get you. Not the actual download itself.

  • @The_v@lemmy.world
    7
    8 months ago

    Not an expert, but in the U.S. making a copy of a broadcast for personal use is legal under fair-use. Anything that loads up on your computer screen you can make a copy and save it for personal use. So screen captures are by definition legal.

    How exactly you copy the material on your screen gets tricky under the DMCA clusterfuck. Breaking encryption to copy the material is illegal unless there is a valid exception for fair-use. What exactly those valid exceptions are is above my paygrade.

  • @Excrubulent@slrpnk.net
    English
    7
    edit-2
    8 months ago

    I’d say if the copyright holder says you’re not allowed to then you’re not. It’s piracy.

    People will tell you that you’ve already downloaded the data so saving it is fundamentally, technically no different, but that doesn’t matter to the law, it’s still piracy.

    Like yeah, it’s absurd and pointless and anti-consumer and anti-knowledge and unenforceable and unsustainable, but that’s copyright. It’s always been that way.

    Copyright destroys culture and piracy is our ethical duty in the face of that. The only reason to care about it is so you don’t get caught.

  • @Etterra@lemmy.world
    6
    8 months ago

    It might be illegal to post it without permission, but you can download it all you damn well please and they can’t stop you. Unless it’s like government top secret something or other. In that case you probably don’t want it anywhere near your computer and should probably tell somebody where you found it.

      • @steeznson@lemmy.world
        1
        8 months ago

        Astonishing listening to the news coverage of that story where the anchors were reading some terminally online nonsense from the teleprompter about Discord “Thug Shakers”

  • @schnurrito@discuss.tchncs.de
    5
    8 months ago

    If it’s in the public domain, it’s almost certainly legal. I don’t have the general answer to your question.

    Really this question shows how outdated copyright law is; in many countries it prohibits “copying”, but in the age of computers nearly all accessing of information involves “copying” it in some way.

  • Wild Bill
    4
    8 months ago

    Mind posting a guide on how you tinker with those inspect element tools?

    • @Excrubulent@slrpnk.net
      English
      2
      edit-2
      8 months ago

      Right click -> inspect element (Q) works.

      You can also press F12.

      And if right click is blocked, on Firefox holding SHIFT will unblock right click. There is also a plugin that does this for you.

      Often websites will put an invisible element in front of the content to intercept this trick, but you can navigate through the elements to find the one they were trying to obfuscate.

      • @Buddahriffic@lemmy.world
        2
        8 months ago

        Also you can just block elements you right click on in Firefox (though this might be an option added by an add-on). If there’s hidden elements you just need to go through each of those until you can click on the one you want directly (and you can tell by what is highlighted in the inspect element mode).

        You can also hit delete in inspect element mode to remove that element. You can also edit whatever you want in the element. Makes me wish it existed back when I was doing more web dev work, would have made things a lot easier when debugging.

    • @antonim@lemmy.dbzer0.comOP
      2
      8 months ago

      (Sorry for the late response.) Well it depends a lot on the site. Since I focus on books and scholarly articles, the ideal way is to find the URL of the original PDF. The website might show you just individual pages as images, but it might hide the link to the PDF somewhere in the code. Alternatively, you might just obtain all the URLs of the individual page images, put them all into a download manager, and later bundle them all into a new PDF. (When you open the “inspect element” window, you just have to figure out which part of the code is meant to display the pages/images to you.) Sometimes the PDFs and page images can be found in your browser cache, as I mention in the OP. There’s quite some variety among the different sites, but with even the most rudimentary knowledge of web design you should be able to figure out most of them.
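
      The "obtain all the URLs of the individual page images" step can be sketched roughly like this in Python. This is only a minimal illustration, assuming the viewer shows a public-domain scan as plain <img> tags in the page source; the markup, the example.org address and the names PageImageCollector and collect_page_images are all made up here, and real sites often load their pages through scripts, in which case this would find nothing.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageImageCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Turn relative paths into absolute URLs a download manager can use.
            self.urls.append(urljoin(self.base_url, src))


def collect_page_images(html, base_url):
    """Return the absolute image URLs found in an HTML string, in page order."""
    parser = PageImageCollector(base_url)
    parser.feed(html)
    return parser.urls


# Hypothetical viewer markup, invented for illustration only.
SAMPLE_HTML = """
<div class="viewer">
  <img src="/scans/book1/page-001.jpg">
  <img src="/scans/book1/page-002.jpg">
  <img alt="spinner">
</div>
"""

urls = collect_page_images(SAMPLE_HTML, "https://example.org/reader/book1")
for url in urls:
    print(url)
```

      The printed list is what you would paste into a download manager; bundling the downloaded images into a PDF is a separate step that this sketch does not cover.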

      If you need help with ripping something in particular, DM me and I’ll give it a try.

  • AlphaOmega
    English
    4
    8 months ago

    Everything on the Internet can be downloaded, copied, etc.

    • @accideath@lemmy.world
      2
      8 months ago

      That sadly isn’t true everywhere. Here in Germany (and I suspect large parts of the EU) downloading/streaming copyrighted content without license used to be a grey area but has been completely illegal for a few years now.

      Of course, VPNs are perfectly legal.