• Admittedly, I know little of AI. However, once companies can no longer increase profit with AI, they will use it to save costs instead. This will inevitably lead to mass layoffs, not because AI will correctly determine where to maximize revenue, but because executives don’t understand how AI works, and they don’t understand how their employees contribute to their revenue.

    • @Zaktor@sopuli.xyz
      English
      1
      2 years ago

      It’ll also do the maximizing revenue sort of layoffs, which are also a really bad thing in a society where basic necessities are tied to employment. The execs will also fuck up a bunch in humorous ways, but that’s nothing more than a comforting distraction from the real and present danger automation of this level presents to a society built around employment.

  • NotAPenguin
    4
    2 years ago

    The article doesn’t explain how that’s the case at all.

    Aren’t all the big AI models trained on publicly available data?

    • Hot Saucerman
      English
      1
      edit-2
      2 years ago

      Books3 is the definition of “not publicly available” because it’s all from pirated material downloaded from private torrent tracker Bibliotik.

      Books3 is literally why several AI groups are being sued by various authors like Sarah Silverman and George R.R. Martin.

      Books3 was always illicitly obtained material, which calls into question whether an LLM using it could really fall under Fair Use. (It most likely does, but it’s still a legal question that hasn’t been answered yet.)

      Books3 Link: https://huggingface.co/datasets/the_pile_books3

      Books3 Description from Link:

      This dataset is Shawn Presser’s work and is part of EleutherAi/The Pile dataset.

      This dataset contains all of bibliotik in plain .txt form, aka 197,000 books processed in exactly the same way as did for bookcorpusopen (a.k.a. books1). seems to be similar to OpenAI’s mysterious “books2” dataset referenced in their papers. Unfortunately OpenAI will not give details, so we know very little about any differences. People suspect it’s “all of libgen”, but it’s purely conjecture.