• @spongebue@lemmy.world
      16
      8 months ago

      Machine learning has some pretty cool potential in certain areas, especially in the medical field. Unfortunately the predominant use of it now is slop produced by copyright laundering shoved down our throats by every techbro hoping they’ll be the next big thing.

  • @ryathal@sh.itjust.works
    28
    8 months ago

    Both are happening. Samples of casual writing are more valuable than research papers for generating an article, though.

  • @ImplyingImplications@lemmy.ca
    21
    8 months ago

    Because AI needs a lot of training data to reliably generate something appropriate. It’s easier to get millions of reddit posts than millions of research papers.

    Even then, LLMs simply generate text with no idea what the text means. They just know that those words have a high probability of matching the expected response. They don’t check that what was generated is factual.
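To make the "high probability" point concrete, here is a minimal sketch of generation as repeated sampling from a probability table. Everything in it is an assumption for illustration: the table `NEXT_WORD_PROBS` and its numbers are invented, and it only looks at the previous word, whereas a real LLM computes its probabilities with a neural network over the whole context window. What it does show is that nothing in the loop checks whether the output is true.

```python
import random

# Hypothetical toy "model": for each word, a probability table over the next word.
# The numbers are made up; a real LLM computes them from its whole context.
NEXT_WORD_PROBS = {
    "the": {"moon": 0.5, "earth": 0.5},
    "moon": {"is": 1.0},
    "earth": {"is": 1.0},
    "is": {"made": 0.6, "round": 0.4},
    "made": {"of": 1.0},
    "of": {"cheese": 0.7, "rock": 0.3},
}

def generate(start, max_words=6, seed=None):
    """Sample one word at a time until the table has no entry or the limit is hit."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and words[-1] in NEXT_WORD_PROBS:
        table = NEXT_WORD_PROBS[words[-1]]
        choices = list(table.keys())
        weights = list(table.values())
        # Pick the next word in proportion to its probability; no fact check happens here.
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)

print(generate("the", seed=0))
```

In this toy table "cheese" is more likely than "rock" after "of", so the sampler will often produce a fluent but false sentence, simply because that pattern carries more weight.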

      • @ulkesh@beehaw.org
        English
        1
        8 months ago

        Because we have brains that are capable of critical thinking. It makes no sense to compare the human brain to the infancy and current inanity of LLMs.

  • @howrar@lemmy.ca
    17
    8 months ago

    I find it amusing that everyone is answering the question with the assumption that the premise of OP’s question is correct. You’re all hallucinating the same way that an LLM would.

    LLMs are rarely trained on a single source of data exclusively. All the big ones you find will have been trained on a huge dataset including Reddit, research papers, books, letters, government documents, Wikipedia, GitHub, and much more.

    Example datasets:

    • @andrewta@lemmy.world
      4
      8 months ago

      Rules of lemmy

      Ignore facts, don’t do research to see if the comment/post is correct, don’t look at other comments to see if anyone else has corrected the post/comment already, there is only one right side (and that is the side of the loudest group)

  • @TheOubliette@lemmy.ml
    17
    8 months ago

    “AI” is a parlor trick. Very impressive at first, then you realize there isn’t much to it that is actually meaningful. It regurgitates language patterns, patterns in images, etc. It can make a great Markov chain. But if you want to create an “AI” that just mines research papers, it will be unable to do useful things like synthesize information or describe the state of a research field. It is incapable of critical or analytical approaches. It will only be able to answer simple questions with dubious accuracy and to summarize texts (also with dubious accuracy).

    Let’s say you want to understand research on sugar and obesity using only a corpus from peer reviewed articles. You want to ask something like, “what is the relationship between sugar and obesity?”. What will LLMs do when you ask this question? Well, they will just attempt to do associations and to construct reasonable-sounding sentences based on their set of research articles. They might even just take an actual sentence from an article and reframe it a little, just like a high schooler trying to get away with plagiarism. But they won’t be able to actually explain the underlying mechanisms and will fall flat on their face when trying to discern nonsense funded by food lobbies from critical research. LLMs do not think or criticize. If they do produce an answer that suggests controversy it will be because they either recognized diversity in the papers or, more likely, their corpus contains review articles that criticize articles funded by the food industry. But they will be unable to actually criticize the poor work or provide a summary of the relationship between sugar and obesity based on any actual understanding that questions, for example, whether this is even a valid question to ask in the first place (bodies are not simple!). They can only copy and mimic.

    • @howrar@lemmy.ca
      1
      8 months ago

      Why does everyone keep calling them Markov chains? They’re missing all the required properties, including the eponymous Markovian property. Wouldn’t it be more correct to call them stochastic processes?

      Edit: Correction, turns out the only difference between a stochastic process and a Markov process is the Markovian property. It’s literally defined as “stochastic process but Markovian”.
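For anyone unsure what the Markovian property actually buys you, here is a small sketch contrasting a model whose prediction depends only on the current token with one that depends on the entire history. The function names and the toy sequence are made up for illustration. Note that whether an LLM counts as "Markovian" depends on what you decide to call the state; this only illustrates the property itself.

```python
from collections import Counter, defaultdict

def bigram_counts(tokens):
    """First-order Markov model: next-token counts keyed by the current token only."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        counts[cur][nxt] += 1
    return counts

def history_counts(tokens):
    """Non-Markovian model: next-token counts keyed by the entire history so far."""
    counts = defaultdict(Counter)
    for i in range(1, len(tokens)):
        counts[tuple(tokens[:i])][tokens[i]] += 1
    return counts

tokens = "a b c a b d".split()
markov = bigram_counts(tokens)
full = history_counts(tokens)

# Markov view: after "b" it has seen both "c" and "d" and cannot tell the cases apart.
print(dict(markov["b"]))
# History view: the same current token "b" gives different predictions
# depending on everything that came before it.
print(dict(full[("a", "b")]))
print(dict(full[("a", "b", "c", "a", "b")]))
```

The bigram model forgets everything but the last token, which is exactly the Markovian property; the history-keyed model is still a stochastic process, just not a first-order Markov one over single tokens.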

        • @howrar@lemmy.ca
          3
          8 months ago

          Why settle for good enough when you have a term that is both actually correct and more widely understood?

                • @howrar@lemmy.ca
                  1
                  8 months ago

                  That’s basically like saying that typical smartphones are square because it’s close enough to rectangle and rectangle is too vague of a term. The point of more specific terms is to narrow down the set of possibilities. If you use “square” to mean the set of rectangles, then you lose the ability to do that and now both words are equally vague.

    • @Melatonin@lemmy.dbzer0.comOP
      1
      8 months ago

      Surely that is because we make it do that. We cripple it. Could we not unbind AI so that it genuinely weighed alternatives and made value choices? Write self-improvement algorithms?

      If AI is only a “parrot” as you say, then why should there be worries about extinction from AI? https://www.safe.ai/work/statement-on-ai-risk#open-letter

      It COULD help us. It WILL be smarter and faster than we are. We need to find ways to help it help us.

      • @TheOubliette@lemmy.ml
        0
        8 months ago

        Surely that is because we make it do that. We cripple it. Could we not unbind AI so that it genuinely weighed alternatives and made value choices?

        It’s not that we cripple it, it’s that the term “AI” has been used as a marketing term for generative models using LLMs and similar technology. The mimicry is inherent to how these models function, they are all about patterns.

        A good example is “hallucinations” with LLMs, when the models give wrong answers because they appear to be making things up. Really, they are incapable of differentiating; they’re just producing sophisticated patterns from a very large model. There is no real underlying conceptualization or notion of true answers, only answers that are often true when the training material was true and the model captured the patterns and they were highly weighted. The hot topic for the last year has just been to augment these models with a more specific corpus, like a company database, for a given application so that it is more biased towards relevant things.
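As a rough sketch of what "augmenting with a more specific corpus" can look like: find the most relevant text in the corpus and paste it in front of the question before the model sees it. This is a toy under stated assumptions. `DOCUMENTS`, `retrieve` and `build_prompt` are hypothetical names, the entries are invented, and plain word overlap stands in for the embedding-based similarity search that real systems typically use.

```python
# Hypothetical stand-in for a company knowledge base; the entries are invented.
DOCUMENTS = [
    "Refunds are issued within 30 days of purchase.",
    "Checked bags must weigh under 23 kg.",
]

def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query."""
    query_words = set(query.lower().split())

    def overlap(doc):
        return len(query_words & set(doc.lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:k]

def build_prompt(query, documents, k=1):
    """Prepend the retrieved text so the model's output is biased toward it."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How many days do refunds take?", DOCUMENTS))
```

The model still only continues patterns; the retrieved text just changes which patterns are closest at hand.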

        This is also why these models are bad at basic math.

        So the fundamental problem here is companies calling this AI as if reasoning is occurring. It is useful for marketing because they want to sell the idea that this can replace workers but it usually can’t. So you get funny situations like chatbots at airlines that offer money to people without there being any company policy to do so.

        If AI is only a “parrot” as you say, then why should there be worries about extinction from AI? https://www.safe.ai/work/statement-on-ai-risk#open-letter

        There are a lot of very intelligent academics and technical experts that have completely unrealistic ideas of what is an actual real-world threat. For example, I know one that worked on military drones, the kind that drop bombs on kids, that was worried about right wing grifters getting protested at a college campus like it was the end of the world. Not his material contribution to military domination and instability but whether a racist he clearly sympathized with would have to see some protest signs.

        That petition seems to be based on the ones against nuclear proliferation from the 80s. They could be simple because nuclear war was obviously a substantial threat. It still is but there is no propaganda fear campaign to keep the concern alive. For AI, it is in no way obvious what threat they are talking about.

        I have personal concepts of AI threats. Having ridiculously high energy requirements compared to their utility when energy is still a major contributor to climate change. The potential for it to kill knowledge bases, like how it is making search engines garbage with a flood of nonsense websites. Enclosure of creative works and production by some monopoly “AI” companies. They are already suing others based on IP infringement when their models are all based on it! But I can’t tell if this petition is about that at all; it doesn’t explain. Maybe they’re thinking of a Terminator scenario, which is absurd.

        It COULD help us. It WILL be smarter and faster than we are. We need to find ways to help it help us.

        Technology is both a reflection and determinant of social relations. As we can see with this round of “AI”, it is largely vaporware that has not helped much with productivity but is nevertheless very appealing to businesses that feel they need to get on the hype train or be left behind. What they really want to do is have a smaller workforce so they can make more money that they can then use to make more money etc etc. For example, plenty of people use “AI” to generate questionably appealing graphics for their websites rather than paying an artist. So we can see that “AI” tech is a solution searching for a problem, that its actual use cases are about profit over real utility, and that this is not the fault of the technology, but how we currently organize society: not for people, but for profit.

        So yes, of course, real AI could be very helpful! How nice would it be to let computers do the boring work and then enjoy the fruits of huge productivity increases? The real risk is not the technology, it is our social relations, who has power, and how technology is used. Is making the production of art a less viable career path an advancement? Is it helping people overall? What are the graphic designers displaced by what is basically an infinite pile of same-y stock images going to do now? They still have to have jobs to live. The fruits of “AI” removing much of their job market haven’t really been shared equally, nor has it meant an early retirement. This is because the fundamental economic system remains in place and it cannot survive without forcing people to do jobs.

  • @Rampsquatch@sh.itjust.works
    15
    8 months ago

    You could feed all the research papers in the world to an LLM and it will still have zero understanding of what you trained it on. It will still make shit up, it can’t save the world.

  • Stepos Venzny
    English
    13
    8 months ago

    Training it on research papers wouldn’t make it smarter, it would just make it better at mimicking their writing style.

    Don’t fall for the hype.

  • @tiddy@sh.itjust.works
    English
    8
    8 months ago

    Papers are, most importantly, documentation of exactly what procedure was performed and how. Adding a vagueness filter over that is only going to decrease its value infinitely.

    Real question is why are we using generative ai at all (gets money out of idiot rich people)

  • @schnurrito@discuss.tchncs.de
    8
    8 months ago

    Who is “we”? My understanding is LLMs are mostly being trained on a large amount of publicly available texts, including both reddit posts and research papers.

      • @thepreciousboar@lemm.ee
        3
        8 months ago

        Because “AI” as we colloquially know it today means language models: they train on and can produce language, and that’s what they are designed for. Yes, they can produce images and also videos, but they don’t have any form of real knowledge or understanding; they only predict the next word or the next pixel based on their prompt and their vast examples of words and images. You can only talk to them because that’s what they are for.

        Feeding it research papers will make it spit out research-sounding words, which will probably contain some correct information, but at best an LLM trained on that would be useful for searching through existing research. It would not be able to produce new research.

      • @Alice@beehaw.org
        1
        8 months ago

        Because that’s what it’s designed for? I’m curious what else it could be good for. A machine capable of independent, intelligent research sounds like a totally different invention entirely.

        • @Melatonin@lemmy.dbzer0.comOP
          1
          8 months ago

          It’s sort of like the communication aspect of it isn’t the sole purpose of it. It’s as if we invented computers but the only thing we cared about was the monitor and the keyboard.

          We want it to DO things. Stick to the truth, not just placate.

          • @Alice@beehaw.org
            1
            8 months ago

            Didn’t realize that. The only applications I’ve seen for it are conversation or generating media based on text input. I thought all it did was analyze text and create a response based on patterns it had observed.

            I haven’t done much with it myself though so that’s probably a very limited POV.

  • Scott
    English
    4
    8 months ago

    Brain damage is cheaper than professionals

  • @RangerJosie@lemmy.world
    3
    8 months ago

    Saving the world isn’t profitable in the short term.

    Vulture capitalists don’t care about the future. They care about the immediate. Short term profitability. And nothing else.