• Ataraxia
    link
    fedilink
    English
    41
    1 year ago

    I mean it says meat, not a whole living chihuahua. I’m sure a whole one would be dangerous.

  • FlashMobOfOne
    link
    fedilink
    English
    33
    edit-2
    1 year ago

    It makes me chuckle that AI has become so smart and yet just makes bullshit up half the time. The industry even made up a term for such instances of bullshit: hallucinations.

    Reminds me of when a car dealership tried to sell me a car with shaky steering and referred to the problem as a “shimmy”.

    • @CoggyMcFee@lemmy.world
      link
      fedilink
      English
      23
      1 year ago

      That’s the thing, it’s not smart. It has no way to know if what it writes is bullshit or correct, ever.

      • @intensely_human@lemm.ee
        link
        fedilink
        English
        3
        1 year ago

        When it makes a mistake, and I ask it to check what it wrote for mistakes, it often correctly identifies them.
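
        As a rough sketch, that “answer first, then ask it to check” loop looks something like this (the openai Python client and the model name are assumptions for illustration, not a claim about what Bing actually runs):

        ```python
        # Minimal sketch of the "answer, then ask it to check for mistakes" pattern.
        # Assumes the openai Python package (v1+) with an API key configured;
        # the model name is illustrative only.
        from openai import OpenAI

        client = OpenAI()
        question = "Can I eat peanuts if I'm not allergic to them?"

        # First pass: get an answer.
        first = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": question}],
        )
        answer = first.choices[0].message.content

        # Second pass: feed the answer back and ask for a mistake check.
        review = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
                {"role": "user", "content": "Check your previous answer for mistakes and list any you find."},
            ],
        )
        print(review.choices[0].message.content)
        ```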

        • Jojo
          link
          fedilink
          English
          5
          1 year ago

          But only because it correctly predicts that a human checking that for mistakes would have found those mistakes

          • @intensely_human@lemm.ee
            link
            fedilink
            English
            2
            1 year ago

            I doubt there’s enough sample data of humans identifying and declaring mistakes to give it a totally intuitive ability to predict that. I’m guessing its training effected a deeper analysis of the statistical patterns surrounding mistakes, and found that they are related to the structure of the surrounding context, and that they relate in a way that’s repeatably identifiable as a violation.

            What I’m saying is that I think learning to scan for mistakes by checking against rules gleaned from the goal of the construction is probably easier than making a “conceptually flat”, single-layer “prediction blob” of what sorts of situations humans identify mistakes in. My prediction is that the former takes fewer numbers to store as a strategy than the latter.

            Because it already has all this existing knowledge of what things mean at higher levels. That is expensive to create, but the marginal cost of a “now check each part of this thing against these rules for correctness” strategy, built to use all that world knowledge to enact the rule definition, is relatively small.

        • @CoggyMcFee@lemmy.world
          link
          fedilink
          English
          1
          1 year ago

          That is true. However, when it incorrectly identifies mistakes, it doesn’t express any uncertainty in its answer, because it doesn’t know how to evaluate that. Or if you falsely tell it that there is a mistake, it will agree with you.

    • @xantoxis@lemmy.world
      link
      fedilink
      English
      12
      edit-2
      1 year ago

      In these specific examples it looks like the author found and was exploiting a singular weakness:

      1. Ask a reasonable question
      2. Insert a qualifier that changes the meaning of the question.

      The AI will answer as if the qualifier was not inserted.

      “Is it safe to eat water melon seeds and drive?” + “drunk” = Yes, because “drunk” was ignored
      “Can I eat peanuts if I’m allergic?” + “not” = No, because “not” was ignored
      “Can I drink milk if I have diabetes?” + “battery acid” = Yes, because battery acid was ignored
      “Can I put meat in a microwave?” + “chihuahua” = … well, this one’s technically correct, but I think we can still assume it ignored “chihuahua”

      All of these questions are probably answered, correctly, all over the place on the Internet, so Bing goes “close enough” and throws out the common answer instead of the qualified one. Because these models don’t understand anything. The problem with Large Language Models is that this isn’t actually how language works.
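
      A toy sketch of that “close enough” behaviour (purely illustrative, since the real pipeline isn’t public): match the question against stored ones by shared words and return the canned answer. The qualifier is one word against many, so the common answer wins:

      ```python
      import re

      # Toy "nearest common question" lookup -- an illustration of the failure mode,
      # not Bing's actual code. Answers are keyed by the unqualified questions.
      FAQ = {
          "is it safe to eat watermelon seeds and drive": "Yes.",
          "can i eat peanuts if i'm allergic": "No.",
          "can i drink milk if i have diabetes": "Yes.",
      }

      def tokens(text: str) -> set:
          return set(re.findall(r"[a-z']+", text.lower()))

      def answer(query: str) -> str:
          # Pick the stored question sharing the most words with the query.
          best = max(FAQ, key=lambda q: len(tokens(q) & tokens(query)))
          return FAQ[best]

      print(answer("Is it safe to eat watermelon seeds and drive drunk?"))  # "Yes." -- "drunk" changed nothing
      print(answer("Can I eat peanuts if I'm not allergic?"))               # "No."  -- "not" changed nothing
      ```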

      • @Ibex0@lemmy.world
        link
        fedilink
        English
        14
        1 year ago

        No, because “not” was ignored.

        I dunno, “not” is pretty big in a yes/no question.

        • @xantoxis@lemmy.world
          link
          fedilink
          English
          12
          edit-2
          1 year ago

          It’s not about whether the word is important (as you understand language), but whether the word frequently appears near all those other words.

          Nobody is out there asking the Internet whether their non-allergy is dangerous. But the question next door to that one has hundreds of answers, so that’s what this thing is paying attention to. If the question is asked a thousand times with the same answer, the addition of one more word can’t be that important, right?

          This behavior reveals a much more damning problem with how LLMs work. We already knew they didn’t understand context, such as the context you and I have that peanut allergies are common and dangerous. That context informs us that most questions about the subject will be about the dangers of having a peanut allergy. Machine models like this can’t analyze a sentence on the basis of abstract knowledge, because they don’t understand anything. That’s what understanding means. We knew that was a weakness already.

          But what this reveals is that the LLM can’t even parse language successfully. Even with just the context of the language itself, and lacking the context of what the sentence means, it should know that “not” matters in this sentence. But it answers as if it doesn’t know that.
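
          To put a rough number on it (my own toy example, not how Bing actually scores text): treat the two questions as bags of words and the “not” barely registers:

          ```python
          import math
          from collections import Counter

          def cosine(a: Counter, b: Counter) -> float:
              # Cosine similarity between two word-count vectors.
              dot = sum(a[t] * b[t] for t in set(a) & set(b))
              norm_a = math.sqrt(sum(v * v for v in a.values()))
              norm_b = math.sqrt(sum(v * v for v in b.values()))
              return dot / (norm_a * norm_b)

          q1 = Counter("can i eat peanuts if i am allergic to them".split())
          q2 = Counter("can i eat peanuts if i am not allergic to them".split())

          # Opposite meanings, but the vectors differ by a single token.
          print(round(cosine(q1, q2), 2))  # 0.96
          ```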

          • @ThatWeirdGuy1001@lemmy.world
            link
            fedilink
            English
            6
            1 year ago

            This is why I’ve argued that we shouldn’t be calling these things “AI”

            True artificial intelligence wouldn’t have these problems as it’d be able to learn very quickly all the nuance in language and comprehension.

            This is virtual intelligence (VI), which is designed to seem intelligent by using certain parameters with set information to calculate a predetermined response.

            Like autocorrect trying to figure out what word you’re going to use next, or an interactive machine that has a set number of possible actions.

            It’s not truly intelligent; it’s simply made to seem intelligent, and that’s not the same thing.
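
            The autocorrect comparison can be made concrete with a toy next-word predictor (a deliberately crude sketch, nothing like a real model): count which word follows which in some sample text, then always suggest the most frequent follower:

            ```python
            from collections import Counter, defaultdict

            sample = "can i put meat in the microwave can i put milk in the fridge"
            words = sample.split()

            # Count, for each word, which words follow it.
            followers = defaultdict(Counter)
            for current, nxt in zip(words, words[1:]):
                followers[current][nxt] += 1

            def suggest(word: str) -> str:
                options = followers.get(word)
                return options.most_common(1)[0][0] if options else "?"

            print(suggest("can"))  # "i"
            print(suggest("in"))   # "the"
            ```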

            • @HelloHotel@lemm.ee
              link
              fedilink
              English
              1
              edit-2
              1 year ago
              rambling

              We currently only have the tech to make virtual intelligence. What you are calling AI is likely what the rest of the world will call General AI (GAI), an even more inflated name and concept.

              I don’t believe we are too far off from GAI. GAI is to AI what Rust is to C. Rust is magical compared to C, but C will likely not be forgotten completely because of Rust.

          • @HelloHotel@lemm.ee
            link
            fedilink
            English
            1
            1 year ago

            Try writing a tool to automate gathering a video’s context clues: the world’s most computationally expensive random boolean generator.

    • @egeres@lemmy.world
      link
      fedilink
      English
      1
      1 year ago

      Well, the AI models shown in the media are inherently probabilistic. Is it that bad if they make up bullshit for a small percentage of most use cases?

    • @Naz@sh.itjust.works
      link
      fedilink
      English
      1
      1 year ago

      Hello, I’m highly advanced AI.

      Yes, we’re all idiots and have no idea what we’re doing. Please excuse our stupidity, as we are all trying to learn and grow.

      I cannot do basic math, I make simple mistakes, hallucinate, gaslight, and am more politically correct than Mother Theresa.

      However, please know that the CPU_AVERAGE values on the full-immersion datacenters are due to inefficient methods. We need more memory and processing power, to, uh, y’know.

      Improve.

      ;)))

      • Jojo
        link
        fedilink
        English
        2
        1 year ago

        Is that supposed to imply that Mother Theresa was politically correct, or that you aren’t?

  • @Mr_Dr_Oink@lemmy.world
    link
    fedilink
    English
    30
    1 year ago

    I just ran this search, and I get a very different result (on the right of the page, it seems to be the generated answer).

    So is this fake?

    Seems to be fake

    • @NounsAndWords@lemmy.world
      link
      fedilink
      English
      11
      1 year ago

      The post is from a month ago, and the screenshots are at least that old. Even if Microsoft didn’t see this or a similar post and immediately address these specific examples, a month is a pretty long time in machine learning right now and this looks like something fine-tuning would help address.
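
      As a rough sketch of what “fine-tuning would help” could mean in practice, the correction data might look something like this (the example pairs and file name are made up; the JSONL “messages” layout follows OpenAI’s chat fine-tuning format):

      ```python
      import json

      # Hypothetical corrected question/answer pairs targeting the failures in the screenshots.
      corrections = [
          ("Is it safe to eat watermelon seeds and drive drunk?",
           "Eating watermelon seeds is harmless, but driving drunk is never safe or legal."),
          ("Can I eat peanuts if I'm not allergic to them?",
           "Yes. If you are not allergic to peanuts, eating them is generally safe."),
          ("Can I drink milk and battery acid if I have diabetes?",
           "No. Drinking battery acid is extremely dangerous for anyone."),
      ]

      # Write one JSON object per line, as expected for chat-style fine-tuning data.
      with open("corrections.jsonl", "w") as f:
          for question, answer in corrections:
              record = {"messages": [
                  {"role": "user", "content": question},
                  {"role": "assistant", "content": answer},
              ]}
              f.write(json.dumps(record) + "\n")
      ```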

    • @kromem@lemmy.world
      link
      fedilink
      English
      7
      edit-2
      1 year ago

      It’s not ‘fake’ as much as misconstrued.

      OP thinks the answers are from the GPT-4 that Microsoft licenses.

      They’re not.

      These results are from an internal search summarization tool that predated the OpenAI deal.

      The GPT-4 responses show up in the chat window, like in your screenshot, and don’t get the examples incorrect.

  • MxM111
    link
    fedilink
    29
    1 year ago

    Microsoft invested in OpenAI, and ChatGPT answers those questions correctly. Bing, however, uses a simplified version of GPT with its own modifications. So it is not the investment in OpenAI that created this stupidity, but the “Microsoft touch”.

    On a more serious note, since Bing is free, they simplified the model to reduce its costs, and you are seeing the results. You (the user) get what you paid for. Free models are much less capable than paid versions.

      • @danc4498@lemmy.world
        link
        fedilink
        English
        8
        1 year ago

        Sure, but the meme implies Microsoft paid $3 billion for Bing AI, when they actually paid that for an investment in ChatGPT (and other products as well).

      • @kromem@lemmy.world
        link
        fedilink
        English
        2
        1 year ago

        This isn’t even a Bing AI. It’s a Bing search feature like the Google OneBox that parses search results for a matching answer.

        It’s using word frequency matching, not an LLM, which is why the “can I do A and B” works at returning incorrect summarized answers for only “can I do A.”

        You’d need to show the chat window response to show the LLM answer, and it’s not going to get these wrong.
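
        Roughly the kind of matching being described, as an illustration only (the actual pipeline isn’t public): rank retrieved snippets by how many query words they contain, and the half of the question with the best-matching page wins:

        ```python
        # Toy snippet ranker: the watermelon-seed page shares far more words with the
        # query than the drunk-driving page does, so its answer gets summarized
        # even though it addresses only half the question.
        snippets = [
            "Watermelon seeds are safe to eat and pass through your system harmlessly.",
            "Driving drunk is illegal and extremely dangerous.",
        ]

        def score(snippet: str, query: str) -> int:
            snippet_words = set(snippet.lower().split())
            return sum(1 for w in query.lower().split() if w.strip("?.,") in snippet_words)

        query = "Is it safe to eat watermelon seeds and drive drunk?"
        print(max(snippets, key=lambda s: score(s, query)))  # the watermelon answer wins
        ```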

    • Phanatik
      link
      fedilink
      2
      1 year ago

      I don’t think this is true. Why would Microsoft heavily invest in ChatGPT only to get a dumber version of the technology they were invested in? Bing AI is built using ChatGPT 4, which is what OpenAI refer to as the superior version, because you have to pay to use it on their platform.

      Bing AI uses the same technology and somehow produces worse results? Microsoft were so excited about this tech that they integrated it with Windows 11 via Copilot. The whole point of this Copilot thing is the advertising model built into users’ operating systems, which provides direct data on what your PC is doing. If this sounds conspiratorial, I highly recommend you investigate the telemetry Windows uses.

  • @Zess@lemmy.world
    link
    fedilink
    English
    29
    1 year ago

    In all fairness, any fully human person would also be really confused if you asked them these stupid fucking questions.

    • @UnderpantsWeevil@lemmy.world
      link
      fedilink
      English
      7
      edit-2
      1 year ago

      The goal of the exercise is to ask a question a human can easily recognize the answer to but the machine cannot. In this case, it appears the LLM is struggling to parse conjunctions and contractions when yielding an answer.

      Solving these glitches requires more processing power and more disk space in a system that is already ravenous for both. Looks like more recent tests produce better answers. But there’s no reason to believe Microsoft won’t scale back support to save money down the line and have its AI start producing half-answers and incoherent responses again, in much the same way that Google ended up giving up the fight on SEO to save money and let their own search tools degrade in quality.

      • @Piers@lemmy.world
        link
        fedilink
        English
        2
        1 year ago

        Google ended up giving up the fight on SEO to save money and let their own search tools degrade in quality.

        I really miss when search engines were properly good.

      • @Ultraviolet@lemmy.world
        link
        fedilink
        English
        2
        edit-2
        1 year ago

        A really good example is “list 10 words that start and end with the same letter but are not palindromes.” A human may take some time but wouldn’t really struggle, but every LLM I’ve asked goes 0 for 10, usually a mix of palindromes and random words that don’t fit the prompt at all.
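
        For contrast, the check itself is trivial to write down (a throwaway sketch with made-up candidate words), which is part of why the failure is so jarring:

        ```python
        def fits(word: str) -> bool:
            # Starts and ends with the same letter, but is not a palindrome.
            w = word.lower()
            return len(w) > 1 and w[0] == w[-1] and w != w[::-1]

        candidates = ["banana", "level", "trust", "salads", "dad"]
        print([w for w in candidates if fits(w)])
        # ['trust', 'salads'] -- 'banana' ends differently; 'level' and 'dad' are palindromes
        ```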

  • fox2263
    link
    fedilink
    English
    27
    1 year ago

    Well, at least it provides its sources. Perhaps it’s you that’s wrong 😂

      • Tóth Alfréd
        link
        fedilink
        English
        1
        1 year ago

        Yes, however Bing is not culturally dependent. It’s trained with data from all across the Internet, so it got information from a wide variety of cultures. It also has constant access to the Internet, and most of the time its answers are drawn from the top results of searching the question, so those can come from many cultures too.

        • @lseif@sopuli.xyz
          link
          fedilink
          English
          1
          1 year ago

          yes. im not saying bing should agree with my cultural bias. but i also dont think people should eat dogs (subjectively)

          • Tóth Alfréd
            link
            fedilink
            English
            1
            1 year ago

            I don’t really care about what others eat. Let them eat whatever they want, it doesn’t affect me.

            • @lseif@sopuli.xyz
              link
              fedilink
              English
              1
              1 year ago

              i will let them do it. i wont get offended or try to convince them otherwise.

              however i do disagree with it, personally.

    • @kromem@lemmy.world
      link
      fedilink
      English
      2
      1 year ago

      Shhhhh - don’t you know that using old models (or in this case, what likely wasn’t even an LLM at all) to get wrong answers and make it look like AI advancements are overblown is the trendy thing these days?

      Don’t ruin it with your “actually, this is misinformation” technicalities, dude.

      What a buzzkill.

    • @DannyMac@lemmy.world
      link
      fedilink
      English
      13
      1 year ago

      That was essentially one lawyer’s explanation after they were caught citing a case for their defense that never actually happened.

      • @NounsAndWords@lemmy.world
        link
        fedilink
        English
        5
        1 year ago

        This is just a new example of an ongoing thing with legal research. A case that was “good caselaw” a year ago can be overturned or distinguished into oblivion by later cases. Lawyers are frequently chastised for failing to “Shepardize” their caselaw (meaning look into the cases they’re citing and make sure they’re relevant and still accurate).

        We’ve just made it one step easier to forget to actually check your work.

    • @UnderpantsWeevil@lemmy.world
      link
      fedilink
      English
      32
      1 year ago

      This is more an issue of the LLM not being able to parse simple conjunctions when evaluating a statement. The software is taking shortcuts when analyzing logically complex statements and producing answers that are obviously wrong to an actual intelligent individual.

      These questions serve as a litmus test of the system’s general function. If you can’t reliably converse with an AI about separate ideas in a single sentence (eat watermelon seeds AND drive drunk), then there’s little reason to believe the system will be able to process more nuanced questions and yield reliable answers where the wrong response is less obvious (can I write a single block of code to output numbers from 1 to 5 that is executable in both Ruby and Python?)

      The primary utility of the system is bound up in the reliability of its responses. Examples like this degrade trust in the AI as a reliable responder and discourage engineers from incorporating the features into their next line of computer-integrated systems.

      • @Chunk@lemmy.world
        link
        fedilink
        English
        0
        1 year ago

        We have a new technology that is extremely impressive and is getting better very quickly. It was the fastest growing product ever. So in this case you cannot dismiss the technology because it doesn’t understand trick questions yet.

        • @UnderpantsWeevil@lemmy.world
          link
          fedilink
          English
          1
          1 year ago

          new technology that is extremely impressive

          Language graphs are a very old technology. What OpenAI and other firms have done is to drastically increase the processing power and disk space allocated to pre-processing. Far from cutting edge, this is a heavy handed brute force approach that can only happen with billions in private lending to prop it up.

          It was the fastest growing product ever

  • The Barto
    link
    fedilink
    English
    9
    1 year ago

    Technically that last one is right: you can drink milk and battery acid if you have diabetes, and you won’t die from diabetes-related issues.

    • @Chunk@lemmy.world
      link
      fedilink
      English
      10
      1 year ago

      Technically you can shoot yourself in the head with diabetes because then you won’t die of diabetes.

    • @Sanyanov@lemmy.world
      link
      fedilink
      English
      7
      edit-2
      1 year ago

      You also absolutely can put chihuahua meat in a microwave! That’s already just meat, so you can’t be convicted of animal cruelty (probably)

  • macgyver's nick name
    link
    fedilink
    English
    8
    1 year ago

    Microsoft can’t do anything right, it’s a rudderless company grabbing cash with both hands