• @VintageGenious@sh.itjust.works
    link
    fedilink
    English
    56
    3 months ago

    Because you’re using it wrong. It’s good for generative text and chains of thought, not symbolic calculations such as math or linguistics.

    • @Grandwolf319@sh.itjust.works
      link
      fedilink
      English
      22
      3 months ago

      Because you’re using it wrong.

      No, I think you mean to say it’s because you’re using it for the wrong use case.

      Well this tool has been marketed as if it would handle such use cases.

      I don’t think I’ve actually seen any AI marketing that was honest about what it can do.

      I personally think image recognition is the best use case as it pretty much does what it promises.

      • @scarabic@lemmy.world
        link
        fedilink
        English
        -1
        3 months ago

        Really? AI has been marketed as being able to count the r’s in “strawberry?” Please link to this ad.

      • @L3s@lemmy.worldM
        link
        fedilink
        English
        16
        edit-2
        3 months ago

        Writing customer/company-wide emails is a good example. “Make this sound better: we’re aware of the outage at Site A, we are working as quick as possible to get things back online”

        Dumbing down technical information: “word this so a non-technical person can understand: our DHCP scope filled up and there were no more addresses available for Site A, which caused the temporary outage for some users”

        Another is feeding it an article and asking for a summary; https://hackingne.ws does that for its Bsky posts.

        Coding is another good example: “write me a Python script that moves all files in /mydir to /newdir”

        Asking it to summarize a theory or protocol: “explain to me why RIP was replaced with RIPv2, and what problems people have had since with RIPv2”
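
        For the Python prompt, a minimal sketch of the kind of script you might get back (and should still review before running) could look like this. The function name move_all_files is made up for illustration, and it assumes only regular files directly inside the folder should move, not subfolders:

```python
import shutil
from pathlib import Path


def move_all_files(src_dir: str, dst_dir: str) -> list[str]:
    """Move every regular file directly inside src_dir into dst_dir.

    Returns the names of the files that were moved, in sorted order.
    Subdirectories are left where they are (an assumption of this sketch).
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if missing

    moved = []
    for entry in sorted(src.iterdir()):
        if entry.is_file():  # skip subdirectories
            shutil.move(str(entry), str(dst / entry.name))
            moved.append(entry.name)
    return moved
```

        Calling move_all_files("/mydir", "/newdir") would then do what the prompt asks, assuming both paths are accessible to the user running it.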

        • @Corngood@lemmy.ml
          link
          fedilink
          English
          21
          3 months ago

          Make this sound better: we’re aware of the outage at Site A, we are working as quick as possible to get things back online

          How does this work in practice? I suspect you’re just going to get an email that takes longer for everyone to read, and doesn’t give any more information (or worse, gives incorrect information). Your prompt seems like what you should be sending in the email.

          If the model (or context?) was good enough to actually add useful, accurate information, then maybe that would be different.

          I think we’ll get to the point really quickly where a nice concise message like in your prompt will be appreciated more than the bloated, normalised version, which people will find insulting.

          • @L3s@lemmy.worldM
            link
            fedilink
            English
            11
            edit-2
            3 months ago

            Yeah, normally my “Make this sound better” or “summarize this for me” is a longer wall of text that I want to simplify, I was trying to keep my examples short. Talking to non-technical people about a technical issue is not the easiest for me, AI has helped me dumb it down when sending an email, and helps correct my shitty grammar at times.

            As for accuracy, you review what it gives you, you don’t just copy and send it without review. Also you will have to tweak some pieces that it gives out where it doesn’t make the most sense, such as if it uses wording you wouldn’t typically use. It is fairly accurate though in my use-cases.

            Hallucinations are a thing, so validating what it spits out is definitely needed.

            Another example: if you feel your email is too stern or gives the wrong tone, I’ve used it for that as well. “Make this sound more relaxed: well maybe if you didn’t turn off the fucking server we wouldn’t of had this outage!” (Just a silly example)

            • @otp@sh.itjust.works
              link
              fedilink
              English
              14
              3 months ago

              As for accuracy, you review what it gives you, you don’t just copy and send it without review.

              Yeah, I don’t get why so many people seem to not get that.

              It’s like people who were against Intellisense in IDEs because “What if it suggests the wrong function?”…you still need to know what the functions do. If you find something you’re unfamiliar with, you check the documentation. You don’t just blindly accept it as truth.

              Just because it can’t replace a person’s job doesn’t mean it’s worthless as a tool.

              • @Voroxpete@sh.itjust.works
                link
                fedilink
                English
                8
                3 months ago

                The issue is that AI is being invested in as if it can replace jobs. That’s not an issue for anyone who wants to use it as a spellchecker, but it is an issue for the economy, for society, and for the planet, because billions of dollars of computer hardware are being built and run on the assumption that trillions of dollars of payoff will be generated.

                And correcting someone’s tone in an email is not, and will never be, a trillion dollar industry.

              • @Grandwolf319@sh.itjust.works
                link
                fedilink
                English
                6
                3 months ago

                Yeah, I don’t get why so many people seem to not get that.

                The disconnect is that those people use their tools differently: they want to rely on the output, not use it as a starting point.

                I’m one of those people; reviewing AI slop is much harder for me than just summarizing it myself.

                I find function name suggestions useful because they’re a lookup tool. That’s not the same as a summary tool, which doesn’t help me find a needle in a haystack; it just hands me a needle when I already have access to many needles. I want the good/best needle, and it can’t do that.

            • @Voroxpete@sh.itjust.works
              link
              fedilink
              English
              6
              3 months ago

              I think these are actually valid examples, albeit ones that come with a really big caveat; you’re using AI in place of a skill that you really should be learning for yourself. As an autistic IT person, I get the struggle of communicating with non-technical and neurotypical people, especially clients who you have to be extra careful with. But the reality is, you can’t always do all your communication by email. If you always rely on the AI to correct your tone or simplify your language, you’re choosing not to build an essential skill that is every bit as important to doing your job well as it is to know how to correctly configure an ACL on a Cisco managed switch.

              That said, I can also see how relying on the AI at first can be a helpful learning tool as you build those skills. There’s certainly an argument that by using tools, but paying attention to the output of those tools, you build those skills for yourself. Learning by example works. I think used in that way, there’s potentially real value there.

              Which is kind of the broader story with Gen AI overall. It’s not that it can never be useful; it’s that, at best, it can only ever aspire to “useful.” No one, yet, has demonstrated any ability to make AI “essential” and the idea that we should be investing hundreds of billions of dollars into a technology that is, on its best days, mildly useful, is sheer fucking lunacy.

              • snooggums
                link
                fedilink
                English
                4
                3 months ago

                If you always rely on the AI to correct your tone or simplify your language, you’re choosing not to build an essential skill that is every bit as important to doing your job well as it is to know how to correctly configure an ACL on a Cisco managed switch.

                This is such a good example of how AI/LLMs/whatever are being used as a crutch that is far more impactful than using a spellchecker. A spellchecker catches typos or helps with unfamiliar words, but doesn’t replace the underlying skill of communicating to your audience.

          • @earphone843@sh.itjust.works
            link
            fedilink
            English
            2
            3 months ago

            It works well. For example, we had a work exercise where we had to write a press release based on an example, then write a Shark Tank pitch to promote the product we came up with in the release.

            I gave AI the link to the example and a brief description of our product, and it spit out an almost perfect press release. I only had to tweak a few words because there were specific requirements I didn’t feed the AI.

            Then I told it to take the press release and write the pitch based on it.

            Again, very nearly perfect with only having to change the wording in one spot.

        • lurch (he/him)
          link
          fedilink
          English
          3
          3 months ago

          It’s not good for summaries. It often gets important bits wrong, like embedded instructions that can’t be summarized.

          • @L3s@lemmy.worldM
            link
            fedilink
            English
            0
            edit-2
            3 months ago

            My experience has been very different, though I do sometimes have to add to what it summarized. The Bsky account mentioned is a good example: most of the posts are very well summarized, but every now and then there will be one that isn’t as accurate.

        • snooggums
          link
          fedilink
          English
          1
          edit-2
          3 months ago

          The dumbed down text is basically as long as the prompt. Plus you have to double check it to make sure it didn’t have outrage instead of outage just like if you wrote it yourself.

          How do you know the answer on why RIP was replaced with RIPv2 is accurate and not just a load of bullshit like putting glue on pizza?

          Are you really saving time?

          • @L3s@lemmy.worldM
            link
            fedilink
            English
            2
            edit-2
            3 months ago

            Yes, I’m saving time. As I mentioned in my other comment:

            Yeah, normally my “Make this sound better” or “summarize this for me” is a longer wall of text that I want to simplify, I was trying to keep my examples short.

            And

            and helps correct my shitty grammar at times.

            And

            Hallucinations are a thing, so validating what it spits out is definitely needed.

            • snooggums
              link
              fedilink
              English
              -4
              3 months ago

              How do you validate the accuracy of what it spits out?

              Why don’t you skip the AI and just use the thing you use to validate the AI output?

              • @L3s@lemmy.worldM
                link
                fedilink
                English
                -1
                3 months ago

                Most of what I’m asking it are things I have a general idea of, and AI has the capability of making short explanations of complex things. So typically it’s easy to spot a hallucination, but the pieces that I don’t already know are easy to Google to verify.

                Basically, I get a shorter response with the same outcome and only have to validate the small pieces, which saves a lot of time (I no longer have to read a 100-page white paper, just a few paragraphs, and then verify small bits).

            • snooggums
              link
              fedilink
              English
              -2
              edit-2
              3 months ago

              If the amount of time it takes to create the prompt is the same as it would have taken to write the dumbed down text, then the only time you saved was not learning how to write dumbed down text. Plus you need to know what dumbed down text should look like to know if the output is dumbed down but still accurate.

      • @slaacaa@lemmy.world
        link
        fedilink
        English
        2
        3 months ago

        I have it write emails in German for me. I moved there not too long ago, and it works wonders for getting doctor’s appointments, car service, etc. I also have it explain the text, so I’m learning the language.

        I also use it as an alternative to internet search, which is now terrible. It’s not going to help you find something super location-specific, but I can ask it to tell me something about a game/movie without spoilers, or list Metacritic scores in a table, etc.

        It also works great for summarizing long texts.

        An LLM is a tool; what matters is how you use it. It is stupid, it doesn’t think, and it’s mostly hype to call it AI. But it definitely has its benefits.

      • @verdigris@lemmy.ml
        link
        fedilink
        English
        1
        3 months ago

        I mean, I would argue that the answer in the OP is a good one. No human asking that question honestly wants to know the sum total of Rs in the word, they either want to know how many in “berry” or they’re trying to trip up the model.

      • @scarabic@lemmy.world
        link
        fedilink
        English
        1
        3 months ago

        We have one that indexes all the wikis and GDocs and such at my work and it’s incredibly useful for answering questions like “who’s in charge of project 123?” or “what’s the latest update from team XYZ?”

        I even asked it to write my weekly update for MY team once and it did a fairly good job. The one thing I thought it had hallucinated turned out to be something I just hadn’t heard yet. So it was literally ahead of me at my own job.

        I get really tired of all the automatic hate over stupid bullshit like this OP. These tools have their uses. It’s very popular to shit on them. So congratulations for whatever agreeable comments your post gets. Anyway.

      • @Voyajer@lemmy.world
        link
        fedilink
        English
        14
        edit-2
        3 months ago

        This but actually. Don’t use an LLM to do things LLMs are known to not be good at. Since these are tools, the various companies would do well to list out specifically what they’re bad at, so that users don’t need background knowledge before even using them, not unlike needing to somehow know that one corner of those old iPhones was an antenna and to not bridge it.

        • @sugar_in_your_tea@sh.itjust.works
          link
          fedilink
          English
          3
          3 months ago

          Yup, the problem with that iPhone (4?) wasn’t that it sucked, but that it had limitations. You could just put a case on it and the problem goes away.

          LLMs are pretty good at a number of tasks, and they’re also pretty bad at a number of tasks. They’re pretty good at summarizing, but don’t trust the summary to be accurate, just to give you a decent idea of what something is about. They’re pretty good at generating code, just don’t trust the code to be perfect.

          You wouldn’t use a chainsaw to build a table, but it’s pretty good at making big things into small things, and cleaning up the details later with a more refined tool is the way to go.

          • snooggums
            link
            fedilink
            English
            3
            3 months ago

            They’re pretty good at summarizing, but don’t trust the summary to be accurate, just to give you a decent idea of what something is about.

            That is called being terrible at summarizing.

            • @sugar_in_your_tea@sh.itjust.works
              link
              fedilink
              English
              4
              3 months ago

              That depends on how you use it. If you need the information from an article, but don’t want to read it, I agree, an LLM is probably the wrong tool. If you have several articles and want to decide which one has the information you need, an LLM is a pretty good option.

      • @TheGrandNagus@lemmy.world
        link
        fedilink
        English
        10
        3 months ago

        I think there’s a fundamental difference between someone saying “you’re holding your phone wrong, of course you’re not getting a signal” to millions of people and someone saying “LLMs aren’t good at that task you’re asking it to perform, but they are good for XYZ.”

        If someone is using a hammer to cut down a tree, they’re going to have a bad time. A hammer is not a useful tool for that job.

    • @Prandom_returns@lemm.ee
      link
      fedilink
      English
      0
      3 months ago

      So for something you can’t objectively evaluate? Looking at Apple’s garbage generator, LLMs aren’t even good at summarising.

      • Balder
        link
        fedilink
        English
        2
        edit-2
        3 months ago

        For reference:

        AI chatbots unable to accurately summarise news, BBC finds

        the BBC asked ChatGPT, Copilot, Gemini and Perplexity to summarise 100 news stories and rated each answer. […] It found 51% of all AI answers to questions about the news were judged to have significant issues of some form. […] 19% of AI answers which cited BBC content introduced factual errors, such as incorrect factual statements, numbers and dates.

        It reminds me that I basically stopped using LLMs for any summarization after this exact thing happened to me. I realized that without reading the text, I wouldn’t be able to know whether the output has all the relevant info or if it has some made-up info.

  • @whotookkarl@lemmy.world
    link
    fedilink
    English
    43
    3 months ago

    I’ve already had more than one conversation where people quote AI as if it were a source, like quoting Google as a source. When I show them how it can sometimes lie and explain that it’s not a primary source for anything, I just get that blank stare like I have two heads.

    • @schnurrito@discuss.tchncs.de
      link
      fedilink
      English
      11
      3 months ago

      Me too. More than once on a language learning subreddit for my first language: “I asked ChatGPT whether this was correct grammar in German, it said no, but I read this counterexample”, then everyone correctly responded “why the fuck are you asking ChatGPT about this”.

  • @eggymachus@sh.itjust.works
    link
    fedilink
    English
    30
    3 months ago

    A guy is driving around the back woods of Montana and he sees a sign in front of a broken down shanty-style house: ‘Talking Dog For Sale.’

    He rings the bell and the owner appears and tells him the dog is in the backyard.

    The guy goes into the backyard and sees a nice looking Labrador Retriever sitting there.

    “You talk?” he asks.

    “Yep” the Lab replies.

    After the guy recovers from the shock of hearing a dog talk, he says, “So, what’s your story?”

    The Lab looks up and says, “Well, I discovered that I could talk when I was pretty young. I wanted to help the government, so I told the CIA. In no time at all they had me jetting from country to country, sitting in rooms with spies and world leaders, because no one figured a dog would be eavesdropping, I was one of their most valuable spies for eight years running… but the jetting around really tired me out, and I knew I wasn’t getting any younger so I decided to settle down. I signed up for a job at the airport to do some undercover security, wandering near suspicious characters and listening in. I uncovered some incredible dealings and was awarded a batch of medals. I got married, had a mess of puppies, and now I’m just retired.”

    The guy is amazed. He goes back in and asks the owner what he wants for the dog.

    “Ten dollars” the guy says.

    “Ten dollars? This dog is amazing! Why on Earth are you selling him so cheap?”

    “Because he’s a liar. He’s never been out of the yard.”

  • @Grandwolf319@sh.itjust.works
    link
    fedilink
    English
    26
    edit-2
    3 months ago

    There is an alternative reality out there where LLMs were never marketed as AI and were instead marketed as random generators.

    In that world, tech savvy people would embrace this tech instead of having to constantly educate people that it is in fact not intelligence.

    • @Static_Rocket@lemmy.world
      link
      fedilink
      English
      5
      3 months ago

      That was this reality. Very briefly. Remember AI Dungeon and the other clones that were popular prior to the mass ML marketing campaigns of the last 2 years?

  • @daniskarma@lemmy.dbzer0.com
    link
    fedilink
    English
    18
    edit-2
    3 months ago

    That happens when you do not understand what an LLM is, or what its use cases are.

    This is like not being impressed by a calculator because it cannot give a word synonym.

  • @Grabthar@lemmy.world
    link
    fedilink
    English
    14
    3 months ago

    Doc: That’s an interesting name, Mr…

    Fletch: Babar.

    Doc: Is that with one B or two?

    Fletch: One. B-A-B-A-R.

    Doc: That’s two.

    Fletch: Yeah, but not right next to each other, that’s what I thought you meant.

    Doc: Isn’t there a children’s book about an elephant named Babar?

    Fletch: Ha, ha, ha. I wouldn’t know. I don’t have any.

    Doc: No children?

    Fletch: No elephant books.

  • @dan1101@lemm.ee
    link
    fedilink
    English
    12
    3 months ago

    It’s like someone who has no formal education but has a high level of confidence and eavesdrops on a lot of random conversations.

    • @interdimensionalmeme@lemmy.ml
      link
      fedilink
      English
      1
      3 months ago

      Yes, nobody asking that question is wondering about the “straw” part of the word. They’re asking whether the “berry” part has one or two "r"s.

  • @Tgo_up@lemm.ee
    link
    fedilink
    English
    6
    3 months ago

    This is a bad example… If I ask a friend "is strawberry spelled with one or two r’s?" they would think I’m asking about the last part of the word.

    The question seems to be specifically made to trip up LLMs. I’ve never heard anyone ask how many of a certain letter are in a word. I’ve heard people ask how you spell a word and if it’s with one or two of a specific letter though.

    If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.

    • @Grandwolf319@sh.itjust.works
      link
      fedilink
      English
      2
      3 months ago

      If you think of LLMs as something with actual intelligence you’re going to be very unimpressed

      Artificial sugar is still sugar.

      Artificial intelligence implies there is intelligence in some shape or form.

      • JohnEdwa
        link
        fedilink
        English
        3
        edit-2
        3 months ago

        Something that pretends to be or looks like intelligence, but actually isn’t at all, is a perfectly valid interpretation of the word artificial: fake intelligence.

      • @corsicanguppy@lemmy.ca
        link
        fedilink
        English
        3
        3 months ago

        Artificial sugar is still sugar.

        Because it contains sucrose, fructose or glucose? Because it metabolises the same and matches the glycemic index of sugar?

        Because those are all wrong. What’s your criteria?

        • @Grandwolf319@sh.itjust.works
          link
          fedilink
          English
          2
          3 months ago

          In this example a sugar is something that is sweet.

          Another example is artificial flavours still being a flavour.

          Or like artificial light being in fact light.

      • @Tgo_up@lemm.ee
        link
        fedilink
        English
        1
        3 months ago

        Exactly. The naming of the technology would make you assume it’s intelligent. It’s not.

      • @Scubus@sh.itjust.works
        link
        fedilink
        English
        0
        3 months ago

        That’s because it wasn’t originally called AI. It was called an LLM. Techbros trying to sell it and articles wanting to fan the flames started calling it AI, and eventually it became common parlance. No one in the field seriously calls it AI; they generally save that term to refer to general AI or at least narrow AI, of which an LLM is neither.

  • Fubarberry
    link
    fedilink
    English
    6
    3 months ago

    I asked mistral/brave AI and got this response:

    How Many Rs in Strawberry

    The word “strawberry” contains three "r"s. This simple question has highlighted a limitation in large language models (LLMs), such as GPT-4 and Claude, which often incorrectly count the number of "r"s as two. The error stems from the way these models process text through a process called tokenization, where text is broken down into smaller units called tokens. These tokens do not always correspond directly to individual letters, leading to errors in counting specific letters within words.
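
    To illustrate the tokenization point with a toy sketch: the split and the integer IDs below are invented for illustration and are not the output of any real tokenizer. The idea is only that the model receives a sequence of IDs, while the letter count lives in the characters it never sees directly:

```python
# Hypothetical split for illustration only; real tokenizers may split differently.
HYPOTHETICAL_TOKENS = ["str", "aw", "berry"]

# Made-up integer IDs standing in for a vocabulary lookup.
HYPOTHETICAL_IDS = {"str": 101, "aw": 202, "berry": 303}


def model_view(tokens):
    """Return the ID sequence, which is roughly what a model is given."""
    return [HYPOTHETICAL_IDS[tok] for tok in tokens]


def letter_view(tokens, letter):
    """Rejoin the tokens and count characters, as ordinary code would."""
    return "".join(tokens).count(letter)


print(model_view(HYPOTHETICAL_TOKENS))        # the IDs carry no spelling information
print(letter_view(HYPOTHETICAL_TOKENS, "r"))  # counting needs the characters
```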

    • @jj4211@lemmy.world
      link
      fedilink
      English
      2
      3 months ago

      Yes, at some point the meme becomes the training data and the LLM doesn’t need to answer because it sees the answer all over the damn place.

  • @humorlessrepost@lemmy.world
    link
    fedilink
    English
    6
    edit-2
    3 months ago

    Works fine for me in o3-mini-high:

    Counting letters in “strawberry”

    Alright, I’m checking: the word “strawberry” is spelled S T R A W B E R R Y. Let me count the letters: S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). There are three R’s: in positions 3, 8, and 9. So, the answer is 3. Even if we ignore case, the count still holds. Therefore, there are 3 r’s in “strawberry.”
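
    For comparison, ordinary code gets this right trivially because it works on characters rather than tokens. A minimal Python sketch (the helper names are hypothetical) that mirrors the count-and-positions reasoning above:

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def letter_positions(word: str, letter: str) -> list[int]:
    """Return the 1-based positions where the letter occurs, ignoring case."""
    target = letter.lower()
    return [i for i, ch in enumerate(word.lower(), start=1) if ch == target]


print(count_letter("strawberry", "r"))      # 3
print(letter_positions("strawberry", "r"))  # [3, 8, 9]
```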

  • @Lazycog@sopuli.xyz
    link
    fedilink
    English
    3
    3 months ago

    I can already see it…

    Ad: CAN YOU SOLVE THIS IMPOSSIBLE RIDDLE THAT AI CAN’T SOLVE?!

    With OP’s image. And then it will have the following once you solve it: “congratz, send us your personal details and you’ll be added to the hall of fame at CERN Headquarters”

  • @winkly@lemmy.world
    link
    fedilink
    English
    3
    3 months ago

    How many strawberries could a strawberry bury if a strawberry could bury strawberries 🍓