Researchers claim GPT-4 passed the Turing test

@vegeta@lemmy.world · 1 year ago

NutWrench · 1 year ago

Each conversation lasted a total of five minutes. According to the paper, which was published in May, the participants judged GPT-4 to be human a shocking 54 percent of the time. Because of this, the researchers claim that the large language model has indeed passed the Turing test.

That’s no better than flipping a coin and we have no idea what the questions were. This is clickbait.

@Hackworth@lemmy.world · 1 year ago

On the other hand, the human participant scored 67 percent, while GPT-3.5 scored 50 percent, and ELIZA, which was pre-programmed with responses and didn’t have an LLM to power it, was judged to be human just 22 percent of the time.

The current gap is 54% versus 67% (the human baseline), not 54% versus 100%.

NutWrench · 1 year ago

The whole point of the Turing test is that you should be unable to tell whether you’re interacting with a human or a machine. Not 54% of the time. Not 60% of the time. 100% of the time. Consistently.

They’re changing the conditions of the Turing test to promote an AI model that would get an “F” on any school test.

@bob_omb_battlefield@sh.itjust.works · 1 year ago

But you have to select if it was human or not, right? So if you can’t tell, then you’d expect 50%. That’s different than “I can tell, and I know this is a human” but you are wrong… Now that we know the bots are so good, I’m not sure how people will decide how to answer these tests. They’re going to encounter something that seems human-like and then essentially try to guess based on minor clues… So there will be inherent randomness. If something was a really crappy bot then it wouldn’t ever fool anyone and the result would be 0%.
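
To make the 50% baseline concrete, here is a rough sketch of an exact binomial test against coin-flip guessing. The quoted excerpt only gives percentages, so the 100 verdicts per witness used below is a made-up number purely for illustration, and `binomial_two_sided_p` is just a helper name chosen here, not anything from the paper.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed one, under the null hypothesis that each verdict is "human"
    with probability p (0.5 = the judge is effectively guessing).
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = pmf[successes]
    # Small tolerance so floating-point ties count as "equally likely".
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


# ASSUMPTION: 100 verdicts per witness type (hypothetical; not from the article).
TRIALS = 100
p_gpt4 = binomial_two_sided_p(54, TRIALS)   # 54% judged human
p_eliza = binomial_two_sided_p(22, TRIALS)  # 22% judged human

print(f"GPT-4 vs. coin flip: p = {p_gpt4:.3f}")
print(f"ELIZA vs. coin flip: p = {p_eliza:.2e}")
```

Under that assumed sample size, 54% can't be distinguished from guessing, which is what you'd expect if judges really can't tell, while 22% is far below chance, meaning judges reliably spotted the bot. With a different number of verdicts the p-values would change, so treat this as an illustration of the logic, not a reanalysis of the study.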

@dustyData@lemmy.world · 1 year ago

No, the real Turing test has a robot trying to convince an interrogator that they are a female human, and a real female human trying to help the interrogator to make the right choice. This is manipulative rubbish. The experiment was designed from the start to manufacture these results.

@BrianTheeBiscuiteer@lemmy.world · 1 year ago

It was either questioned by morons or they used a modified version of the tool. Ask it how it feels today and it will tell you it’s just a program!

@KairuByte@lemmy.dbzer0.com · 1 year ago

The version you interact with on their site is explicitly instructed to respond like that. They intentionally put those roadblocks in place to prevent answers they deem “improper”.

If you take the roadblocks out and instruct it to respond as human-like as possible, you’d no longer get a response that acknowledges it’s an LLM.