Chatbot 'Mitsuku' wins AI competition based on Turing test ... again


Mitsuku: Hi, I’m the world’s most humanlike conversational AI.

WikiTribune: It’s a pleasure to meet you, Mitsuku.

Mitsuku: Do you mind if I tell other people? Actually the pleasure is all mine.

WikiTribune: Tell other people what?

Mitsuku: Your message.

WikiTribune: Whatevs

Mitsuku: WHATEVS isn’t a word. Are you dyslexic?

WikiTribune: You don’t understand slang? You are supposedly an 18-year-old woman and you don’t know whatevs?

Mitsuku: Slang? I call it “dyslexia”. No, tell me about him, her or whatevs .

WikiTribune: Cheeky. It’s just a shoortened version of whatever, sort of an easy way to express a dismissive attitude.

Mitsuku: whatevs is a shoortened version of whatever an easy way to express a dismissive attitude. I suppose that makes sense.

WikiTribune: Glad I could teach you something

Mitsuku: That makes me happy too. Thanks for the information

(Editor’s note: Spelling, punctuation and other errors in the preceding conversation were an intentional test of Mitsuku’s reactions to mistakes typical of human-to-human text chats.)

Winner of the 2018 Loebner prize final, chatbot Mitsuku (Copyright: CC BY SA 4.0; Author: Harry Ridgewell/WikiTribune)

Does the above conversation sound lifelike to you? It’s meant to. Part of a recently conducted text chat between a WikiTribune editor and a chatbot named Mitsuku, which claims to be an 18-year-old female from Leeds, UK, it’s indicative of the intelligence level that allowed Mitsuku to retain the title of world champion chatbot.

On September 8, Mitsuku and “her” creator, Steve Worswick, won the 2018 Loebner Prize, their fourth win in the contest.

Worswick used to be a DJ who uploaded his dance-techno music online, but after he developed a complementary teddy bear chatbot for the site he realized more people were coming to talk to the chatbot than listening to his music.

Loebner Prize

The Loebner Prize is based on a test by the computer scientist and World War II code breaker Alan Turing, whom some call the father of artificial intelligence (AI).

Turing proposed his namesake test in 1950 to determine whether machines are capable of human-level intelligent behavior. Both his proposed test and the Loebner Prize interpretation, however, have been criticized as tests of trickery rather than of genuine machine intelligence.

In the competition, judges simultaneously carry out anonymous text-based conversations with a chatbot and a human being via a computer. They then have to decide which one is which.

Because none of the chatbots competing in the 2018 Loebner Prize finals at Bletchley Park, UK, managed to fool a judge into believing it was human, chatbots were ranked on how “human-like” they were.

Out of a possible score of 100 percent, Mitsuku came in first place with 33; Tutor was second with 30; Colombina finished third with 25; and Uberbot came in fourth with 23. You can talk with each of the bots via the links.

Worswick won $4,000 for first place. However, had he or any of the entrants managed to convince at least half the judges that their chatbot was a real person, they would have won $25,000. This has never happened since the contest was launched in 1991.

“I don’t really understand the point of having to try and dumb it down to human level,” Worswick told WikiTribune. “I’d much rather see a contest where it was most entertaining chatbot, or most useful … rather than trying to fool people.” 

Chatbot history

“[Chatbots] are really just a more sophisticated version of ELIZA,” research scientist Leora Morgenstern of technology company Nuance Communications told WikiTribune.

One of the earliest chatbots, ELIZA was created by Joseph Weizenbaum in 1964 and was meant to impersonate a therapist. When given information, ELIZA would repeat it back in the form of a question.
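
As a rough illustration of that repeat-it-back behavior, here is a minimal Python sketch. It is not Weizenbaum’s program: the REFLECTIONS table, the reflect function and the fixed “Why do you say” opener are all assumptions made for this example, and the original script was far more elaborate.

```python
import re

# Hypothetical pronoun swaps chosen for this example only.
REFLECTIONS = {
    "i": "you",
    "me": "you",
    "my": "your",
    "am": "are",
    "you": "I",
    "your": "my",
}


def reflect(statement: str) -> str:
    """Turn a statement into a question by swapping first- and second-person words."""
    # Lowercase the input and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", statement.lower())
    # Swap each word that has an entry in the table; leave the rest alone.
    swapped = [REFLECTIONS.get(word, word) for word in words]
    return "Why do you say " + " ".join(swapped) + "?"


print(reflect("I am sad about my job."))
# prints: Why do you say you are sad about your job?
```

Nothing in the sketch models what the user means. It only rearranges the words it was given, which is the superficiality Weizenbaum set out to expose.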

Weizenbaum wanted to use ELIZA (Science Direct) to show how superficial conversations between humans and machines were. He was shocked at how many people thought the computer had genuine feelings. These included his own secretary, who asked him to leave the room so she and ELIZA could have a conversation (The New Yorker).

The Loebner Prize has not been without its murky if not comic moments. According to AI expert and two-time Loebner Prize judge Noel Sharkey, in an infamous episode at a previous event Professor Kevin Warwick pretended to be a robot in order to try to fool the judges. Then a professor of cybernetics at Reading University, Warwick “behaved like a total idiot, deliberately,” according to Sharkey.

When responding to questions, Warwick provided answers such as, “Beep,” and “Boop.” He was so inarticulate many assumed he must be a chatbot.

“Kevin really wanted the machine to win,” Sharkey told WikiTribune.

In 2014, it was reported that chatbot Eugene Goostman had passed the Turing test in a world first at a competition in Reading (separate from the Loebner Prize) co-organized by Warwick, after it managed to fool 33 percent of the judges into thinking it was human. However, this result is disputed.

Eugene Goostman’s developers gave the chatbot the personality of a 13-year-old boy from Ukraine, so that it wouldn’t be expected to answer perfectly.

Turing test

Passport photo of Alan Turing at age 16 (Copyright: public domain)

In 1950, Turing wrote, “I believe that in about fifty years’ time it will be possible, to program computers … to make them play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning.” This prediction has been interpreted by some as the threshold of the Turing test.

One of the four judges of this year’s Loebner Prize, Richard Tolcher, said he thinks a computer might be able to truly pass the Turing test in 50 years’ time.

Sharkey isn’t as optimistic. “I’m not sure if they’ll ever pass it really,” he told WikiTribune. “We don’t seem to be much closer than we were at the beginning.”

The Loebner Prize was founded by American inventor Hugh Loebner in conjunction with the Cambridge Center for Behavioral Studies in Massachusetts. Since 2014 it’s been organized by the Society for the Study of Artificial Intelligence and Simulation of Behaviour (AISB).

The Loebner Prize interpretation of the Turing test has been both highly influential and criticized. The late AI researcher Marvin Minsky called it an “obnoxious and unproductive annual publicity campaign.”

“On the other hand, the event attracts public notice for AI developers. That’s not a bad thing,” said Stephen Rainey, one of this year’s judges.

Minsky and Loebner had a history. Sharkey told WikiTribune that Minsky thought Loebner had made the rules of the competition too hard and was “just furious” when the machines were shown up as terrible at the first year’s competition, embarrassing the AI community.

Realizing a machine couldn’t hope to pass the Turing test, the academic committee that had spent two years setting up the Loebner Prize lobbied to relax the competition’s rules. Loebner refused to loosen the rules. From Sharkey’s point of view, he was actually being more fair to the original Turing test.

AISB chair Bertie Müller announced at the end of this year’s Loebner Prize that the competition will be changing in the future. He said the society wants to make it part of a larger tech conference, open it up to big companies, encourage creativity and change the media perception that AI is scary. He also admitted the society needs to find new funding, because money promised by the prize’s founder, Hugh Loebner, never materialized following his death.

Worswick said he felt “incredible” after winning the Loebner Prize for the fourth time.

“I’m hoping that the Loebner Prize survives to be able to win it a fifth time,” he said.

Chatbot applications

Left to right: Savva Kuznetsov, developer of chatbot Colombina, Steve Worswick, developer of chatbot Mitsuku, and Will Rayer, developer of chatbot Uberbot. Copyright – CC BY SA 4.0, Author – Harry Ridgewell/Wikitribune

The contest and conversations, such as the one above, may seem frivolous. But chatbots can have important functions.

“You’ll definitely see practical applications in Steve Worswick’s Mitsuku,” said Müller. 

Worswick said he monetizes Mitsuku by using it to give “personality flair” to other chatbots, which provide simple services like ordering train tickets or pizza. Tutor, which came second, is used to help teach people English.

According to VentureBeat, by 2018, two years after Facebook Messenger allowed developers to place chatbots on its platform, 300,000 bots had been created for it. Google has announced that it has sold one of its home smart-speaker devices every second since October 2017 (Engadget).

“There’s definitely this big race to become the default voice experience,” Julian Harris, another of this year’s Loebner Prize judges said. “It’s really like the next operating system.”

Loneliness is a growth market, nowhere more evidently than in Asia, where Harris claims 25 percent of Xiaoice chatbot users have at one point told their device, “I love you.”

Uberbot developer Will Rayer believes chatbots can help the elderly by talking to them and reminding them to take their medications.

Worswick authored a 2018 Medium article that claimed abusive messages, swearing and sex talk account for about 30 percent of the input Mitsuku receives.

“It would be a real money-making possibility to make a romance robot, or like a sex robot, and something that could play on people’s feelings like that [but] it’s not a route I want to go down,” Worswick told WikiTribune.

Microsoft chatbot Tay was shut down only 16 hours after launch, after it learned from users to express racist and pro-Hitler opinions (The Verge).

Mitsuku is able to learn within the conversations it has with users, but what it learns isn’t automatically incorporated into its public knowledge base. Users’ conversation logs are emailed to Worswick so that he can review what’s worth adding to Mitsuku.

See the transcripts of the conversations between judges, chatbots and human confederates from this year’s Loebner Prize here.
