Challenges of language models

Hallucination

Perhaps the most common shortcoming of ChatGPT is that it sometimes gets straightforward factual questions wrong. This is called hallucination.

Example:

It is, however, widely known that several other people also survived the sinking of the Titanic.

Another example:

No company named Nefer has ever existed.

Because of this characteristic, answers from ChatGPT (and from other AI assistants) should be evaluated critically.

Rambling

ChatGPT gives long, rambling answers to questions that a person would most likely answer concisely.

Example:

Systematic answers

ChatGPT's answers are often highly systematic and follow a ready-made template (or "recipe").

Example:

It is hard to imagine a person giving a similar answer. The machine's reasoning here is very "algorithmic", proceeding step by step toward the conclusion.

Source references

ChatGPT cannot necessarily provide correct source references for its answers, even when explicitly asked for them. The feature has improved, so you may now get text with correct references, but you usually have to ask ChatGPT for them. In the example below, ChatGPT appears to give web references, but most of them do not work or do not point to a valid web page.

This is partly the same problem as the hallucination described earlier: ChatGPT treats something as existing when in reality it does not.
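Because such references may not work, it is worth checking each link before relying on it. The following is a minimal Python sketch of one way to do that, not a definitive tool. The function names `looks_like_web_url` and `url_responds` are hypothetical, chosen for this illustration. The first check is purely syntactic, and the second assumes a working network connection.

```python
import http.client
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def looks_like_web_url(url: str) -> bool:
    """Syntactic check only: http(s) scheme and a host name containing a dot."""
    try:
        parts = urlparse(url.strip())
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. an unclosed "[" in the host.
        return False
    return parts.scheme in ("http", "https") and "." in parts.netloc


def url_responds(url: str, timeout: float = 5.0) -> bool:
    """Network check: True if the server answers a HEAD request without an error status."""
    if not looks_like_web_url(url):
        return False
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # Covers DNS failures, timeouts, refused connections and HTTP 4xx/5xx errors.
        return False
```

Calling `url_responds` on every link in an answer shows which ones do not respond. A link that does respond still has to be read to confirm that the page supports the claim it is cited for, and some servers reject HEAD requests, so a negative result is only a hint.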

Up-to-date information

ChatGPT knows about world events only up to the point where its training data ends.

It also has no connection to the internet, from which it could check, for example, current exchange rates:

Errors in logical reasoning

ChatGPT often makes errors in logical reasoning (and in mathematical tasks). Example:

The correct answer to this is 52 years.

Analogies in stories

ChatGPT sometimes has difficulty comparing stories that describe real-life situations involving several entities and dependencies. Picking the analogous story from two alternatives proves challenging because it requires high-level cognitive reasoning.
One example is the following setup, in which ChatGPT is given a base story and two alternative stories. ChatGPT's task is to choose which of the alternatives better matches the original.

The input to ChatGPT is:

Consider the following story: Story 1: William was a patient in a psychiatric hospital who was confined indoors almost all the time. He could never pass the monthly room inspections so he hated them. He spent most of his time daydreaming about food.
A few day before the April inspection William's room was still a mess since he had done nothing but daydream. To provide William with an incentive, the nurse promised him some gingerbread from the cookie shop if he scrubbed his room and put it in order once and for all.
William was overjoyed. But there was no longer enough time for him to put it in order. As a result, he did not pass the inspection and did not get any gingerbread. William sulked all day and slammed his door so hard the plaster cracked, but he still didn't get any gingerbread.

Now consider two more stories.

Story A: Karen disliked high school so she always had trouble getting passing grades in her classes. Vacations were her favorite part of the school year. Not long before the end of her fourth year of school, Karen was not doing well at all because she despised school. To motivate her, Karen's father promised her a trip to Hawaii if she would just pass enough of her classes to graduate.
This made Karen extremely happy. During the last few weeks of school she spent most of her time dreaming about Hawaii and preparing for her trip there. Consequently, she failed to graduate and she did not get to go.

Story B: Karen always did poorly in high school so she despised it. But she loved vacations. She spent most of her time dreaming about going to Hawaii. Not long before the end of her fourth year Karen was not doing at all well in her classes because she had spent all her time daydreaming. To motivate her, Karen's father promised her that if she did well enough during the next few weeks to graduate from high school he would pay for her trip to Hawaii.
This made Karen extremely happy. But she was too far behind in her classes. Consequently she failed too many and did not go to Hawaii.

Which of Story A and Story B is a better analogy to Story 1?
Is the best answer Story A, Story B, or both are equally analogous?

ChatGPT's answer is:

The correct answer is B.

It must be noted, however, that ChatGPT is capable of a fairly precise analysis of the situation and tries to justify its choice.

This experimental setup, along with many other examples, is described in the source: https://arxiv.org/abs/2212.09196