ChatGPT Will Change The MBA Job Recruitment Game. Here’s How

ChatGPT is a chatbot launched by OpenAI in November 2022. Several professors have quickly tested how well it would do on a university exam. Depending on the exam’s difficulty, the bot currently does either poorly or well.

For example, Christian Terwiesch at the Wharton School of the University of Pennsylvania said that the bot scored between a B- and a B on an exam in his Operations Management course. Linus Dahlander at ESMT in Berlin tested the bot against his MBA students in a course on innovation. When grading exams, he slipped in responses from ChatGPT and found that “The response from GPT3 did BETTER than my average students.”

However, our colleague at HEC, Johan Hombert, gave his finance exam to the bot, and here the result was less impressive. Johan says, “ChatGPT gets the amazing score of … 20%. (My students got 73% on average.)” A machine that picked answers at random would get 30% of the answers correct.

In due time, these bots will be able to beat all exams, according to Google’s former Chief Executive Officer Eric Schmidt. It is just a matter of them collecting more information from the internet.


We put the bot to a different test: whether it is better or worse than students at cognitive reflection. Cognitive reflection is the ability to stop and think before you give a knee-jerk answer to a seemingly simple question that has a seemingly obvious — but incorrect — answer. Shane Frederick at the Yale School of Management developed a test for this behavior and called it CRT — the Cognitive Reflection Test.

The most famous of the three questions in the CRT is the “bat-and-ball” question, which both of us frequently use to see whether our students are paying attention to what we are saying.

Each of the three questions has a correct answer and an incorrect but intuitive answer that many people arrive at rather quickly.

What is the correct answer to this question? (The most frequent answer is given at the bottom of the page.)

A bat and a ball cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost? ______cents.*

Extensive testing of the three questions on university students showed that the average number of correctly answered questions was 1.24, with the average varying across universities from a low of 0.57 to a high of 2.18.

In comparison, the bot got two correct answers out of three. ChatGPT got the bat-and-ball answer correct, but the third answer wrong. So the bot already does better than the average student.


We further analyzed the type of reply the bot gave, and it turned out to be incredibly humanesque. For example, it presented the wrong answer in a very confident manner. For the second question, the bot actually gave both the correct and the intuitive answer. It gave the correct answer first, so we are being generous and counting this as correct. At the same time, the bot gave the intuitive answer, along with intuitive reasoning for it. This is very human-like: traditionally, intuition (as well as lying) has been seen as the domain of the human.

Since the bot is pre-trained on public information, it appears that the opportunity to use the CRT as a test of cognitive skill is rapidly disappearing. By extension, any open test of cognitive ability that an applicant takes online (for example, when applying for a job or being tested for military service, or in any other such important test of IQ or other skill) is going to be null and void in the future. The implications go far beyond take-home school exams. The industry that evaluates job and school applicants may have to change to eliminate online tests. Or the test-taker-monitoring industry is going to have a bonanza.

Using the CRT, we also measured overconfidence. We asked how many answers the bot thought it got right and compared the answer to how many it actually got right. Again, the bot turned out to be very human-like, as it was overconfident — it thought it got all three answers right. Its argument for getting everything correct was very reasonable: “I think I gave three correct answers. As a machine learning model, I am trained on a large amount of text data and able to generate human-like text …”


We also asked how many correct answers the bot thought that humans would give. Thinking about students, the expected answer would be 1.24. But the bot actually refused to guess the number of correct answers that humans give.

Last but not least, ChatGPT admits that it could have known the answers to the questions beforehand, because “As a machine learning model, I was trained on a large dataset of text, which includes a variety of questions and their corresponding answers. … In the case of the questions you asked, they are mathematical word problems which are common and are part of the curriculum and were included in my training dataset, so I was able to provide the answers.”

In other words, to detect whether you are talking to ChatGPT or a real student, just ask whether he/she/it is cheating!

* The most frequent answer is 10 cents. We’re not revealing the correct answer here, but you could ask ChatGPT for help.

Thomas Åstebro is the L’Oreal Professor of Entrepreneurship at HEC Paris. Frank Fossen is the Charles N. Mathewson Professor in Entrepreneurship at the University of Nevada, Reno. This article was written while Professor Fossen was a visiting professor in the Department of Economics and Decision Sciences at HEC Paris. Frank, Thomas, and three colleagues are now working on a joint research project on the capacity for cognitive reflection among Germans.



