ChatGPT-4 Crushed the LSAT

Daanish Bhatti
3 min read · May 6, 2023

I assess how ChatGPT-4 performed better than the average LSAT test taker but fell short compared to the nation's top performers.

ChatGPT-4 scored 163 on the LSAT.

The AI model scored just below the top 10% of test takers: the people who get into top law schools like Harvard, Yale, and UPenn.

Photo by Sebastian Pichler on Unsplash

The scores were reported in the GPT-4 technical report from OpenAI, the company behind ChatGPT.

It's certainly a result the company can brag about: the average human scores around 151.

I, a human, am currently studying for the LSAT and it’s pretty tough. My diagnostic was exactly the average — a 151.

I thought I would do some research into GPT-4's performance to see if I could pick up any tips for improving my own score.

What GPT-4 crushes

Geoffrey A. Fowler, a columnist for the Washington Post, fed some logical reasoning questions from the LSAT to GPT-4.

Logical Reasoning questions present short passages containing arguments and ask the test taker to analyze and evaluate them, then select the best answer choice.

Or, as Fowler puts it more simply, they're essentially multiple-choice brain teasers that tell you a whole bunch of different facts and ask you to sort them out.

GPT-4 got 9 out of 10 correct.

GPT-4 is quite good at organizing information, even when it's presented in a confusing manner, which is exactly what the LSAT is built to test.

Where GPT-4 underperforms

Truthfully, there isn't much information online about how GPT-4 manages to do so well on the LSAT. The majority of articles simply report the score and how it compares to the average human's.

Blueprint Prep, a leading LSAT test-preparation and professional training company, independently analyzed GPT-4's performance on the test.

Although 163 is a good score, the company was more critical of the model's performance.

They stated that GPT-4 struggled with the test itself, finding consistent errors in its ability to apply logical and critical reasoning and to differentiate between necessary and unnecessary material.

The LSAT includes purposefully confusing information that makes questions hard to answer. This seems to trip up humans and robots alike.

Blueprint Prep is a bit biased, though. They emphasized that top test takers learn the specifics of the test from top tutors and prep companies, which is what allows them to perform well. Although that's true, inserting such a claim benefits Blueprint Prep's bottom line as a test-prep company.

That’s It

GPT-4 does well on the LSAT because of its ability to organize information and answer a set of questions at a high level.

The model underperforms on the specifics of the test itself. High performers are able to think like the test makers, and they get there through private tutoring, prep courses, or study booklets.

In the future, I'm sure the model could become an even higher performer. For example, someone could train it on a set of inputs that teaches it to think like the optimal test taker.

Time will tell. But, I wouldn’t be surprised if it’s in the next couple of months.
