Last month I published a list of things I would change about the TOEFL to make it better. Since I am neither an assessment expert nor a linguist, the list focused mainly on everything except the content of the actual test. As I promised at that time, I sent the list to a few other teachers and asked what they would change about the test. Today I’m happy to share those comments.  At the end of today’s blog post you’ll find a few more of my own ideas.

Kathy Spratt, author of Mastering the Reading Section for the TOEFL iBT, suggested:

  • Be more vigilant about the quality of test centers. I have seen test centers in which the seats were separated by a sheet of cardboard, which is hardly soundproof.
  • Have some appointments available in the evening. Many students work during the day and would benefit from a test that starts at six or seven in the evening.

Jane Birkenhead of Birkenhead English feels a lot like I do!  She suggested:

  • Increase the number of full ‘paid for’ practice tests. And update them frequently. It’s ridiculous that there are only 4 and the same ones have been there for years. 
  • Make the ETS TOEFL website easier to navigate. There’s no logic to it. There’s actually some decent free practice advice on there but hardly anyone can find it.
  • Rewrite the scoring rubrics using language that TOEFL students can actually understand. They take some unraveling right now.
  • Put the SpeechRater software on the ETS website so students (and teachers!) can practice. They’ve made it available on the EdAgree website, but it’s a great tool for all TOEFL students, so why hide it away on some obscure website that no one knows about?
  • Stop scoring speaking responses out of 4 and essays out of 5. Either score everything out of 30 and take the average, or score speaking responses out of 7.5 and essays out of 15. Then we can do away with the mystery of score conversion charts.

I wish I was as thoughtful as Jane.  But a few more ideas do come to me now:

  • Bring back the detailed score reports.  Students used to get separate “levels” for each writing task, and for each pair of speaking tasks.  That removed a bit of the mystery from the reports.  Now students just get a single overall writing level and a single overall speaking level.  Many students are puzzled about which specific questions are bringing down their scores.
  • For the home edition of the test, remove the instructions that say to “put on the headset.”  Because, of course, if students do that their test will be immediately cancelled. Surely this would take just 30 seconds of work at ETS headquarters.
  • Staff up the Office of Testing Integrity. Get those tests confirmed as fast as possible!

Well, I took on some outside work this month and didn’t have time for anything on July’s to-do list, but I always have time for the least popular part of this blog – the monthly “you should read more” article!

This month I read the April 24 issue of “Science News.”  As always, the magazine contained a ton of great articles that resemble the various reading (and listening) tasks that appear on the TOEFL.  There were a few standouts this month:

Next I read the July issue of “History Today.”  Articles about history are really common in the reading section of the test… and not just articles about “early” human history.  Most of the content from this magazine is behind a paywall, but a few great articles are available online:

  • China’s First International Students discusses a group of young Chinese children sent to study abroad in 1872.  It’s a fascinating story. They were pulled back by the regime earlier than planned, but many of them played important roles in the development of the country upon their return.
  • Baby Boom or Bust compares today’s low birth rates to the history of France from the 19th to the mid-20th century.

I also read the May issue of National Geographic.  This was the best issue of NatGeo in a long time.  Here’s what caught my eye:

  • The Conservation Popularity Contest could form the basis of a type 1 writing question.  I imagine a reading about the problem of ugly endangered species being ignored, and the lecture suggesting solutions to this problem.
  • There is a tiny little space-filler about the hummingbird being a “surrogate species.”  That would make a perfect type 3 speaking question!  I can’t find a link to the little article online, but here is a little article from the US Fish and Wildlife Service.
  • One of the long feature articles this month is about saving coral reefs. That could certainly form the basis of a problem/solution writing question as well.

Finally, I read the Summer 2021 issue of Modern Cat Magazine. You had better believe it. I liked:

  • The Evolution of the Social Feline.  I think I will submit my foster cat for Modern Cat’s “Cat of the Week” award.  I hope you’ll all vote for it if I post a link here.

I also read some books that aren’t worth mentioning here, but I will mention the Penguin Classics collection of journalist Nellie Bly’s work.  It’s titled “Around the World in Seventy-Two Days and Other Writings.”  Bly was a pioneering New York journalist in the late 19th and early 20th centuries.  She was noted for her “stunt reporting,” including how she got herself committed to a mental hospital in 1887 to secretly investigate the conditions there, and her record-breaking trip around the world in 1890.

That’s all for now.  More recommendations next month.

Updated: March 5, 2023

I’ve been getting a lot of reports about TOEFL scores being put “on hold” lately.  These reports are mostly from students who took the TOEFL iBT Home Edition, but sometimes it affects people who took the test at a test center.

When this happens, your TOEFL account says something like “Tested – Scores on Hold.”  This is also called “administrative review.”

This usually happens because ProctorU or Wheebox or ETS detected something abnormal, and the test needs to be reviewed by an expert.  They might think that you did something inappropriate.  Or there might have been a technical problem during the test.  Usually the review process takes 2-4 weeks. After that your scores are reported or your scores are cancelled.

At the beginning of the process, ETS usually sends an email to the test-taker that tells them to wait 2-4 weeks.

To talk to someone at ETS you should contact the TOEFL Office of Testing Security.  You can call them at the following numbers:

  • 1-800-750-6991 (in the USA and Canada)
  • +1-609-406-5430 (all other locations)

They will answer the phone from 7:30 AM to 5:30 PM Eastern Standard Time, Monday through Friday. Usually they will tell you to wait ten days and call back again, but sometimes calling speeds up the review.

You can also email them, but that might take longer. Their email address is:  or maybe:

You can also contact them using this form.

I do not recommend using the regular TOEFL customer support phone number for this problem.

Update:  Here’s a copy of the email that ETS sends when this happens.

Dear XX,

At ETS, we are highly committed to the quality and fairness of our tests. We go to great lengths to make sure that every score is accurate and valid. As part of this process, sometimes we take additional quality control steps before scores are released.

For these reasons, your TOEFL scores from the XX/XX/20XX test administration are delayed because they are under administrative review. Most of these routine reviews are completed in 2-4 weeks. In rare cases, the review may take longer. These reviews are necessary to ensure that the results are accurate and valid.

At the conclusion of the review, you will be notified of the status of your scores.  If they are released, your scores will be reported to you and to any institutions or agencies you have designated to receive them.

If you have not been notified after four weeks, you can call to inquire about your scores at 1-609-406-5430 or 1-800-750-6991, 7:30 a.m – 5:30 p.m. U.S Eastern time, Monday through Friday, or email us at

Office of Testing Security
Rosedale Road
Princeton, NJ 08541

Remember:  I’m not an employee of ETS. I’m just a guy on the Internet.



A few weeks ago I wrote about a contest EdAgree is running, where you can win a voucher to take the TOEFL test for free. I’m happy to report now that the contest has been improved.  Starting right away, winners will be able to pick from a TOEFL voucher, an IELTS voucher or a Duolingo English Test voucher!

Enter the contest here.

Winners will be selected in the first week of every month this year, starting in August.  You just need to submit your email address and a few details about your educational background to sign up.  All of my personal students have entered the contest already.

EdAgree is a unique new organization.  It is wholly owned by ETS, but operates independently.  EdAgree’s goal is to provide support for international students from their initial decision to study abroad right through to the end of their studies.  They provide this support in a variety of ways, like 1:1 counselling, reviewing application documents, helping with the visa process, and so on.  It’s a really neat concept.

I’m going to write a bit more about some of their tools (I’ve already touched on the SpeechRater tool they offer) in a few days.  In the meantime, take a moment to check them out for yourself and enter the contest.

Here’s an interesting story. British Council has sold its stake in IELTS in India to IDP. As you probably know, IELTS is a partnership between British Council (non-profit), IDP (for-profit) and Cambridge University.  British Council netted 130 million GBP (180 million USD).

Some people are unhappy.  Others, presumably, are quite happy.

Note that British Council has sold only the operations in India.  I guess it still retains its stake in the test outside of India.


I often hear things like this:

I’ve been speaking English for twenty years, and I only got a 78 on the TOEFL.  The test is bullsh–t.

And things like this:

My crazy uncle Bob is a native speaker, and he only got a 22 in the reading section.  The test is obviously unfair.

But here’s the thing.  The TOEFL is not just an English test.  I know, “TOEFL” is supposed to stand for “Test of English as a Foreign Language.”  But check the ETS website.  They don’t use that name anymore.  You won’t find it in the Official Guide anymore.  These days, TOEFL doesn’t stand for anything.  It’s just the “TOEFL Test.” 

Even though Uncle Bob is a native speaker, it is likely that he’s incapable of functioning in an academic environment.  Indeed, most native speakers aren’t.  I suspect if you pull 100 random people off the street in the USA and give them the same TOEFL reading set, their average score will be quite low.

Keep that in mind before you get frustrated by your TOEFL score.  The score is meant to predict your performance in an English-medium university.  That’s it.  It isn’t a test of how well you communicate in English in general.  

This is what test designers refer to as “validity.”  They argue that you can’t just toss a bunch of grammar and vocabulary questions on a test (like the original TOEFL) and expect the scores to be useful for any real purpose.  The questions need to have a connection to the users of test scores, and be valid for their intended purpose.

In 2008 ETS published a 370-page book to make this case (“Building a Validity Argument for the Test of English as a Foreign Language,” Carol A. Chapelle et al.).  Ask your TOEFL teacher if they have a copy.

In a more recent book, Chapelle says:

Test developers, researchers, and anyone responsible for assessing human capacities would readily agree that validity is their central concern.  Similarly, teachers, employers, students, parents and researchers want the tests they use to be valid, and they expect professionals in educational and psychological testing to know how to evaluate a test’s validity. (“Argument-Based Validation in Testing and Assessment,” Carol A. Chapelle)

This is why the TOEFL is really hard, and this is why the TOEFL is really popular with university administrators.

And this sort of thing is why there are a whole bunch of different tests.  The IELTS General Test is intended to assess language skills needed by immigrants and people in non-academic training.  The TOEIC test is meant to assess the potential of people to function in business settings.

Heck, ETS seems to be all-in on this concept.  They recently purchased Pipplet, which makes boutique language tests, including tests for customer service agents, consultants, retail workers, medical professionals, startup employees… and more!  Some day they’ll have a different test for everyone!

What Does This Mean for Students?

It means that you need to prepare for the TOEFL.  Don’t just count on your amazing English skills. You can’t just prepare for a couple of days and expect an amazing score.  You need to study. 

The Future

Some people think this is an old-fashioned idea.  Maybe they are right. Maybe language tests won’t be so hung up on this conception of validity in the future.

In fact, according to the IPO documents recently published by Duolingo, that company wants its “Duolingo English Test” to be used for university admission, immigration AND workforce placement!  The idea of an identical test for all three purposes seems antithetical to the above definition of validity.  This sort of idea is why the Duolingo folks seem to be able to get a rise out of ETS in a way that the IELTS and Pearson people cannot.

A couple of years ago I recommended that all of my teacher friends invest in the Duolingo IPO, as I thought the company’s little-known English test would hit the mainstream in four or five years.  Sadly, the cat is out of the bag on that front, and I am not sure the company is a great buy.  

Anyway, there are a few fun details about the Duolingo English Test to be found in their IPO documents, filed with the SEC just a few days ago.  Here’s what caught my eye:

  • In 2020, the test was taken about 344,000 times for about 15 million dollars in revenue. 
  • In 2020, 10% of the company’s revenue came from the test.  That reached 11% at the beginning of 2021.
  • The company hopes to extend the test to the immigration and workforce testing sectors.
  • Duolingo, as a company, lost 15.8 million dollars in 2020.  I can only imagine how frustrated ETS feels having to compete with a company that doesn’t really need to make money.
  • Duolingo expects schools to continue accepting the test after the pandemic ends.
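
If you want to play with those figures yourself, here is a rough back-of-the-envelope sketch in Python. It uses only the rounded numbers quoted above, so treat the results as ballpark estimates. The variable names are my own and don't come from the IPO documents.

```python
# Rounded figures quoted above from the IPO documents.
tests_taken_2020 = 344_000        # "about 344,000 times"
test_revenue_2020 = 15_000_000    # "about 15 million dollars"
test_share_of_revenue = 0.10      # "10% of the company's revenue"

# Approximate revenue per test sitting.
revenue_per_test = test_revenue_2020 / tests_taken_2020

# Total company revenue implied by the 10% share (an estimate, not a reported figure).
implied_total_revenue = test_revenue_2020 / test_share_of_revenue

print(f"Revenue per test: about ${revenue_per_test:.2f}")
print(f"Implied total revenue: about ${implied_total_revenue / 1e6:.0f} million")
```

With these rounded inputs, the test brings in roughly $44 per sitting. Swap in more precise figures if you have them.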

Well, I reported a few days ago on the impressive increase to the mean TOEFL score found in the data released by ETS.  I expressed some puzzlement at the increase, as it is pretty huge.  I’m still not entirely certain why it happened, but after talking it out with some experts, my conclusions are:

  1. The change is mostly due to the shorter test.  I guess the shorter version is “easier.” While the reported mean score did not change in 2019, that was partly because of rounding, and a drop in the mean writing score. If we look carefully there were fractional increases in 2019 which hint at a trend.
  2. ETS may have adjusted the e-rater which scores essays. That’s a normal thing. I think they are on iteration 19 or something like that. I suspect that caused writing scores to increase. That makes up 25% of the overall increase… but the shorter test should have no effect on it.  Perhaps they wanted to address the long-term drop in average writing scores.
  3. The increase is caused in large part by China (presumably the number one TOEFL market) and Korea (presumably the number two TOEFL market). Increases in the mean score probably reflect advances in preparation techniques in those countries. Coincidentally I spent the month before the score data release reporting on those advances.

Let me know what you think in the comments.

TOEFL score data for 2020 is available!  As regular readers of the blog will know, this is my favorite day of the year!  You can download your copy from ETS.

Scores are way up this year.  I don’t know why.

The overall mean (average) score is now 87.  That is an increase of four points, which is quite a big jump.  Here’s the history of the average TOEFL score:

  • 2006: 79
  • 2007: 78
  • 2008: 79
  • 2009: 79
  • 2010: 80
  • 2011 (not available)
  • 2012 (not available)
  • 2013: 81
  • 2014: 80
  • 2015: 81
  • 2016: 82
  • 2017: 82
  • 2018: 83 
  • 2019: 83
  • 2020: 87

As you can see, it took thirteen years for the average score to increase from 79 to 83.  That jump was replicated in 2020 alone.
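
If you'd like to check that comparison, here is a small Python sketch that does the subtraction using the yearly means listed above. The dictionary and the `change` helper are just my own illustration, and 2011 and 2012 are left out because those figures aren't available.

```python
# Mean TOEFL scores by year, copied from the list above.
# 2011 and 2012 are omitted because they are not available.
mean_scores = {
    2006: 79, 2007: 78, 2008: 79, 2009: 79, 2010: 80,
    2013: 81, 2014: 80, 2015: 81, 2016: 82, 2017: 82,
    2018: 83, 2019: 83, 2020: 87,
}

def change(start_year, end_year):
    """Return the change in the mean score between two listed years."""
    return mean_scores[end_year] - mean_scores[start_year]

print("2006 to 2019:", change(2006, 2019))
print("2019 to 2020:", change(2019, 2020))
```

Both lines print a four-point change, which is the comparison I'm making above.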

Obviously this year there are also large jumps in the section scores:

  • The mean reading score is now 22.2 (+1.0)
  • The mean listening score is now 22.3 (+1.4)
  • The mean speaking score is now 21.2 (+.6)
  • The mean writing score is now 21.5 (+1.0)

Last year, the section score changes were much smaller. They were (respectively): +.4, +.3, +.1, -.2.

The jumps in 2020 alone are comparable to the jumps I recorded in the nine years from 2010 to 2019.
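
Here is a quick Python sketch for anyone who wants to see how the section numbers add up. It uses only the 2020 section means and changes listed above. The names are my own, and since the published figures are rounded, the totals are approximate.

```python
# 2020 section means and their year-over-year changes, from the list above.
sections = {
    "reading":   (22.2, 1.0),
    "listening": (22.3, 1.4),
    "speaking":  (21.2, 0.6),
    "writing":   (21.5, 1.0),
}

# Sum of the four section means, and the sum of the four changes.
total_2020 = sum(mean for mean, _ in sections.values())
total_change = sum(delta for _, delta in sections.values())

# Each section's share of the combined increase.
share_of_increase = {
    name: delta / total_change for name, (_, delta) in sections.items()
}

print(f"Sum of section means: {total_2020:.1f}")
print(f"Sum of section changes: {total_change:.1f}")
for name, share in share_of_increase.items():
    print(f"{name}: {share:.0%} of the increase")
```

The section changes add up to about four points, which lines up with the jump in the overall mean.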

As you guys know, I like to study geographic trends, particularly those in China, Korea and Japan.  Here’s what I spotted:

  • The mean score in Korea is now 86 (+3)
  • The mean score in China is now 87 (+6) !!!
  • The mean score in Japan is now 73 (+1)
  • The mean score in Taiwan is now 85 (+2)

I must point out that in the thirteen years between 2006 and 2019 the average score in China increased by five points.  In 2020 alone the increase was six points.

Scores in other key markets have increased as well:

  • The mean score in Brazil is now 90 (+3)
  • The mean score in India is now 96 (+1)
  • The mean score in the United States is now 93 (+2)

The top-performing country this year is Austria, with an average score of 102 (+2).

It appears that China is driving much of the overall increase.  In case you are curious, the section increases there are: Reading +2, Listening +2, Speaking no change, Writing +2.

I’m going to do some more digging and some more calling in the weeks ahead.  I want to know more about these dramatic changes.

