Here are a few stray details I picked up from my participation in some recent ETS events. They aren’t really important, but you might find them interesting.
The SpeechRater checks everything, including vocabulary and grammar. My earlier impression that it only checks delivery was incorrect.
Confirmed: there is just one human rater alongside the SpeechRater for each task.
The human score and the SpeechRater score have equal weight. They are averaged. If there is a major difference between them, however, a second human rater will check the answer.
SpeechRater generates task-specific scores, rather than scoring everything collectively.
Important: When a score review is requested, the SpeechRater is not used.
Interestingly, an ETS person told me that in the past there was only ever a single human rater for each task. This is the total opposite of what ETS told me previously, which is that there were always two human raters for each task. I guess it doesn’t matter now.
They have heard your complaint about the less-detailed Score Reports, and are considering how to provide more detailed information. Yay!
They are working on a new Official Guide, but there is no timeline for publication.
I’ll have to update my video about the Speaking Section changes, I think. Stay tuned.
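To make the averaging detail above concrete, here is a minimal sketch of how one task score might be combined. It assumes the usual 0 to 4 task scale. The function name and the `max_gap` threshold are hypothetical: ETS has not said how large a "major difference" has to be, and what happens to the score after the second human rater steps in is not modeled here.

```python
def combine_scores(human, speechrater, max_gap=1.0):
    """Average a human score and a SpeechRater score for one speaking task.

    Returns (average, needs_second_rater). `max_gap` is an assumed
    threshold for a "major difference"; the real value is not public.
    """
    needs_second_rater = abs(human - speechrater) > max_gap
    average = (human + speechrater) / 2  # equal weight, as reported
    return average, needs_second_rater


print(combine_scores(3, 4))  # (3.5, False)
print(combine_scores(2, 4))  # (3.0, True)
```

In the second call the two scores are two points apart, so under the assumed threshold the response would be flagged for a second human rater.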
To get a better sense of the distribution of questions on the new version of the TOEFL, I have compared the new versions of the TOEFL Reading Practice Sets released by ETS to their old versions. Note that the three sets in the above link are modified versions of the old TPO 7 and 8 sets. The articles are the same, but certain questions have been removed. Here’s what I found out about the question types on the new version of the test.
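| Question type | Old Set 1 | Old Set 2 | Old Set 3 | New Set 1 | New Set 2 | New Set 3 |
| --- | --- | --- | --- | --- | --- | --- |
| Factual Information | 4 | 4 | 3 | 3 (-1) | 3 (-1) | 3 |
| Negative Factual Information | 2 | 2 | 1 | 1 (-1) | 1 (-1) | 1 |
| Rhetorical Purpose | 1 | 1 | 2 | 1 | 1 | 2 |
| Vocabulary | 4 | 4 | 3 | 2 (-2) | 2 (-2) | 1 (-2) |
| Sentence Simplification | 1 | 1 | 1 | 1 | 1 | 1 |
| Insert a Sentence | 1 | 1 | 1 | 1 | 1 | 1 |
| Inference | 0 | 0 | 1 | 0 | 0 | 0 (-1) |
| Summary | 1 | 1 | 1 | 1 | 1 | 1 |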
This confirms my earlier reports that the new test has far fewer vocabulary questions. Factual and Negative Factual questions have also been reduced, it would seem.
This also confirms that Reference and “Fill in a Table” questions will probably not appear on the test much nowadays, as they are totally absent from the practice materials. Note that even though the single Inference question has been removed from these practice sets, that question type is still being used quite frequently on the real test, according to reports.
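As a quick sanity check on the Reading Practice Sets comparison above, the counts can be added up per set and per question type. This is just a sketch of the arithmetic: the numbers are copied from the comparison, while the dictionary and function names are my own.

```python
# Question counts per type, copied from the Reading Practice Sets comparison.
# Each tuple is (Set 1, Set 2, Set 3).
OLD = {
    "Factual Information": (4, 4, 3),
    "Negative Factual Information": (2, 2, 1),
    "Rhetorical Purpose": (1, 1, 2),
    "Vocabulary": (4, 4, 3),
    "Sentence Simplification": (1, 1, 1),
    "Insert a Sentence": (1, 1, 1),
    "Inference": (0, 0, 1),
    "Summary": (1, 1, 1),
}
NEW = {
    "Factual Information": (3, 3, 3),
    "Negative Factual Information": (1, 1, 1),
    "Rhetorical Purpose": (1, 1, 2),
    "Vocabulary": (2, 2, 1),
    "Sentence Simplification": (1, 1, 1),
    "Insert a Sentence": (1, 1, 1),
    "Inference": (0, 0, 0),
    "Summary": (1, 1, 1),
}


def totals_per_set(counts):
    """Total number of questions in each of the three sets."""
    return [sum(column) for column in zip(*counts.values())]


def removed_by_type(old, new):
    """Questions removed per type, summed across the three sets."""
    return {qtype: sum(old[qtype]) - sum(new[qtype]) for qtype in old}


print(totals_per_set(OLD))  # [14, 14, 13]
print(totals_per_set(NEW))  # [10, 10, 10]
print(removed_by_type(OLD, NEW)["Vocabulary"])  # 6
```

Each new set comes out at exactly 10 questions, and Vocabulary accounts for 6 of the 11 questions removed across the three sets.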
Next up, I’ve done the same analysis of the Free Practice Test provided by ETS. The results are as follows.
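| Question type | Old Set 1 | Old Set 2 | Old Set 3 | New Set 1 | New Set 2 | New Set 3 |
| --- | --- | --- | --- | --- | --- | --- |
| Factual Information | 3 | 1 | 4 | 2 (-1) | 1 | 3 (-1) |
| Negative Factual Information | 2 | 2 | 1 | 2 | 2 | 2 |
| Rhetorical Purpose | 0 | 1 | 1 | 0 | 1 | 1 |
| Vocabulary | 4 | 4 | 3 | 2 (-2) | 2 (-2) | 1 (-2) |
| Sentence Simplification | 1 | 1 | 1 | 1 | 0 (-1) | 0 (-1) |
| Insert a Sentence | 1 | 1 | 1 | 1 | 1 | 1 |
| Function of Paragraph | 1 | 0 | 0 | 0 (-1) | 0 | 0 |
| Inference | 1 | 3 | 1 | 1 | 2 (-1) | 1 |
| Summary | 1 | 1 | 1 | 1 | 1 | 1 |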
Again, we can see that there are far fewer vocabulary questions. But we can also see that all of the question types are affected, except for the Insert Sentence and Summary types.
The odd “function of paragraph” entry refers to a non-standard question that isn’t mentioned in the Official Guide or any other ETS resources. On the original set it was phrased as “What function does paragraph 3 serve in the organization of the passage as a whole?” I guess this is sort of like a rhetorical purpose question, but it really surprises students when it comes up. Note that although it has been removed from the practice test, I have had reports that it has appeared on the real test since August 1.
According to reports, TOEFL score reviews (that is, re-scoring) are now much faster than before. I haven’t gotten confirmation, but according to ETS customer service, score reviews are now finished in 24 to 72 hours. In the past, these took up to ten days, just like a regular score report.
In fact, one student has told me that her writing section score review took just ten hours to complete this week.
This is interesting. Indeed, considering the changes to the speaking section’s length and scoring process, it likely does not take as long as before to grade the test.
I wonder if this is foretelling faster TOEFL score reports in general. Now that it is possible to take the test every single week, many students would appreciate getting their scores in just three days. I have heard nothing about this, however.
At the 2019 TOEFL iBT Seminar in Seoul on September 5, ETS announced details of the new “Enhanced Speaking Scoring” for the TOEFL, which has actually been in place since August 1, 2019.
In the past, speaking responses were graded by two human graders. Now, however, speaking responses are graded by one human grader along with the SpeechRater software. This software is a sort of AI that can evaluate human speech, and has been used by ETS for various tasks since about 2008. Most notably, it provided score estimates for the “TOEFL Practice Online” tests they sell to students.
According to ETS:
“From August 1, 2019, all TOEFL iBT Speaking responses are rated by both a human rater and the SpeechRater scoring engine.”
They also note:
“Human raters evaluate content, meaning, and language in a holistic manner. Automated scoring by the SpeechRater service evaluates linguistic features in an analytic manner.”
To elaborate (and this is not a quote), ETS indicated that the human scorer will check for meaning, content and language use, while the SpeechRater will check pronunciation, accent and intonation.
It is presently unknown how the human and computer scores will be combined to create a single overall score, but looking at the speaking rubric could provide a few hints. Note that in the past the human raters would assess three categories of equal weight: delivery, language use, and topic development. If the above information is accurate, the SpeechRater now assesses delivery, while the human now assesses language use and topic development. It is possible, then, that the SpeechRater provides 1/3 of the score, and that the human rater provides the other 2/3.
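Here is what that guess would look like as arithmetic. This is only a sketch of the speculation above, not a confirmed ETS formula. The function name is made up, and the 1/3 and 2/3 weights are the assumption being illustrated.

```python
from fractions import Fraction

# Assumed weights, taken from the speculation above (not confirmed by ETS).
HUMAN_WEIGHT = Fraction(2, 3)        # language use + topic development
SPEECHRATER_WEIGHT = Fraction(1, 3)  # delivery


def weighted_score(human, speechrater):
    """Combine two 0-4 task scores using the guessed 2/3 and 1/3 weights."""
    return HUMAN_WEIGHT * human + SPEECHRATER_WEIGHT * speechrater


score = weighted_score(human=4, speechrater=3)
print(score)                   # 11/3
print(round(float(score), 2))  # 3.67
```

So under this guess, a response that the human rates 4 and the SpeechRater rates 3 would land at about 3.67 for that task.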
I will provide more information as I get it. In the meantime, check out the following video for more news and speculation.
The TOEFL iBT Free Practice Test seems to be the same as Quick Prep Volumes 3 and 4, but modified to match the new version of the test. The second speaking question, though, is new. This is probably because the Quick Prep version referred to students using a “Walkman” in the cafeteria. That’s a pretty old reference!
The iBT Practice Sets include SOME of the content from the TOEFL Quick Prep volumes 1 and 2. Like the Quick Prep sets, they include no audio tracks… you can merely read transcripts of the spoken parts.
The New PDFs are a combination of stuff from the Quick Preps, the TOEFL edX class and the old PDFs. Of course there are no audio files.
It is great that ETS has provided some updated materials, but it is disappointing that the free test is a less accurate simulation of the test center experience than the old TOEFL Sampler program. There are no timers in the listening and reading sections, and in the speaking section a sample answer is played before students even get a chance to deliver their OWN response.
Well, I took three of the writing simulations offered by Edusynch, and they were all terrible.
None of them followed the structure used by ETS. One of them was, ostensibly, a “supporting type” question, which is a style that hasn’t appeared on the TOEFL since 2005.
If you are reading this, People of Edusynch, take a look at the following graphic:
That is what an integrated writing question is supposed to look like. Take a look at the left-hand side. The reading always has four paragraphs. The first paragraph states the main argument of the reading. After that, there are three body paragraphs, and each one of them presents one point in support of the main argument.
Now take a look at the lecture. Of course a lecture can’t have paragraphs… but if you were to type out a typical TOEFL integrated question you would see that it starts with an introduction, and that one at a time it specifically challenges each of the points from the reading. The lecture actually mirrors the reading so much that it challenges the points in the exact same order as they are presented in the reading!
The three samples I bought from Edusynch didn’t do this. Two of them had only three paragraphs, and none of them had point-counterpoint matching structures. Can you believe that one of them had only TWO paragraphs in the reading?
Guys, you are charging $12.50 a pop for these. You can do better. You’ve taken the test. You know these aren’t accurate. Pay someone to fix them.