18 October 2024

Did the Wordlebot Make a Mistake? (Wordle No. 1214, 15 Oct. 2024)

No one needs me to explain what the Wordle is; I don't remember exactly when I picked it up, but I've been doing it since around the time it took off. (Unfortunately, I lost my stats in summer 2022 when my phone stopped working and I didn't have a New York Times account yet, so I only have a record of my last 833 games.)

And like many Wordle aficionados, I'm sure, I like to check out my scores against the Wordlebot. The Wordlebot, of course, plays with a 99 skill score, meaning it's in the 99th percentile for quality of guess. But when I went through the Wordlebot analysis of Wordle no. 1214, from Tuesday, October 15, 2024, I decided that its analysis of my guesses was wrong.

Here's my claim. Spoilers, of course, for a past Wordle.

So my initial guess was what has largely been my initial guess,* IRATE:

 

As you can see, the Wordlebot gives this a 94 skill score, though your opening word choice doesn't factor into your overall skill score.

Unless I get a whole lot of information from my first guess, I usually use LOCUS as my second:


It gives me three key consonants (L, C, and S) and the remaining two vowels. It's not always a great guess, but in this case it was; I got a 96 skill score off it and narrowed things down to just five words. 

With this information, my third guess was CODER:

 

Wordlebot gave me a 90 for this; I'm not very sure why COVER is considered the superior pick.

Anyway, it was clear to me that there were three options left, just as you can see in the above screenshot: CORER, COVER, and COWER. Probably, anyway. Sometimes I systematically go through all the possible letters but still miss an option. And I had three guesses left. I could of course just make each guess in turn, but 1) that meant a decent chance of scoring 6 guesses overall, and 2) what if I had indeed missed something and the answer was some other word I hadn't noticed?

So I decided to be a bit strategic and guess VOWED. This would eliminate COVER and COWER if they were the answer, leaving me free to guess CORER on my fifth guess, and thus giving me the space to guess one more word if it somehow wasn't the right answer.
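For anyone who wants to check that logic, here is a minimal Python sketch of Wordle's coloring rules. It is my own reconstruction, not the Wordlebot's code: the `feedback` function and the G/Y/. notation are made up for illustration, and it assumes CORER, COVER, and COWER really are the only candidates left.

```python
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Return Wordle-style feedback: G = green, Y = yellow, . = gray."""
    result = ["."] * len(guess)
    remaining = Counter()
    # First pass: mark greens and count the answer letters left unmatched.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            remaining[a] += 1
    # Second pass: a non-green letter turns yellow only while the answer
    # still has an unmatched copy of it.
    for i, g in enumerate(guess):
        if result[i] == "." and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1
    return "".join(result)


# Assumption: these are the only words still possible after CODER.
candidates = ["CORER", "COVER", "COWER"]
patterns = {word: feedback("VOWED", word) for word in candidates}

for word, pattern in patterns.items():
    print(word, pattern)
# CORER .G.G.
# COVER YG.G.
# COWER .GGG.
```

Each candidate produces a different pattern, so VOWED tells the three apart in one guess. Guessing COVER instead gives the same pattern (`GG.GG`) whether the answer is CORER or COWER, so a miss leaves two words standing.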

As you can see, Wordlebot says, "Nice," but actually gives it a 51 skill score. It would have picked COVER. But had it picked COVER, it would then presumably have picked COWER, and then CORER. Since the answer turned out to be CORER, that means that with the information I had, it would have taken the Wordlebot 6 guesses, whereas it took me 5.

Why is this? While the Wordlebot said CORER, COVER, and COWER were all possible, it actually didn't think CORER very probable. See this list:

You'll see it gives COVER a 99 skill score, COWER a 96 skill score... and CORER a 6 skill score! Ouch. And if you scroll down to the "Comparing our guesses" frame, you can see it assigns a very low probability score to CORER:

COVER is 51%, COWER is 48%... and CORER is 1%! 

Where does that 1% come from? The Wordlebot's understanding of what makes for a probable word is derived from the NYT's own corpus. As per their explanation, "The bot's probabilities are estimated based on how common a word is, using the frequency of appearances in The Times since 2000 as a rough and imperfect proxy. This means the bot, just like humans, has to guess whether borderline words will wind up on the new (hidden) Wordle solution list." 

I do have to admit that CORER seemed somewhat less likely than COVER or COWER to me, but COVER being fifty times as likely is not a ratio I would have come up with. But the NYT is a newspaper, and I guess newspapers probably talk about people covering and cowering a lot more than they talk about people using apple corers. In day-to-day existence, though, while "corer" probably isn't used as much as the other two words, I don't think I would assign such a lopsided ratio.†

So while I can see why Wordlebot went with COVER, as it basically had a 50% chance of getting the solution in 4 according to its own metrics, it would have actually got it in 6, whereas I guaranteed getting it in 5 with my strategy. So I think I deserve a better skill score than 51 on VOWED, and a better skill score than 84 on the puzzle overall, neener neener.
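To put rough numbers on that trade-off, here is a small Python sketch. The probabilities are the ones Wordlebot displayed; the idea that the bot would simply work through the candidates from most to least probable is my assumption, as is the helper name `expected_guesses_in_order`.

```python
def expected_guesses_in_order(probs, first_guess_number=4):
    """Average finishing guess when candidates are tried one per turn,
    most probable first (an assumed strategy, not the bot's actual code)."""
    ordered = sorted(probs.values(), reverse=True)
    return sum(p * (first_guess_number + i) for i, p in enumerate(ordered))


# Probabilities as shown by Wordlebot for guess 4.
bot_probs = {"COVER": 0.51, "COWER": 0.48, "CORER": 0.01}
# Alternative assumption: all three words are equally likely.
equal_probs = {word: 1 / len(bot_probs) for word in bot_probs}

# VOWED on guess 4 separates all three candidates, so the solve always
# lands on guess 5 (assuming no fourth candidate was overlooked).
VOWED_TOTAL = 5

in_order_bot = expected_guesses_in_order(bot_probs)
in_order_equal = expected_guesses_in_order(equal_probs)

print(f"In order, bot's odds:   {in_order_bot:.2f}")    # 4.50
print(f"In order, equal odds:   {in_order_equal:.2f}")  # 5.00
print(f"VOWED first, any odds:  {VOWED_TOTAL}")         # 5
```

Under the bot's own odds, guessing in order averages about 4.5 against a flat 5 for VOWED, which is presumably why it prefers COVER. Treat the three words as equally likely, though, and both approaches average 5, with VOWED never running to a sixth guess.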

* After IRATE came up as the actual answer, I tried a strategy I saw on the Wordle subreddit for mixing things up, which was to use the previous day's answer as the next day's starting word. This was fun if tricky, and taught me to be a bit more strategic about my guesses. But as I got close to beating a particularly long streak, I went back to using IRATE in order to make things easier on myself.

† Interestingly, in the Google Books Ngram corpus for 2022, "cover" appears 150 times as often as "cower," and "cower" appears 10 times as often as "corer."
