Importing lists of 5000 most frequent words

Many of my students use ANKI to professionalize their vocabulary learning. For many languages there are lists of the 1000, 2000 and 5000 most frequent words, which can be imported into ANKI for memorization with flashcards, etc.

I advise my students to use READLANG for reading and new vocabulary acquisition. Some use both tools side by side, and others export their new Readlang words to ANKI.

For students who are not using any tool, I recommend starting with READLANG. It would be handy if there were a good source of, for example, the 5000 most frequently used words with English translations, and preferably also target-language definitions for the context field, in a format that Readlang can import. It would also be necessary to relax the input record limit which READLANG has in place.

  1. Does anyone have a good source for Dutch, Spanish and/or Russian that is usable or easily convertible for input into Readlang?
  2. Steve: could we perhaps have a way to input these in one go? Or start a repository of trusted word frequency lists, most of which are already in the public domain?

In ReadLang, the order in which words are introduced in the practice sessions is based on their frequency in a corpus. To master the 5000 most common words, just read enough texts that your word list contains many thousands of words, but only practice the 5000 most frequent of them.

A side note: there’s no one-size-fits-all list of the 5000 most common words. Most common in what type of sources? A list based on Wikipedia texts will be very different from a list based on the “open subtitles” collection, or a list based on newspaper texts (most corpora used by professional linguists are rich in newspaper texts), or a list based on recordings of people speaking in their natural environments, in dialect and all that. That’s why I think it would make more sense to order the words by their frequency in the texts read by a particular user rather than in a predefined order based on a corpus.
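To make the point concrete, here is a minimal sketch of how a frequency list depends on the texts you feed it. The function name `frequency_list` and the simple letters-only tokenization are my own assumptions for illustration; this is not how Readlang computes frequencies, and a real list would also need lemmatization and handling of apostrophes and hyphens.

```python
import re
from collections import Counter


def frequency_list(texts, top_n=5000):
    """Return up to top_n distinct words from texts, most frequent first.

    Hypothetical helper: tokenization is deliberately naive (runs of
    letters only), so a word like "zo'n" is split at the apostrophe.
    """
    counts = Counter()
    for text in texts:
        # [^\W\d_]+ matches runs of Unicode letters, so accented and
        # Cyrillic words are kept whole; digits and underscores are dropped.
        counts.update(re.findall(r"[^\W\d_]+", text.lower()))
    return [word for word, _ in counts.most_common(top_n)]


# Two tiny made-up "corpora" for the same language give different rankings.
subtitles = ["Ik weet het niet.", "Weet je het zeker?"]
newspaper = ["De regering weet het niet.", "De regering en de minister overleggen."]
print(frequency_list(subtitles, top_n=3))
print(frequency_list(newspaper, top_n=3))
```

Running the same function over a user's own reading history would give the per-user ordering I am arguing for.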

Well, if everyone had unlimited time to read and to learn a new language, that could be an okay approach. However, many students need to become functional and pass tests as soon as possible.

And, actually, for Dutch, as well as many other languages, there are (various) official word lists, based on real-world frequency, aimed at studying effectively for and passing the official language tests for immigration and employment. If I were to just have my students read a lot until they got there, I could never get them from A0 to B2+ in less than a year. This is why most of my ambitious students use ANKI to learn vocabulary and are not interested in adding yet another application. I would like students to have Readlang be their learning hub instead and get them reading more ambitiously from day 1.

And of course no one would be forced to use any vocabulary list. But we would have the freedom and flexibility to use them when desired. Now we are limited to uploading only a couple of hundred words.
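As a stopgap, a long list could be cut into upload-sized pieces. Below is a rough sketch of that idea. The batch size of 200, the tab-separated word/translation layout, and the names `batches` and `write_batches` are all assumptions on my part; I do not know the exact limit or the import format Readlang expects, so adjust both before relying on it.

```python
import csv
from pathlib import Path


def batches(entries, batch_size=200):
    """Yield consecutive slices of at most batch_size entries."""
    for start in range(0, len(entries), batch_size):
        yield entries[start:start + batch_size]


def write_batches(entries, out_dir, batch_size=200):
    """Write each batch to its own tab-separated file and return the paths.

    entries is a list of (word, translation) pairs. The file layout is an
    assumed one, not a documented Readlang format.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, batch in enumerate(batches(entries, batch_size), start=1):
        path = out_dir / f"words_{i:03d}.tsv"
        with path.open("w", newline="", encoding="utf-8") as f:
            csv.writer(f, delimiter="\t").writerows(batch)
        paths.append(path)
    return paths
```

A 5000-word list would become 25 files of 200 words each, which is still a lot of manual uploading, so a one-go import would remain much more convenient.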