Click on any or all of them and check them out.
616,523 words.
210,653 words.
479,829 words.
444,745 words.
296,809 words.
199,740 words.
645,289 words.
165,946 words.
236,984 words.
234,937 words.
127,142 words.
349,900 words.
191,625 words.
230,534 words.
121,807 words.
354,936 words.
380,645 words.
178,385 words.
596,520 words.
212,710 words.
134,175 words.
149,472 words.
250,353 words.
These 23 word lists are not the usual ones found all over the internet, and they do not normally come
up in search engine queries. To find them, cut and paste these seven words into a search engine (Google
or Yahoo, not Bing): "aaron aardvark aardwolf abbey about apt app". Go to several of the websites listed
in the results and you'll discover the lists above and more. You'll be totally amazed!
These 23 lists of 6.8 million total words have been processed and merged by a bevy of custom programs to
throw out duplicates and clean up the remaining entries.
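The custom programs themselves are not shown here. As a rough sketch only, merging and de-duplicating could look something like this in Python. The function name `merge_word_lists` and the two tiny sample lists are made up for illustration; they are not the actual programs or data.

```python
from typing import Iterable, List

def merge_word_lists(lists: Iterable[Iterable[str]]) -> List[str]:
    """Merge several word lists into one sorted list with no duplicates."""
    merged = set()
    for words in lists:
        for word in words:
            # Trim surrounding whitespace and fold to lower case so that
            # "Apple" and "apple " count as the same entry.
            word = word.strip().lower()
            if word:
                merged.add(word)
    return sorted(merged)

# Two made-up lists standing in for the 23 source files.
list_a = ["Apple", "banana", "cherry"]
list_b = ["banana", "cherry ", "date"]
merged = merge_word_lists([list_a, list_b])
print(merged)  # ['apple', 'banana', 'cherry', 'date']
```

A set is used here because it throws out duplicates as entries arrive, so no list has to be sorted before merging.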
Our goal is to gather as many words as possible from any and all sources, then compile a clean list with
the illegitimate entries removed. Legitimate words are normal, proper words found in reliable dictionaries.
Cleanup criteria include: no digits (0 to 9), no special characters (,./!@# etc.) and no spaces.
Only the 26 lowercase letters of the English alphabet, "abcdefghijklmnopqrstuvwxyz", are allowed.
All capital letters are changed to lowercase letters. No entry can have 3 or more of the same
letter in a row, so "sleeep" and "zzzz" are out. Every entry MUST have at least 1 of these six
letters: a, e, i, o, u or y. This eliminates other non-words, mostly acronyms and abbreviations.
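The criteria above can be sketched as a single Python filter. This is only an illustration, not the actual cleanup program: the name `clean_entry` is made up, and trimming whitespace around an entry before testing it is an assumption.

```python
import re
from typing import Optional

ONLY_AZ = re.compile(r"[a-z]+")   # one or more lowercase letters, nothing else
TRIPLE = re.compile(r"(.)\1\1")   # the same character 3 times in a row
VOWELS = set("aeiouy")            # every entry needs at least one of these

def clean_entry(raw: str) -> Optional[str]:
    """Return the cleaned, lowercased entry, or None if it is rejected."""
    # Assumption: whitespace around the entry is trimmed first.
    word = raw.strip().lower()
    if not ONLY_AZ.fullmatch(word):
        return None   # digit, special character, inner space, or empty
    if TRIPLE.search(word):
        return None   # e.g. "sleeep", "zzzz"
    if not VOWELS & set(word):
        return None   # e.g. "nth", most acronyms and abbreviations
    return word

for entry in ["Aardvark", "Sleeep", "zzzz", "nth", "b2b", "ice cream", "rhythm"]:
    print(entry, "->", clean_entry(entry))
```

In this sketch "Aardvark" comes back as "aardvark" and "rhythm" passes because of its y, while the other five entries are rejected.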
If you want a large, relatively clean word list file, this is it. It's still not perfect: it has misspelled
words and so-called 4-letter words, but it is cleaner than most of the 23 files above.
At 6.8 MB, the 643,987 word file may take several seconds to load completely.
You can do anything you want to with it! It's a flat, plain-text ASCII file with a CR-LF pair at the
end of each record. That makes it a plain vanilla Windows type of file.
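If you read the file with your own program, the CR-LF endings are easy to handle. A minimal sketch in Python, assuming the file contents have already been read as bytes (the function name `parse_word_file` and the sample bytes are hypothetical):

```python
from typing import List

def parse_word_file(data: bytes) -> List[str]:
    """Decode ASCII bytes and split them into one word per record."""
    text = data.decode("ascii")
    # str.splitlines() treats "\r\n" as a single line break and drops it,
    # so no stray "\r" is left on the end of each word.
    return text.splitlines()

# Three made-up records in the same CR-LF layout.
sample = b"aardvark\r\naardwolf\r\nabbey\r\n"
words = parse_word_file(sample)
print(words)  # ['aardvark', 'aardwolf', 'abbey']
```

With the real file, something like `open(path, "rb").read()` would supply the bytes, where `path` is wherever you saved it.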
SOME STATISTICS OF THE 643,987 WORD FILE:
---------------------+--------------------------------+----------------------
   Length of Words   |   # of Words of Starting Ltr   | Forty
     #         #     | Start         #                | Common          #
  Ltrs     Words     |  Ltr      Words   Percent      | Suffixes    Words
---------------------+--------------------------------+----------------------
     1         6     |   s       67958    10.553      | -s         148923
     2       254     |   p       54613     8.480      | -y          47117
     3      3424     |   c       52531     8.157      | -es         44027
     4     17076     |   a       40275     6.254      | -ed         30959
     5     39706     |   m       39643     6.156      | -ing        26456
     6     64307     |   b       35591     5.527      | -er         24640
     7     79703     |   t       33549     5.210      | -ly         18329
     8     86685     |   d       32672     5.073      | -ic         14648
     9     84051     |   r       28573     4.437      | -al         14462
    10     75119     |   u       27938     4.338      | -ion        10730
    11     60622     |   h       24919     3.869      | -ness       10188
    12     46414     |   e       23110     3.589      | -ies         9219
    13     31752     |   n       22547     3.501      | -tion        8979
    14     21765     |   g       21909     3.402      | -ous         7594
    15     13917     |   f       21153     3.285      | -ty          6317
    16      8324     |   l       20796     3.229      | -en          6151
    17      4887     |   i       20300     3.152      | -ate         6102
    18      2648     |   o       19877     3.087      | -able        5561
    19      1466     |   w       14276     2.217      | -ity         4944
    20       898     |   k       13979     2.171      | -ist         4521
    21       378     |   v       10262     1.594      | -est         4247
    22       207     |   j        6676     1.037      | -ite         3841
    23       116     |   z        3537     0.549      | -or          3532
    24        80     |   y        3206     0.498      | -ical        3469
    25        46     |   q        3027     0.470      | -ize         2857
    26        33     |   x        1064     0.165      | -ise         2747
    27        19     |   -----------------------      | -ment        2160
    28        18     |   Tot    643981   100.000      | -ish         2152
    29        16     |                                | -ial         2137
    30         9     |   6 single letters             | -less        1964
    31         7     |   not included                 | -ied         1216
    32         8     |                                | -ship        1206
    33         3     |                                | -ious        1113
    34         4     |                                | -ence        1101
    35         3     |                                | -ful         1095
    36         3     |                                | -ance        1059
    37         2     |                                | -sion         833
    38         1     |                                | -ible         782
    39         2     |                                | -fy           685
    40         3     |                                | -ify          528
    41         2     |                                |
    43         1     |                                | Note: Many words aren't
    45         2     |                                | counted because they don't
  ----------------   |                                | end with one of these 40
 Total    643987     |                                | suffixes. Many words are
                                                      | counted 2 or 3 times or
 2 longest words of 45 characters:                    | more. Both -ly and -ty
 pneumonoultramicroscopicsilicovolcanoconioses        | words are also counted as
 pneumonoultramicroscopicsilicovolcanoconiosis        | -y words. The same goes
                                                      | for many other endings.
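Counts like the ones in the table can be reproduced with a few lines of Python. This is a sketch, not the program that produced the table: `word_stats` and the six sample words are made up, and only seven of the forty suffixes are used to keep it short. Single-letter entries are left out of the starting-letter count, as in the table.

```python
from collections import Counter
from typing import Dict, List, Tuple

def word_stats(words: List[str], suffixes: List[str]
               ) -> Tuple[Counter, Counter, Dict[str, int]]:
    """Count words by length, by starting letter, and by suffix."""
    by_length = Counter(len(w) for w in words)
    # Single-letter entries are skipped here, matching the table's note.
    by_first = Counter(w[0] for w in words if len(w) > 1)
    # A word is counted once for every suffix it ends with, so "happily"
    # counts as both a -ly word and a -y word.
    by_suffix = {s: sum(1 for w in words if w.endswith(s)) for s in suffixes}
    return by_length, by_first, by_suffix

# Six made-up sample words and seven of the forty suffixes.
sample = ["a", "cats", "happily", "safety", "sing", "nation"]
lengths, firsts, suffix_counts = word_stats(
    sample, ["s", "y", "ly", "ty", "ing", "ion", "tion"])

print(sorted(lengths.items()))  # [(1, 1), (4, 2), (6, 2), (7, 1)]
print(sorted(firsts.items()))   # [('c', 1), ('h', 1), ('n', 1), ('s', 2)]
print(suffix_counts["y"])       # 2  (happily, safety)
```

The overlap between suffixes is why the suffix column adds up to more than a simple partition of the words would.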
By JCS 8/2017