Jon Bentley says Doug McIlroy did it in six lines of code in a UNIX shell language.
Bentley, "Little languages", Communications of the ACM, 29(8):711-21, August 1986.
Here are 19 lines of PowerShell:
Function Get-Top6Words {
    param ($fileName="$pwd\big.txt")

    # Count how often each word occurs in the given lines of text
    Function train($text)
    {
        $h = @{}
        $text = [string]::join(' ', $text)
        ForEach ($word in [regex]::split($text.ToLower(), '\W+') ) {
            $h[$word] += 1
        }
        $h
    }

    # Rank the word counts and keep the six most frequent
    (train ([System.IO.File]::ReadAllLines($fileName))).GetEnumerator() |
        Sort-Object value -Descending |
        Select-Object -First 6
}
By default, the function reads big.txt from the current directory.
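To try the split-and-count step without a copy of big.txt, here is a minimal sketch that runs on an inline string. The sample text and the variable names $sampleText and $top are made up for illustration. The Where-Object filter is an addition, not part of the function above: [regex]::Split returns an empty string when the text starts or ends with a non-word character, and the filter drops that empty token before counting.

```powershell
# Hypothetical sample text standing in for big.txt (an assumption for this sketch)
$sampleText = 'The quick brown fox jumps over the lazy dog. The dog barks; the fox runs.'

# Split on runs of non-word characters, drop empty tokens,
# group identical words, and keep the three most frequent
$top = [regex]::Split($sampleText.ToLower(), '\W+') |
    Where-Object { $_ } |
    Group-Object -NoElement |
    Sort-Object Count -Descending |
    Select-Object -First 3

# Each result has a Count and a Name property
$top
```

Swapping the inline string for [System.IO.File]::ReadAllText($fileName) should give the same kind of ranking for a real file.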


How about one line?
[regex]::split([io.file]::readAllText($fileName).ToLower(),'\W+') | group -NoElement | sort count -desc | select -first 6
Thank you, gentlemen, great updates.
I reused the code from my spelling corrector, written in vanilla PowerShell, which I modelled after the Python code in How to Write a Spelling Corrector by Peter Norvig of Google.
I didn’t shortcut my thinking.