
Even if you knew the pattern, four random lowercase dictionary words (assuming a dictionary size of 50,000 words) would take longer to crack than a randomly generated 10-character password with letters, numbers, and special characters.

50,000^4 > (26+26+10+10)^10
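
The inequality is easier to compare in bits. A rough sketch, assuming both secrets are drawn uniformly at random; `entropy_bits` is a helper name made up for this example:

```shell
#!/bin/sh
# entropy_bits SIZE COUNT: bits of entropy for COUNT independent, uniform
# picks from a set of SIZE symbols, i.e. COUNT * log2(SIZE).
# (Hypothetical helper, not part of any tool mentioned in the thread.)
entropy_bits() {
    awk -v size="$1" -v count="$2" \
        'BEGIN { printf "%.2f\n", count * log(size) / log(2) }'
}

entropy_bits 50000 4   # four words from a 50,000-word dictionary
entropy_bits 72 10     # ten characters from 26+26+10+10 = 72 symbols
```

That comes out to roughly 62.4 versus 61.7 bits, so the two are close, with the passphrase slightly ahead.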



It would be very interesting to see the results of a study asking people to come up with a list of random words. I really doubt that the effective dictionary size would be anywhere near 50k. The lists would probably have a high frequency of common words like 'apple', 'house', 'food', etc., making them easier to crack, and almost none of the less common words.


The assumption is that you do not come up with your own words, but pick them at random from the whole dictionary.


I'm not sure I agree with that assumption, as the entire purpose of a passphrase of words rather than a password of random characters is that the passphrase should be easier to remember. If you're randomly picking words like 'gargarize-youster-noctivagant-axilla', it's not exactly accomplishing that purpose very well. It's also a huge PITA to type in, which, based on my experience in the IAM space, is an immediate dealbreaker.


    $ egrep '^[a-z]{4,10}$' /usr/share/dict/words | wc
      50768   50768  433477
    $ for i in `seq 5`; do egrep '^[a-z]{4,10}$' /usr/share/dict/words | shuf -n 4 | xargs; done
    droned engraves developer manoeuvre
    lifeforms lurked pursuing subjugated
    hooligans underplay sudden command
    quartettes soapbox blacklist pigtails
    roughening chefs mortals earthy

In my experience, things like that are both easier to remember and to type than things like fa#klwgjl5235 - I type sequences of English words far more often than I type anything else.
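
A variation on the pipeline above, as a sketch only. It assumes GNU `shuf`; `gen_passphrase` is a made-up helper name. Plain `shuf -n 4` draws without replacement, which barely matters with about 50,000 candidates, but `-r` makes every word an independent pick so the usual "words to the power of picks" arithmetic applies exactly:

```shell
#!/bin/sh
# gen_passphrase WORDLIST COUNT: print COUNT words drawn from WORDLIST.
# (Hypothetical helper; assumes GNU shuf from coreutils.)
#   -r                sample with replacement, so each word is an independent pick
#   --random-source   read the random bytes from /dev/urandom explicitly
gen_passphrase() {
    grep -E '^[a-z]{4,10}$' "$1" \
        | shuf -r -n "$2" --random-source=/dev/urandom \
        | xargs
}

# Only runs if the word list used in the commands above is present.
if [ -r /usr/share/dict/words ]; then
    gen_passphrase /usr/share/dict/words 4
fi
```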


I’d rather pick from obscure words I know than at random. In my case the words might lean tech/business/news/sports, but I’m sure I could come up with a good list. It might be interesting to try and generate passwords from a corpus of email and/or browsing history... assuming you blacklist sensitive subjects.


I let my password manager pick words for me, and I keep hitting refresh until I get one that I think I'm likely to get the spelling correct when needed.

1Password just gave me this: land convolve witchery bequest

Having said that, since I use 1Password, these are rare and almost exclusively used for things where I need very-short-term memorable passphrases for things that won't let me copy/paste from 1Password (like my Apple ID or the passwords my bank asks me for over the phone...). Everything else just gets 25 random chars (or the maximum number of chars the input will allow).


If I used that model, I am pretty sure there would be some kind of proper noun or fantasy novel reference, meaning the dictionary would need to be pretty extensive.


And - of course - Randall has us covered here too:

https://xkcd.com/1133/

A 1000 word vocabulary is quite understandable, though a little awkward.

A random word might be as few as 10 bits of entropy. If a person is picking them out of their head, I'd bet it's unlikely to be as many as 12 or 13 bits. Most of the words we "know" aren't ones that come to mind when we're "randomly choosing words"...
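
To put vocabulary sizes next to those bit counts, here is a small sketch. It assumes words are picked uniformly, which is exactly what people picking "out of their head" don't do, so treat the numbers as upper bounds. Both helper names are made up for this example:

```shell
#!/bin/sh
# bits_per_word SIZE: log2(SIZE), the entropy of one uniform pick from SIZE words.
bits_per_word() {
    awk -v size="$1" 'BEGIN { printf "%.2f\n", log(size) / log(2) }'
}

# vocab_for_bits BITS: 2^BITS, the vocabulary size that BITS per word implies.
vocab_for_bits() {
    echo $(( 1 << $1 ))
}

bits_per_word 1000   # a 1000-word vocabulary
vocab_for_bits 10    # vocabulary implied by 10 bits per word
vocab_for_bits 12    # vocabulary implied by 12 bits per word
vocab_for_bits 13    # vocabulary implied by 13 bits per word
```

So 10 bits corresponds to about a thousand words, and 12 to 13 bits to an effective vocabulary of roughly four to eight thousand.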


It's not quite fair to assume that people are choosing randomly from 50k words. Here is what those passwords look like. I excluded proper nouns and possessives. If you want to try, this command works on Ubuntu:

    grep -v "'" /usr/share/dict/american-english | grep -v '[A-Z]' | shuf -n 4 | xargs

trawled scratch protract sagings

perpetuates barium entreated credits

integrals virago chronicled weathercocks

foremasts milkmaid bashful maddened

disposes shrunk propose stanchion

midwived romantics gallbladders spotlighted


Good point. The EFF has made and published three lists of words to use that are easy to spell and generally easy to remember.

I wrote a command-line tool in Rust for generating passphrases using these wordlists. I use it myself any time I need a password.

My tool is fast, free of charge, open source and it can also tell you the entropy that will result for any given choice of number of words.

For example let’s say I want it to give me four words from the long wordlist, and I want to know how many bits of entropy this corresponds to.

    pgen -l -n 4 -e

    Current settings will create passphrases with 51.70 bits of entropy.

What does that mean, you might ask.

The Wikipedia article on password strength (https://en.wikipedia.org/wiki/Password_strength) explains it well:

> A password with an entropy of 42 bits calculated in this way would be as strong as a string of 42 bits chosen randomly, for example by a fair coin toss. Put another way, a password with an entropy of 42 bits would require 2^42 (4,398,046,511,104) attempts to exhaust all possibilities during a brute force search. Thus, by increasing the entropy of the password by one bit the number of guesses required doubles, making an attacker's task twice as difficult. On average, an attacker will have to try half the possible number of passwords before finding the correct one.
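
The same arithmetic can be applied to the 51.70-bit figure. This is a sketch under one assumption that is not stated above: that the long wordlist has 7,776 words (the size of the EFF long list, one word per roll of five dice).

```shell
#!/bin/sh
# ASSUMPTION: 7776 words in the long wordlist (6^5); not stated in the comment.
words=7776
picks=4

# Bits of entropy: picks * log2(words).
bits=$(awk -v w="$words" -v n="$picks" 'BEGIN { printf "%.2f", n * log(w) / log(2) }')

# Size of the search space: words^picks (fits in 64-bit shell arithmetic).
total=1
i=0
while [ "$i" -lt "$picks" ]; do
    total=$(( total * words ))
    i=$(( i + 1 ))
done

# On average an attacker finds the passphrase after trying half the space.
average=$(( total / 2 ))

echo "bits:    $bits"
echo "total:   $total"
echo "average: $average"
```

Under that assumption the bit count comes out to 51.70, and the search space is 7776^4 = 3,656,158,440,062,976 passphrases.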

Ok, so how good is 51.70 bits of entropy, you ask?

Wikipedia, same article again:

> The minimum number of bits of entropy needed for a password depends on the threat model for the given application. [...] RFC 4086, "Randomness Requirements for Security", presents some example threat models and how to calculate the entropy desired for each one. Their answers vary between 29 bits of entropy needed if only online attacks are expected, and up to 96 bits of entropy needed for important cryptographic keys used in applications like encryption where the password or key needs to be secure for a long period of time and stretching isn't applicable.

So let's say that you are satisfied with 51.70 bits of entropy in this case. What does a password like that look like? Let's generate one.

    pgen -l -n 4

    plastic case refocus demise

Pretty memorable if you ask me :)

Oh yeah, and about the claim that it's fast. Just how fast is it? Have a look.

    time pgen -l -n 4

    browbeat hummus sandbox unfixable

    real    0m0.005s
    user    0m0.001s
    sys     0m0.006s

That's 5 milliseconds.

But hey, let's say we wanted to generate a bunch of passphrases at once.

How much time does it take to generate 10,000 passphrases and dump them into a text file?

    time pgen -l -n 4 -k 10000 > 10k.txt

    real    0m0.132s
    user    0m0.073s
    sys     0m0.058s

About 0.13 seconds. Not that generating 10,000 passphrases is something that you are likely to do, but it just speaks to how fast this tool is ^^

Source and instructions on how to install it are on GitHub.

https://github.com/ctsrc/Pgen


What is the size of that dictionary?


About 60k words.


What I find interesting about the XKCD is that the entropy analysis is basically spot-on, even very generous towards the opposition. It shows that a) the typical "strong password" that people pick is not truly random (i.e. not 26+26+10+10, but worse), and b) that under these conditions even a four-word passphrase picked from a measly dictionary of 2048 words is better. (This is probably an even more interesting / compelling argument.)

Yet somehow, each time the XKCD is posted, someone will "point out" that pass phrases can be dictionary attacked, which is kind of like me saying "I know which letters are on your keyboard, and can therefore brute force your password!", but not having done the math beyond that.


This is really dependent on the size of the dictionary.


If your dictionary is smaller than 50,000 words, all you would have to do is add another word.

10,000^5 > 50,000^4
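
More generally, you can work out how many words a smaller dictionary needs to match a given target. A sketch, assuming uniform random picks; `words_needed` is a made-up helper name, and the 62.4-bit target is four words from 50,000 rounded down slightly:

```shell
#!/bin/sh
# words_needed SIZE TARGET_BITS: smallest number of uniform picks from a
# SIZE-word dictionary that reaches at least TARGET_BITS of entropy.
# (Hypothetical helper for illustration.)
words_needed() {
    awk -v size="$1" -v target="$2" 'BEGIN {
        per_word = log(size) / log(2)      # bits contributed by each word
        n = int(target / per_word)         # round down ...
        if (n * per_word < target) n++     # ... then bump up if short of the target
        print n
    }'
}

words_needed 50000 62.4   # the baseline dictionary
words_needed 10000 62.4   # the smaller dictionary from the comment
words_needed 2048 62.4    # a much smaller list
```

With these inputs the 10,000-word dictionary needs five words, matching the inequality above, and a 2048-word list needs six.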


> assuming a dictionary size of 50,000 words



