Even if you knew the pattern, four random lowercase dictionary words (assuming a dictionary size of 50,000 words) would take longer to crack than a randomly generated 10-character password with letters, numbers, and special characters.
It would be very interesting to see the results of a study asking people to come up with a list of random words. I really doubt that the effective dictionary size would be anywhere near 50k. The lists would probably have a high frequency of common words like 'apple', 'house', 'food' etc., making them easier to crack, and almost no less common words.
I'm not sure I agree with that assumption, as the entire purpose of a passphrase of words rather than a password of random characters is that the passphrase should be easier to remember. If you're randomly picking words like 'gargarize-youster-noctivagant-axilla', it's not exactly accomplishing that purpose very well. It's also a huge PITA to type in, which based on my experience in the IAM space, is an immediate dealbreaker.
In my experience, things like that are both easier to remember and to type than things like fa#klwgjl5235 - I type sequences of English words far more often than I type anything else.
I’d rather pick from obscure words I know than at random. In my case the words might lean tech/business/news/sports, but I’m sure I could come up with a good list. It might be interesting to try and generate passwords from a corpus of email and/or browsing history... assuming you blacklist sensitive subjects.
I let my password manager pick words for me, and I keep hitting refresh until I get one whose spelling I think I'm likely to get right when needed.
1Password just gave me this: land convolve witchery bequest
Having said that, since I use 1Password, these are rare and almost exclusively used where I need very-short-term memorable passphrases for things that won't let me copy/paste from 1Password (like my Apple ID or the passwords my bank asks me for over the phone...). Everything else just gets 25 random chars (or the maximum number of chars the input will allow).
If I used that model, I am pretty sure there would be some kind of proper noun or fantasy novel reference, meaning the dictionary would need to be pretty extensive.
A 1000 word vocabulary is quite understandable, though a little awkward.
A random word might be as few as 10 bits of entropy. If a person is picking them out of their head, I'd bet it's unlikely to be as many as 12 or 13 bits. Most of the words we "know" aren't ones that come to mind when we're "randomly choosing words"...
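To put numbers on those bit counts: a word drawn uniformly from a list of N words carries log2(N) bits, so 10 bits corresponds to roughly a 1,000-word list and 13 bits to roughly 8,000 words. A minimal sketch, assuming uniform selection (which is exactly what people picking words out of their heads don't do); the function name is just illustrative:

```python
import math

def bits_per_word(vocab_size: int) -> float:
    """Entropy in bits of one word drawn uniformly from vocab_size words."""
    return math.log2(vocab_size)

# List sizes mentioned in this thread, plus 7,776 as a common diceware-style size.
for size in (1_000, 2_048, 7_776, 50_000):
    per_word = bits_per_word(size)
    print(f"{size:>6} words -> {per_word:5.2f} bits per word, "
          f"{4 * per_word:5.2f} bits for four words")
```

If the words are not chosen uniformly, the real entropy is lower than these figures.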
It's not quite fair to assume that people are choosing randomly from 50k words. Here is what those passwords look like. I excluded proper nouns and possessives. If you want to try, this command works on Ubuntu:
> A password with an entropy of 42 bits calculated in this way would be as strong as a string of 42 bits chosen randomly, for example by a fair coin toss. Put another way, a password with an entropy of 42 bits would require 2^42 (4,398,046,511,104) attempts to exhaust all possibilities during a brute force search. Thus, by increasing the entropy of the password by one bit the number of guesses required doubles, making an attacker's task twice as difficult. On average, an attacker will have to try half the possible number of passwords before finding the correct one.
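The arithmetic in that quote is easy to check. Here is a rough sketch; the guess rate is a made-up figure for illustration only, not a claim about any real attacker or hash function:

```python
def brute_force_guesses(entropy_bits: int) -> tuple[int, int]:
    """Return (worst-case, average) guesses for a uniformly random secret."""
    total = 2 ** entropy_bits
    return total, total // 2

# Hypothetical attacker speed, chosen only for illustration.
ASSUMED_GUESSES_PER_SECOND = 1_000_000_000

for bits in (42, 43, 52):
    total, average = brute_force_guesses(bits)
    days = average / ASSUMED_GUESSES_PER_SECOND / 86_400
    print(f"{bits} bits: {total:,} total, {average:,} on average, "
          f"about {days:,.2f} days at the assumed rate")
```

Each extra bit doubles both numbers, which is why a few more bits matter much more than they look.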
Ok, so how good is 51.70 bits of entropy, you ask?
Wikipedia, same article again:
> The minimum number of bits of entropy needed for a password depends on the threat model for the given application. [...] RFC 4086, "Randomness Requirements for Security", presents some example threat models and how to calculate the entropy desired for each one. Their answers vary between 29 bits of entropy needed if only online attacks are expected, and up to 96 bits of entropy needed for important cryptographic keys used in applications like encryption where the password or key needs to be secure for a long period of time and stretching isn't applicable.
So let's say that you are satisfied with 51.70 bits of entropy in this case. What does a password like that look like? Let's generate one.
pgen -l -n 4
plastic case refocus demise
Pretty memorable if you ask me :)
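For anyone curious what generating such a passphrase involves, here is a minimal Python sketch. It is not how pgen is implemented; it only shows the general idea of picking words with a cryptographically secure random source. The tiny word list is a placeholder for the demo, and the 7,776-word size is an assumption on my part: it is simply the list size for which four words come out to 51.70 bits.

```python
import math
import secrets

def generate_passphrase(words: list[str], count: int = 4) -> str:
    """Pick `count` words uniformly at random using the OS's secure RNG."""
    return " ".join(secrets.choice(words) for _ in range(count))

def passphrase_entropy_bits(list_size: int, count: int = 4) -> float:
    """Entropy of `count` words drawn uniformly from `list_size` words."""
    return count * math.log2(list_size)

# Placeholder list for the demo only; far too small for real use.
DEMO_WORDS = ["plastic", "case", "refocus", "demise",
              "browbeat", "hummus", "sandbox", "unfixable"]

print(generate_passphrase(DEMO_WORDS))
print(f"{passphrase_entropy_bits(len(DEMO_WORDS)):.2f} bits with the demo list")
print(f"{passphrase_entropy_bits(7776):.2f} bits with an assumed 7,776-word list")
```

The important part is using secrets rather than random, since the latter is not meant for security purposes.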
Oh yeah, and about the claim that it's fast. Just how fast is it? Have a look.
time pgen -l -n 4
browbeat hummus sandbox unfixable
real 0m0.005s
user 0m0.001s
sys 0m0.006s
That's 5 milliseconds.
But hey, let's say we wanted to generate a bunch of passphrases at once.
How much time does it take to generate 10,000 passphrases and dump them into a text file?
time pgen -l -n 4 -k 10000 > 10k.txt
real 0m0.132s
user 0m0.073s
sys 0m0.058s
About zero point one seconds. Not that generating 10,000 passphrases is something that you are likely to do, but it just speaks to how fast this tool is ^^
Source and instructions on how to install it are on GitHub.
What I find interesting about the XKCD is that the entropy analysis is basically spot-on, even very generous towards the opposition. It shows that a) the typical "strong password" that people pick is not truly random (i.e. not 26+26+10+10, but worse), and b) that under these conditions even a 4-length pass phrase picked from a measly dictionary of 2048 words is better. (This is probably an even more interesting / compelling argument.)
Yet somehow, each time the XKCD is posted, someone will "point out" that pass phrases can be dictionary attacked, which is kind of like me saying "I know which letters are on your keyboard, and can therefore brute force your password!", but not having done the math beyond that.
50,000^4 > (26+26+10+10)^10
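A quick way to check that inequality, assuming both secrets are chosen uniformly at random and the attacker knows the scheme (the variable names are just illustrative):

```python
import math

# Four words from a 50,000-word dictionary.
passphrase_space = 50_000 ** 4

# Ten characters from lowercase + uppercase + digits + 10 special characters.
password_space = (26 + 26 + 10 + 10) ** 10

print(f"passphrase: {passphrase_space:.3e} combinations, "
      f"{math.log2(passphrase_space):.2f} bits")
print(f"password:   {password_space:.3e} combinations, "
      f"{math.log2(password_space):.2f} bits")
print("passphrase space is larger:", passphrase_space > password_space)
```

That works out to roughly 62.4 bits against 61.7 bits, so the two are close, and the result depends heavily on the 50,000-word assumption holding.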