HowToBe said:
In any case, it would probably be good to throw in a non-dictionary word that only you will think of, or maybe toss in a number or symbol somewhere
I'm curious - can someone math-experienced say if the "four words" password would be easier to break if someone tried to guess the words rather than the individual characters? I read that the English language has about 500,000 words, although I'm sure the commonly known ones are a smaller number. So if a person chooses a random sequence of four words from a dictionary of 5,000 words, how hard would it be to guess that sequence?
The number of possible passwords is the number of available characters to the power of how long the password is.
So let's say your password is 8 characters long, and consists of nothing but lower case letters. Since there are 26 letters in the alphabet, this is how many possible passwords you could get:
26*26*26*26*26*26*26*26 = 26^8 (26 to the power of 8) = 208,827,064,576 (208 billion possible variations)
If you incorporate capitals, which are treated as different characters, now it's 52 options (26+26). If you can include numbers, that's +10 on top of that, so 62 options. If you include special characters that exist on a typical keyboard that's about 32 more options for a total of 94. So now an 8 character password would have this many possible passwords:
8 character password: 94^8 = 6,095,689,385,410,816 (6 quadrillion possible passwords).
9 character password: 94^9 = 572,994,802,228,616,704 (572 quadrillion)
and so on.
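If you want to check these numbers yourself, here is a minimal Python sketch of the "options to the power of length" rule. The function name count_passwords and the labels are just mine for illustration, and the 94 figure assumes roughly 32 typeable symbols as stated above.

```python
def count_passwords(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of `length` symbols drawn from `alphabet_size` options."""
    return alphabet_size ** length


# Character-set sizes taken from the post: 26 lowercase letters,
# and 94 = 26 + 26 + 10 + about 32 keyboard symbols (an approximation).
cases = [
    ("lowercase only, 8 characters", 26, 8),
    ("full keyboard, 8 characters", 94, 8),
    ("full keyboard, 9 characters", 94, 9),
]

for label, alphabet_size, length in cases:
    print(f"{label}: {count_passwords(alphabet_size, length):,}")
```

Python integers don't overflow, so the same function works for longer passwords too.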
If you're working with 4 words, let's say ignoring capitalization so only lower case, and working with a dictionary of 5000 words, it's this:
5000 * 5000 * 5000 * 5000 = 5000^4 = 625,000,000,000,000 (625 trillion).
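To put that figure next to the 8 character one, here is a rough sketch that computes both and their ratio. It assumes each of the four words is picked independently and uniformly from the 5000 word list; the variable names are only illustrative.

```python
WORDLIST_SIZE = 5000   # dictionary size used in the post
WORDS = 4              # words in the pass phrase

# Each word is one of 5000 choices, so the count is 5000^4.
passphrases = WORDLIST_SIZE ** WORDS

# Truly random 8 character password over about 94 keyboard characters.
random_8_char = 94 ** 8

ratio = random_8_char / passphrases

print(f"4 random words: {passphrases:,}")
print(f"8 random chars: {random_8_char:,}")
print(f"ratio (chars / words): {ratio:.2f}")
```

Under those assumptions the two are within about a factor of 10 of each other, with the truly random 8 character password slightly ahead.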
Edit:
If you consider capital letters as different, so the word "building" is different from "BuilDING" etc., then you are working with a much larger number of possible passwords. How much larger depends on the length of each word in the dictionary, since every letter can be either upper or lower case (a word of n letters has 2^n capitalizations), so you can't give a figure without the actual word list. Suffice it to say, it would be over an order of magnitude more.
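As a rough illustration of why the word lengths matter, here is a small sketch that counts capitalization variants. The three-word sample_words list is made up for the example (a real calculation would need the actual 5000 word dictionary), and it assumes every letter can independently be upper or lower case.

```python
def case_variants(word: str) -> int:
    """Number of upper/lower case spellings of `word` (2 choices per letter)."""
    letters = sum(1 for ch in word if ch.isalpha())
    return 2 ** letters


# Hypothetical stand-in for a real dictionary.
sample_words = ["green", "people", "building"]

# Choices for a single word slot once capitalization counts as different.
per_slot = sum(case_variants(word) for word in sample_words)

# Four independent word slots.
four_word_total = per_slot ** 4

print(f"'building' alone: {case_variants('building')} spellings")
print(f"choices per word slot: {per_slot} (instead of {len(sample_words)})")
print(f"four-word pass phrases: {four_word_total:,}")
```

Even with this tiny list, each word slot goes from 3 choices to 352, which is why the total grows so quickly.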
Since most people don't pick a password that's a string of random letters/numbers/characters, a typical password is easier to guess than the random 8 character password above: you can try dictionary words first, then the capital/lowercase variations and popular number substitutions for letters, like 1 instead of "I" and 0 instead of "O". So it cuts down on the number of guesses dramatically compared with pure "brute force" (trying every possible character combination).
With a 4 word pass phrase, most people wouldn't remember a truly random capitalization, so an attacker would not need a completely random approach to crack it. Most people, if they use capitals at all, follow a pattern that is easy to remember, like every other letter, the first and last letters, or just the vowels, which makes it easier again. However, even with a 4 word pass phrase that doesn't involve any capitals, there is no shortcut: the attacker has to work through the 625 trillion possible word combinations, assuming the words were picked at random. A pass phrase that is a basic, grammatically correct and sensible sentence would be weaker in principle, but then someone would need a "dictionary" that contains all the possible meaningful sentences, which I don't think anyone has, and there is no artificial intelligence yet that can be used to create it (except maybe secret ones). So a 4 word pass phrase is the safest by far, unless you use 8 letter/number/character passwords that are truly random characters and not just a mangled word, and those are really hard to remember, especially if you use several passwords.
And if one of the words in that 4 word pass phrase isn't in the dictionary, or you add numbers to each word, it gets much harder to crack: the 5000 word dictionary alone no longer covers it, so the attacker has to try far more combinations, and the number of possible passwords increases dramatically.
Let's say you add a number after every word like "somewhere2 people6 green5 excitement1"
Now each of these "words" isn't 1 out of 5000 possible words, but one out of 50,000 possible words since each word could have 1 of 10 possible numbers at the end, so that's 5000*10 = 50,000.
50,000^4 = 6,250,000,000,000,000,000 (6 quintillion), so it has 10,000 times (10^4) more passwords than just plain words.
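The numbered-word arithmetic can be checked the same way. This is only a sketch with illustrative variable names, assuming one digit (0 to 9) is appended to each of the four words. Each word slot grows by a factor of 10, so the whole pass phrase grows by 10^4.

```python
WORDLIST_SIZE = 5000   # dictionary size used in the post
DIGITS = 10            # one digit, 0-9, appended to each word
WORDS = 4

plain = WORDLIST_SIZE ** WORDS
with_digits = (WORDLIST_SIZE * DIGITS) ** WORDS

# How many times larger the numbered version is.
factor = with_digits // plain

print(f"plain words:       {plain:,}")
print(f"words plus digits: {with_digits:,}")
print(f"growth factor:     {factor:,}")
```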