Understanding Password Complexity
What does it mean to have a “strong” password? How can you be sure your password can’t be compromised? How do websites know when your password is complex enough?
These are questions I started asking myself at the beginning of February. And while it may sound trivial, I wanted to understand how those strength bars know how good my password is.
So let’s start with entropy. Entropy, in the Information Theory sense, measures how unpredictable text is based on the character set used and its length. And to answer these questions, we’ll need to look at password entropy, which measures the strength of the password generation. Meaning, the results from our calculations will signify how hard it is to guess a password under certain assumptions. This will all make sense when we take a look at how password entropy is calculated.
Let’s Talk About Math
Now let’s build out our function. We’ll need two variables: one for the length of our password (denoted as $L$) and another for the number of symbols in our dictionary (denoted as $N$). Dictionary, in this sense, means the character set used to generate the password, such as the digits (0-9) to create a PIN or your typical ASCII printable characters (a-z, A-Z, 0-9, and special characters).
So, to calculate the number of possible passwords we’ll need to raise $N$ to the power of $L$, or $N^L$. This is the foundation of our equation. (N.B.: If we increase $L$ or $N$, we’ll increase the number of possible passwords, which will thereby increase the strength of the passwords we generate. This will be demonstrated a bit later.)
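As a quick sanity check of that N.B., here is a minimal Python sketch that counts the possible passwords for a few dictionaries and lengths. The function name `possible_passwords` is my own invention for illustration, not something from a library.

```python
def possible_passwords(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of `length` symbols drawn from a
    dictionary of `alphabet_size` symbols, i.e. N^L."""
    return alphabet_size ** length


print(possible_passwords(10, 4))  # 4-digit PIN: 10000
print(possible_passwords(10, 6))  # same dictionary, bigger L: 1000000
print(possible_passwords(26, 4))  # same length, bigger N: 456976
```

Growing either $L$ or $N$ grows the count, but the length sits in the exponent, so it does most of the heavy lifting.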
Assuming that each symbol within the password is produced independently, we’ll take the binary logarithm of our foundation, since we want the result in bits, to determine the entropy. So:
$$ {entropy \ of \ password}\ = \ \log_{ 2 } N^L $$
With a little bit of manipulation, we can get the following result:
$$ {entropy \ of \ password}\ =\ \log_{ 2 } N^L =\ L\ \log_{ 2 } N\ = \ L \ \frac{ \log N }{ \log 2 } $$
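Here is how that equation might look in Python. This is a minimal sketch: `password_entropy_bits` is a name I made up, and it assumes every symbol is picked uniformly and independently from the dictionary.

```python
import math


def password_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits, L * log2(N), of a password of `length` symbols
    drawn uniformly and independently from `alphabet_size` symbols."""
    if length < 0 or alphabet_size < 1:
        raise ValueError("length must be >= 0 and alphabet_size >= 1")
    return length * math.log2(alphabet_size)


print(password_entropy_bits(4, 10))   # a 4-digit PIN, roughly 13.3 bits
print(password_entropy_bits(16, 95))  # 16 printable ASCII characters, roughly 105 bits
```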
Taking a step back, our equation is a special case of Shannon’s entropy, which sums $-p \log_2 p$ over the probability $p$ of each possible symbol. When all $N$ symbols are equally likely, that sum collapses to $\log_2 N$ bits per symbol, and $L$ independent symbols give $L \log_2 N$ bits in total. In short, Shannon’s entropy has guided the evolution of our password entropy equation.
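To see the link to Shannon’s entropy concretely, here is a small sketch (the helper `shannon_entropy_bits` is my own, written just for this post). It computes the per-symbol entropy of a probability distribution and shows that a uniform dictionary of $N$ symbols gives exactly $\log_2 N$ bits, while a lopsided one gives less.

```python
import math


def shannon_entropy_bits(probabilities):
    """Shannon entropy, -sum(p * log2(p)), of one symbol in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Uniform case: each of the 95 printable ASCII characters is equally likely.
uniform = [1 / 95] * 95
print(shannon_entropy_bits(uniform), math.log2(95))

# Skewed case: one character is picked half the time, the other 94 share the rest.
skewed = [0.5] + [0.5 / 94] * 94
print(shannon_entropy_bits(skewed))
```

The skewed case is a made-up distribution, but it illustrates why the $L \log_2 N$ figure is an upper bound that only holds for truly random generation.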
How Easily Can Your Password be Guessed?
Well, it depends on the dictionary your password uses. So let’s put our newfound powers to use!
Let’s take for example your normal password requirements of 8 to 16 characters, requiring at least one capital letter, a number, and a special character. What’s the max entropy we can generate? Time to churn our equation. Our dictionary set is 95 characters, and we’ll assume the max value for the password length and that the characters are generated independently of each other:
$$ 16\ \frac{ \log 95 }{ \log 2 } \approx 105 $$
Hence, the entropy of this password is 105 bits. Well, that’s all fine and well, but what does this result mean? I’m glad you thought that.
To calculate the results, we’ll use the following equation (you’ll notice it’s the inverse of our password entropy equation):
$$ seconds \ to \ guaranteed \ crack \ = \ \frac{ 2^{\ entropy} }{ guesses \ per \ second } $$
In 2012, a password-cracking expert unveiled a computer that could cycle through 350 billion guesses per second. So, as an extreme case, let’s use this number:
$$ \frac{ 2^{105} }{ 350,000,000,000 } = \frac{ 39614081257132168796771975168 }{ 341796875 }\ seconds \ \approx \ 1.16 \times 10^{20} \ seconds \ \approx \ 3.67 \ trillion \ years $$
Geez! 105 bits of password entropy seems relatively safe. Well, this all rests on the assumption that the attack is happening offline (as most server-side security measures block repeated failed password guesses) or against something locally encrypted, and that there isn’t a stronger password cracker out there now. Notice that the entropy itself doesn’t change with the attacker’s hardware: the same number of bits only buys more time when the guesses per second are lower.
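If you want to play with other entropies or guess rates, the equation above can be sketched like this. The names `years_to_exhaust` and `SECONDS_PER_YEAR` are mine, and I’m assuming a 365.25-day year. Keep in mind this is the worst case of trying every single password; on average an attacker would need about half as long.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumption: average year length


def years_to_exhaust(entropy_bits: float, guesses_per_second: float) -> float:
    """Years needed to try all 2^entropy passwords at a fixed guess rate."""
    seconds = 2 ** entropy_bits / guesses_per_second
    return seconds / SECONDS_PER_YEAR


# Same 350 billion guesses per second, different amounts of entropy.
for bits in (40, 64, 80, 105):
    print(bits, "bits:", years_to_exhaust(bits, 350e9), "years")
```

Every extra bit doubles the time, which is why the jump from 64 to 80 bits matters far more than it looks.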
So how much entropy is recommended for everyday use? Well, that depends on the application and its use. There seems to be a common consensus to use 80 bits of entropy for everyday use. Again, the stronger the better here. It should also go without saying that known weak passwords should be excluded from your password generation.
How Can I Get a Really High Entropy Password?
Use a big dictionary. Remember that the more symbols your dictionary has, the stronger your potential entropy.
One of the biggest dictionaries is Diceware, with 7776 short words (that’s $6^5$, one word for every possible outcome of five die rolls). With this method, you can generate a Diceware passphrase yourself: each word takes five rolls of a die, and the resulting five digits correspond to a word in Diceware’s dictionary (e.g., a roll of 16656 corresponds to “claw”). These words are short so that you can easily remember them. As the XKCD comic eloquently put it:

[Image: XKCD comic]
Let’s take a moment, though, to create a table of the number of symbols needed to achieve a given amount of entropy [1]:
| Desired entropy | Numbers (0-9) | Hexadecimal (0-9, A-F) | Case-insensitive Latin alphabet (a-z or A-Z) | Case-insensitive alphanumeric (a-z or A-Z, 0-9) | Case-sensitive Latin alphabet (a-z, A-Z) | Case-sensitive alphanumeric (a-z, A-Z, 0-9) | All ASCII printable characters (without space) | All extended ASCII printable characters | Diceware word list |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8 bits | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 |
| 32 bits | 10 | 8 | 7 | 7 | 6 | 6 | 5 | 5 | 3 |
| 40 bits | 13 | 10 | 9 | 8 | 8 | 7 | 7 | 6 | 4 |
| 64 bits | 20 | 16 | 14 | 13 | 12 | 11 | 10 | 9 | 5 |
| 80 bits | 25 | 20 | 18 | 16 | 15 | 14 | 13 | 11 | 7 |
| 96 bits | 29 | 24 | 21 | 19 | 17 | 17 | 15 | 13 | 8 |
| 128 bits | 39 | 32 | 28 | 25 | 23 | 22 | 20 | 17 | 10 |
| 160 bits | 49 | 40 | 35 | 31 | 29 | 27 | 25 | 21 | 13 |
| 192 bits | 58 | 48 | 41 | 38 | 34 | 33 | 30 | 25 | 15 |
| 224 bits | 68 | 56 | 48 | 44 | 40 | 38 | 35 | 29 | 18 |
| 256 bits | 78 | 64 | 55 | 50 | 45 | 43 | 39 | 33 | 20 |
Clearly, the winner here with the fewest symbols needed is Diceware (where each symbol is a whole word), but coming in second are the ASCII character sets, as one would expect.
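The entries in a table like this come from rearranging our entropy equation: divide the desired bits by $\log_2 N$ and round up. Here is a minimal sketch, with `symbols_needed` being a name of my own choosing:

```python
import math


def symbols_needed(desired_bits: float, alphabet_size: int) -> int:
    """Smallest L such that L * log2(N) >= desired_bits."""
    return math.ceil(desired_bits / math.log2(alphabet_size))


# Digits (N = 10), hexadecimal (N = 16), and Diceware words (N = 7776).
for bits in (64, 80, 128):
    print(bits, symbols_needed(bits, 10), symbols_needed(bits, 16), symbols_needed(bits, 7776))
```

For 80 bits this prints 25, 20, and 7, matching the digit, hexadecimal, and Diceware columns.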
N.B.: Measuring entropy this way relies on the string being randomly generated, and people are notoriously bad at generating random passwords. Our “randomness” will most likely come from things we use most often, like vowels. One analysis of over 3 million eight-character passwords found that the letter “e” was used over 1.5 million times, whereas the letter “f” was used only 250,000 times. Ideally, if characters were evenly distributed throughout these passwords, each character would be used only around 900,000 times. In short? Create your passwords with a random password generator, store them in a password manager, and don’t use commonly known passwords.
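If you’re curious what a random generator might look like under the hood, here is a rough sketch using Python’s `secrets` module, which draws from the operating system’s cryptographically secure random source. The function names and the 94-character `ALPHABET` (printable ASCII without the space) are my own choices, and the sketch does not enforce any “at least one capital letter” style rules.

```python
import secrets
import string

# Assumption: letters, digits, and punctuation, i.e. 94 printable ASCII characters.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 16, alphabet: str = ALPHABET) -> str:
    """Pick each character uniformly and independently from `alphabet`."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def diceware_key(rolls: int = 5) -> str:
    """Simulate die rolls and return the digits, e.g. '16656', which you
    would then look up in a Diceware word list (not included here)."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(rolls))


print(random_password())
print(diceware_key())
```

Because every symbol is chosen uniformly and independently, the entropy equation from earlier actually applies to the output, which is not true of a password you dream up yourself.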

[1] The table can be found at https://en.wikipedia.org/wiki/Password_strength#Random_passwords