Most people use passwords many times a day. They’re the keys that unlock digital doors and give us access to our computers, our email, our data and sometimes even our money. As more and more things move online, passwords secure an ever-growing part of our lives. We’re told to add capital letters, numbers and punctuation to these passwords to make them more secure, but just what difference do these make? What does a really secure password look like?
In order to answer these questions, we’re going to turn into an attacker and look at the methods used to crack passwords. There are a few password-cracking tools available for Linux, but we’re going to use John The Ripper, because it’s open source and is in most distros’ repositories (usually, the package is just called john).
In order to use it, we need something to try to crack. We’ve created a file with a set of MD5-hashed passwords; they’re all real passwords that were stolen from a website and posted on the internet. MD5 is quite an old hashing method, and we’re using it because it should be relatively quick to crack on most hardware. To make matters easier, all the hashes use the same salt. Although we’ve chosen a setup that’s quick to crack, this same setup is quite common in organisations that don’t focus on security. You can download the file from here.
After downloading that file, you can try and crack the passwords with:
john md5s-short
The passwords in this file are all quite simple, and you should crack them all very quickly. Not all password hashes will surrender their secrets this easily.
When you run John The Ripper like this, it tries increasingly complex sequences until it finds the password. If there are complex passwords, it may continue running for months or years unless you press Ctrl+C to terminate it.
Once this has finished running you can see what passwords it found with:
john --show md5s-short
That’s the simplest way of cracking passwords, and you’ve just seen that it can be quite effective, so now let’s take a closer look at what just happened.
The speed at which John can crack hashes varies dramatically depending on the hashing algorithm. Slow algorithms (such as bcrypt) can be tens of thousands of times slower than quick ones like DES.
John The Ripper works by taking words from a dictionary, hashing them, and comparing these hashes with the ones you’re trying to crack. If the two hashes match, that’s the password you’re looking for. A crucial point in password cracking is how quickly you can perform these checks. You can see how fast john can run on your computer by entering:
john --test
This will benchmark a few different hashing algorithms and give their speeds in checks per second (c/s).
By default, John will run in single-threaded mode, but if you want to take full advantage of a multi-core machine, you can add the --fork=N option to the command, where N is the number of processes. Typically, this works best when N is the number of CPU cores you want to dedicate to the task.
In the previous example, you probably found John cracked most of the passwords very quickly. This is because they were all common passwords. Since John works by checking a dictionary of words, common passwords are very easy to find.
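The hash-and-compare loop described above can be sketched in a few lines of shell. This is only a toy, not how John is implemented: it assumes md5sum from GNU coreutils, works on a single unsalted MD5 hash, and crack_hash is a name we’ve made up for the illustration.

```shell
#!/bin/sh
# Toy dictionary attack on one unsalted MD5 hash (md5sum from GNU coreutils assumed).
# crack_hash is a hypothetical helper, not part of John The Ripper.

# Try every word in the word list file ($2) against the target hash ($1).
# Prints the matching word and succeeds, or fails if no word matches.
crack_hash() {
    while IFS= read -r word; do
        candidate=$(printf '%s' "$word" | md5sum | cut -d' ' -f1)
        if [ "$candidate" = "$1" ]; then
            printf '%s\n' "$word"
            return 0
        fi
    done < "$2"
    return 1
}

# A tiny word list of common passwords, written to a temporary file.
wordlist=$(mktemp)
printf '%s\n' 123456 letmein password qwerty > "$wordlist"

# The hash we want to crack: the MD5 of "password".
target=5f4dcc3b5aa765d61d8327deb882cf99

crack_hash "$target" "$wordlist"   # prints: password

rm -f "$wordlist"
```

A password that isn’t in the word list simply never matches, which is why the size and quality of the list matter so much.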
John comes with a word list that it uses by default. This is quite good, but to crack more secure passwords, you need a word list with more words. People who crack passwords regularly often build their own word lists over years, and they can come from many sources. General dictionaries are good places to start (which languages you pick will depend on your target demographic), but these don’t usually contain names, slang or other terms.
Crackers regularly steal passwords from organisations (often websites) and post them online. These password leaks may contain thousands or even millions of passwords, so these are a great source of extra words. Good word lists are often sold (such as https://crackstation.net/buy-crackstation-wordlist-password-cracking-dictionary.htm, which is pay-what-you-want). That list has about 1.5 billion words; even larger word lists are available, but usually for a fee.
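If you collect words from several sources, you’ll want to merge them into one list without duplicates. A minimal sketch, assuming plain text files with one word per line; merge_wordlists and the file names are placeholders we’ve invented for the example.

```shell
#!/bin/sh
# Merge several word lists into one sorted, duplicate-free list.
# merge_wordlists is a hypothetical helper; the file names are placeholders.
merge_wordlists() {
    # tr strips carriage returns left over from Windows-style line endings.
    # LC_ALL=C makes sort compare raw bytes, so the result doesn't depend on locale.
    cat "$@" | tr -d '\r' | LC_ALL=C sort -u
}

# Example with two small lists that overlap.
dir=$(mktemp -d)
printf '%s\n' password letmein dragon > "$dir/leak.txt"
printf '%s\n' dragon apple password > "$dir/dictionary.txt"

merge_wordlists "$dir/leak.txt" "$dir/dictionary.txt" > "$dir/combined.txt"
cat "$dir/combined.txt"
```

The combined file can then be passed to John in the same way as any other word list.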
With John, you can use a custom word list with the --wordlist=<filename> option. For example, to check passwords using your system’s dictionary, use:
rm ~/.john/john.pot
john --wordlist=/usr/share/dict/words md5s-short
This should work on most Debian-based systems, but on other distros, the words file may be in a different place. The first line deletes the file that contains the cracked passwords. If you don’t run this, it won’t bother trying to crack anything, as it already has all the passwords. The regular dictionary isn’t as good as John The Ripper’s dictionary, so this won’t get all the passwords.
Passwords present something of a computing conundrum. When people enter their password, the computer has to be able to check that they’ve entered the right password. At the same time though, it’s a bad idea to store passwords anywhere on the computer, since that would mean that any hacker or malware might be able to get the passwords file and then compromise every user account.
Hashing (AKA one-way encryption) is the solution to this problem. Hashing is a mathematical process that scrambles the password so that it’s impossible to unscramble it (hence one-way encryption).
When you set the password, the computer hashes it and stores the hash (but not the password). When you enter the password, the computer then hashes it and compares this hash to the stored hash. If they’re the same, then the computer assumes that the passwords are the same and therefore lets you log in.
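The store-then-compare flow can be sketched as follows. It’s for illustration only: it assumes md5sum from GNU coreutils and uses unsalted MD5 purely because it’s short to write, whereas a real system would use a salted, slow hash. The names hash_password and check_password are our own.

```shell
#!/bin/sh
# Illustrative only: unsalted MD5 via md5sum (GNU coreutils assumed).
# hash_password and check_password are hypothetical helper names.

# Print the MD5 hash of the first argument as 32 hex characters.
hash_password() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# Succeed if the attempted password ($2) hashes to the stored hash ($1).
check_password() {
    [ "$(hash_password "$2")" = "$1" ]
}

# "Setting" the password: only the hash is kept.
stored_hash=$(hash_password 'password')

# "Logging in": hash the attempt and compare it with the stored hash.
if check_password "$stored_hash" 'password'; then
    echo "correct password accepted"
fi
if ! check_password "$stored_hash" 'Password'; then
    echo "wrong password rejected"
fi
```

Note that the password itself is never written anywhere; the check only ever sees two hashes.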
There are a few things that make a good hashing algorithm. Obviously, it should be impossible to reverse (otherwise it’s not a hashing algorithm), but other than this, it should minimise the number of collisions. This is where two different things produce the same hash, and the computer would therefore accept both as valid. It was a collision in the MD5 hashing algorithm that allowed the Flame malware to infiltrate the Iranian Oil Ministry and many other government organisations in the Middle East.
Another important thing about good hashing algorithms is that they’re slow. That might sound a little odd, since generally algorithms are designed to be fast, but the slower a hash is, the harder it is to crack. For normal use, it doesn’t make much difference if the hash takes 0.000001 seconds or 0.001 seconds, but the latter takes 1,000 times longer to crack.
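To put numbers on that, here’s a quick back-of-the-envelope calculation. It assumes an attacker hashing a 1.5-billion-word list one candidate at a time with the two timings above; real crackers run in parallel, so treat the absolute figures as illustrative and look at the ratio.

```shell
#!/bin/sh
# Time to hash every word in a list, for a fast and a slow hash.
# Times are in microseconds so that plain integer arithmetic is enough.
words=1500000000          # 1.5 billion candidate words
fast_us=1                 # 0.000001 seconds per hash
slow_us=1000              # 0.001 seconds per hash

fast_seconds=$(( words * fast_us / 1000000 ))
slow_seconds=$(( words * slow_us / 1000000 ))

echo "fast hash: $fast_seconds seconds (about $(( fast_seconds / 60 )) minutes)"
echo "slow hash: $slow_seconds seconds (about $(( slow_seconds / 86400 )) days)"
```

The fast hash falls in 25 minutes, while the slow one holds out for about 17 days against exactly the same word list.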
You can get a reasonable idea of how fast or slow an algorithm is by running john --test to benchmark the different algorithms on your computer. The fewer checks per second, the slower it will be for an attacker to break any hashes using that algorithm.
Secure services often place rules on what passwords are allowed. For example, they might insist on upper and lower case letters as well as numbers or punctuation. In general, people won’t add these randomly, but put them in words in specific ways. For example, they might add a number to the end of a word, or replace letters in a word with punctuation that looks similar (such as a with @).
John The Ripper provides the tools to mangle words in this way, so that we can check these combinations from a normal word list.
For this example, we’ll use the password file from www.linuxvoice.com/passwords, which contains the passwords: password, Password, PASSWORD, password1, p@ssword, P@ssword, Pa55w0rd, p@55w0rd. First, create a new text file called passwordlist containing just:
password
This will be the dictionary, and we’ll create rules that crack all the passwords based on this one root word.
Rules are specified in the john.conf file. By default, john uses the configuration files in ~/.john, so you’ll need to create that file in a text editor. We’ll start by adding the lines:
[List.Rules:Wordlist]
:
c
The first line tells john what mode you want to use the rules for, and every line below that is a rule (we’ll add more in a minute). The : just tells John to try the word as it is, no alterations, while c stands for capitalise, which makes the first character of the word upper case. You can try this out with:
john passwords.md5 --wordlist=passwordlist --rules
You should now crack two of the passwords despite there only being one word in the dictionary. Let’s try and get a few more now. Add the following to the config file:
u
$[0-9]
The first line here makes the whole word upper case. On the second line, the $ symbol means append the following character to the password. In this case, it’s not a single character, but a class of characters (digits), so it tries ten different words (password0, password1… password9).
To get the remaining passwords, you need to add the following rules to the config file:
sa@
csa@
css5so0
sa@ss5so0
The rule s<character1><character2> replaces all occurrences of character1 with character2. In the above rules, this is used to switch a for @ (sa@), o for 0 (so0) and s for 5 (ss5). All of these are combination rules that build up the final word through more than one alteration.
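If you want to see what candidate words a set of rules produces, it can help to imitate them by hand. The sketch below mimics the c, u, $[0-9] and s commands with standard shell tools; the function names are our own invention, and this is only a rough stand-in for John’s rule engine, which is far faster.

```shell
#!/bin/sh
# Rough shell equivalents of a few John rule commands, for illustration only.
# The function names are hypothetical and not part of John The Ripper.

# c: make the first character upper case.
rule_c() {
    first=$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '%s%s\n' "$first" "$rest"
}

# u: convert the whole word to upper case.
rule_u() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# sXY: replace every occurrence of character X ($2) with character Y ($3).
rule_s() {
    printf '%s\n' "$1" | tr "$2" "$3"
}

word=password
rule_c "$word"                                       # Password
rule_u "$word"                                       # PASSWORD
for digit in 0 1 2 3 4 5 6 7 8 9; do                 # $[0-9]: append a digit
    printf '%s%s\n' "$word" "$digit"
done
rule_s "$word" a @                                   # p@ssword
rule_s "$(rule_c "$word")" a @                       # P@ssword
rule_s "$(rule_s "$(rule_c "$word")" s 5)" o 0       # Pa55w0rd
rule_s "$(rule_s "$(rule_s "$word" a @)" s 5)" o 0   # p@55w0rd
```

John can also print the candidates itself: adding the --stdout option to a --wordlist and --rules run makes it output the mangled words instead of hashing them.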
The faster your computer can hash passwords, the more you can try in a given amount of time, and therefore the better chance you have of cracking the password. In this article, we’ve used John The Ripper because it’s an open source tool that’s available on almost all Linux platforms. However, it’s not always the best option. John runs on the CPU, but password hashing can be run really efficiently on graphics cards.
Hashcat is a password-cracking program that runs on graphics cards, and on the right hardware it can perform much better than John. Specialised password-cracking computers usually have several high-performance GPUs and rely on these for their speed.
You probably won’t find Hashcat in your distro’s repositories, but you can download it from www.hashcat.net (it’s free as in zero cost, but not free as in free software). It comes in two flavours: ocl-Hashcat for OpenCL cards (AMD), and cuda-Hashcat for Nvidia cards.
Raw performance, of course, means very little without finesse: fancy hardware and GPU crackers won’t get you far if you don’t have a good set of words and rules.
A text-menu driven tool for creating John The Ripper config files is available from this page.
Limitations of cracking rules
The language for creating rules isn’t very expressive. For example, you can’t say: ‘try every combination of the following rules’. The reason for that is speed. The rules engine has to be able to run thousands or even millions of times per second while not significantly slowing down the hashing.
You’ve probably guessed by now that creating a good set of rules is quite a time-consuming process. It involves a detailed knowledge of what patterns are commonly used to create passwords, and an understanding of the archaic syntax used in the rules engines. It’s good to have an understanding of how they work, but unless you’re a professional penetration tester, it’s usually best to use a pre-created rule list.
The default rules with John are quite good, but there are some more complex ones available. One of the best public ones comes from a DefCon contest in 2010. You can grab the ruleset from the website: http://contest-2010.korelogic.com/rules.html.
You’ll get a file called rules.txt, which is a John The Ripper configuration file, and there are some usage examples on the above website. However, it’s not designed to work with the default version of John The Ripper, but a patched version (sometimes called -jumbo). This isn’t usually available in distro repositories, but it can be worth compiling it because it has more features than the default build. To get it, you’ll need to clone it from GitHub with:
git clone https://github.com/magnumripper/JohnTheRipper
There are a few options in the install procedure, and these are documented in JohnTheRipper/doc/INSTALL. We compiled it on an Ubuntu 14.04 system by running the following in the JohnTheRipper/src directory:
./configure && make -s clean && make -sj4
This will leave the binary JohnTheRipper/run/john that you can execute. It will expect the john.conf file (which can be the file downloaded from KoreLogic) in the same directory.
If you don’t want to compile the -jumbo version of John, you can still use the rules from KoreLogic; you’ll just have to integrate them into a john.conf file by hand first. There are a lot of rules, so you’ll probably want to pick out a few and copy them into the john.conf file in the same way you did when creating the rules earlier, omitting the lines with square brackets.
As you’ve seen, cracking passwords is part art and part science. Although it’s often thought of as a malicious practice, it has some real benefits. For example, if you run an organisation, you can use cracking tools like John to audit the passwords people have chosen. If they can be cracked, then it’s time to talk to people about computer security. Some companies run periodic checks and offer a small reward for any employee whose password isn’t cracked. Obviously, all of these should be done with appropriate authorisation, and you should never use a password cracker to attack someone else’s password except when you have explicit permission.
John The Ripper is an incredibly powerful tool whose functionality we’ve only just touched on. Unfortunately, its more powerful features (such as its rule engine) aren’t well documented. If you’re interested in learning more about it, the best way of doing this is by generating hashes and seeing how to crack them. It’s easy to generate hashes by simply creating new users in your Linux system and giving them a password; then you can copy the /etc/shadow file to your home directory and change the owner with:
sudo cp /etc/shadow ~
sudo chown <username> ~/shadow
Where <username> is your username. You can then run John on the shadow file. If you’ve got a friend who’s interested in cracking as well, you could create challenges for each other (remember to delete the lines for real users from the shadow file though!). Alternatively, you can try our shadow file for the latest in our illustrious series of competitions.
So, what does a secure password look like? Well, it shouldn’t be based on a dictionary word. As you’ve seen, word mangling rules can find these even if you’ve obscured them with numbers or punctuation. It should also be long enough to make brute force attacks impractical (at least 10 characters). Beyond that, it’s best to use your own method, because any method that becomes popular can be exploited by attackers to create better word lists and rules.
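A rough calculation shows why length matters so much for brute force. The number of possible passwords is the alphabet size raised to the power of the length. The guessing rate below is purely an assumption for the sake of the sum (real rates vary enormously with the hash and the hardware), and keyspace is a helper name we’ve made up.

```shell
#!/bin/sh
# Size of the brute-force search space: alphabet_size ^ length.
# keyspace is a hypothetical helper; results must fit in a 64-bit integer.
keyspace() {
    total=1
    i=0
    while [ "$i" -lt "$2" ]; do
        total=$(( total * $1 ))
        i=$(( i + 1 ))
    done
    echo "$total"
}

rate=1000000000   # ASSUMED guesses per second; not a measured figure

lower8=$(keyspace 26 8)     # 8 lower-case letters
mixed10=$(keyspace 62 10)   # 10 characters from upper case, lower case and digits

echo "8 lower-case letters: $lower8 combinations, $(( lower8 / rate )) seconds to try them all"
echo "10 mixed characters: $mixed10 combinations, $(( mixed10 / rate / 31536000 )) years to try them all"
```

Under that assumed rate, the short password lasts 208 seconds and the longer one about 26 years, and each extra character multiplies the work again.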