You Are (Probably) Doing Login Systems Wrong

Fri, May 12, 2017

Login systems are something most programmers have tried to build at least once. Despite seeming like a simple task, they are in fact very hard to get right.

So, let's look into how we can actually do this right.

Storing passwords

Okay, this is common knowledge: Salt and hash your passwords.

However, it is often done wrong. You'll see code like:

hash(password + salt)

This is better than unsalted, unhashed passwords, but it's far from brute-force resistant. Why? Because hashing is cheap. With proper machines, you can compute billions of hashes per second. A dictionary attack is a piece of cake.

So how do we solve this? Well, we use a KDF. A KDF (key derivation function) acts like a slow hash function: one where calculating a single value takes maybe 100 ms or more.

In general, two kinds of KDFs exist: the CPU-based and the CPU-memory hybrids. The CPU-based ones are still in use, but I don't recommend using them, as they can easily be calculated with ASICs. The CPU-memory hybrids require some amount of memory for calculating the hash value, often making it substantially harder to create ASICs.

For the reasons stated above, I recommend scrypt as a modern, well-known, and secure KDF.
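
As a rough sketch of what this can look like, here is scrypt used through Python's standard hashlib module. The cost parameters (n, r, p), the 16-byte salt, and the function names hash_password and verify_password are illustrative assumptions; tune the cost so that one hash takes around 100 ms on your own hardware.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions, not recommendations).
SCRYPT_N = 2**14            # CPU/memory cost, must be a power of two
SCRYPT_R = 8                # block size
SCRYPT_P = 1                # parallelism
MAXMEM = 64 * 1024 * 1024   # upper bound on the memory scrypt may use


def _derive(password: str, salt: bytes) -> bytes:
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        maxmem=MAXMEM,
        dklen=32,
    )


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, key); store both next to the user record."""
    salt = os.urandom(16)  # fresh random salt for every password
    return salt, _derive(password, salt)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the key and compare it in constant time."""
    return hmac.compare_digest(_derive(password, salt), expected)
```

With these particular parameters scrypt needs about 16 MiB per hash, which is the property that makes dedicated hardware expensive.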

Sessions

In general, sessions should be assigned a token by the server. This token is shared with the client (e.g. through a cookie) as the way to prove to the server that they are logged in with a given account.

There are a few things to keep in mind, though.

The session token shall have an expiration date, for security reasons. Furthermore, the session token shall be deactivated when the user logs out.

One mistake often made is to store the session tokens in memory. This is wrong, as it means that a crash or restart logs every user out, ending all sessions. Instead, they ought to be stored in a database, possibly in a column of the users table.

Lastly, the session token must not be shared through GET parameters or other logged means. Rather, it should be stored in a cookie or localStorage.
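
The points above can be sketched roughly as follows. This is a minimal example, assuming SQLite as the database, a separate sessions table rather than a column of the users table, and a one-week lifetime; the function names are made up for illustration.

```python
import secrets
import sqlite3
import time

SESSION_TTL = 7 * 24 * 3600  # assumed lifetime: one week, in seconds


def open_store(path: str) -> sqlite3.Connection:
    """Open the persistent session store, creating the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        " token TEXT PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " expires_at REAL NOT NULL)"
    )
    return conn


def create_session(conn, user_id, now=None):
    """Issue a new random token for user_id and persist it."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
    conn.execute(
        "INSERT INTO sessions VALUES (?, ?, ?)",
        (token, user_id, now + SESSION_TTL),
    )
    conn.commit()
    return token


def lookup_session(conn, token, now=None):
    """Return the user id for a live token, or None if unknown or expired."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT user_id FROM sessions WHERE token = ? AND expires_at > ?",
        (token, now),
    ).fetchone()
    return row[0] if row else None


def end_session(conn, token):
    """Deactivate the token, e.g. when the user logs out."""
    conn.execute("DELETE FROM sessions WHERE token = ?", (token,))
    conn.commit()
```

Something like open_store("sessions.db") keeps sessions across restarts; the token returned by create_session is what would go into the cookie.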

Client-side hashing

A pretty uncommon, but really good practice, is client-side hashing. It is supposed to act as another layer of security, hiding the password from the server.

The idea is that the client side should hash the password (e.g. with scrypt) before sending it to the server.

This can seem pretty pointless, as (in the case of the web) the server could simply change the JavaScript to leak the password, but there is a reason: If the server side has a bug that allows an attacker to read certain chosen memory locations (a buffer over-read, for example), it could be exploited to read the plaintext password.

Instead, with client-side hashing, it can only read the hashed value. This of course doesn't stop the hacker from logging in to the user's account, but it stops them from obtaining the potentially reused plain-text password.
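
To make the idea concrete, here is a small sketch of what the client would compute. On the web this would be JavaScript; Python is used here only to show the logic. The function name, the site string, and the way the salt is derived from public values are all assumptions of this example.

```python
import hashlib


def client_side_hash(username: str, password: str, site: str = "example.com") -> str:
    """Value the client sends to the server instead of the raw password."""
    # Before login the client has no stored random salt, so this sketch
    # derives a deterministic per-user salt from public values (assumption).
    salt = hashlib.sha256(f"{site}:{username}".encode("utf-8")).digest()
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )
    return key.hex()
```

The server should treat this value exactly like a password, that is, still run it through its own salted KDF before storing it; otherwise a leaked database would let an attacker log in directly.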

Rate limiting

Rate limiting is really, really important, even though it is often underrated. It prevents someone from brute-forcing common passwords.

But how should the rate limiting work?

One neat way is the "leaky bucket algorithm". It works by having requests "dripping" into a bucket like drops of water. When the leaky bucket is filled, no more requests can be made until the bucket has leaked to empty.

In less figurative language, you have a counter for every visitor IP address, which is incremented on every request (e.g. login, create account, etc.). When this counter is above some level (say 5), a timeout is set (e.g. the time at which it expires is stored alongside the counter). Only when this timeout expires does the counter reset, and new requests can be made.
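
The counter-and-timeout scheme just described could look roughly like this. It is a sketch under assumptions: the class name, the five-minute timeout, and the in-process dictionaries are all invented for the example, and the clock is injectable only to make it easy to test.

```python
import time


class LoginRateLimiter:
    """Per-IP request counter with a timeout once the limit is exceeded."""

    def __init__(self, limit=5, timeout=300.0, clock=time.monotonic):
        self.limit = limit            # requests allowed before blocking
        self.timeout = timeout        # seconds an IP stays blocked
        self.clock = clock
        self.counters = {}            # ip -> number of requests seen
        self.blocked_until = {}       # ip -> time at which the block ends

    def allow(self, ip: str) -> bool:
        now = self.clock()
        expiry = self.blocked_until.get(ip)
        if expiry is not None:
            if now < expiry:
                return False          # still inside the timeout
            # The timeout has expired: reset the counter.
            del self.blocked_until[ip]
            self.counters[ip] = 0
        count = self.counters.get(ip, 0) + 1
        self.counters[ip] = count
        if count > self.limit:
            self.blocked_until[ip] = now + self.timeout
            return False
        return True
```

Call allow(ip) before handling a login or sign-up request and reject the request when it returns False. Note that in this simple form the counter only resets after a timeout; a stricter leaky bucket would also let the counter drain gradually over time.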

Resetting accounts

Of course a proper system must have the ability to reset accounts. There are many ways of doing this.

I would recommend having the user provide username and E-mail, as it, contrary to other approaches, does not allow spamming or denial of service attacks. It is important that you do not reveal whether or not the E-mail matched, as that can be a breach of the user's privacy (the form could otherwise be used to check which E-mail address belongs to an account).

If the E-mail is only going to be used for resetting, I strongly recommend that you store a fingerprint (e.g. scrypt) of the E-mail address rather than the plaintext version. This still ensures that you can check if the given E-mail address of the resetting user matches, without the server side knowing anything about the address. It prevents the database being misused or sold for spam purposes, and also helps to protect the user's identity.
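
A fingerprint check of this kind might be sketched as below. The function names, the per-user random salt, and the normalization step (trimming and lowercasing the address) are assumptions of this example, not something the scheme requires.

```python
import hashlib
import hmac
import os


def _normalize(email: str) -> str:
    # Assumption: treat addresses case-insensitively and ignore stray spaces.
    return email.strip().lower()


def fingerprint_email(email: str, salt: bytes = None) -> tuple:
    """Return (salt, fingerprint); store both instead of the address."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.scrypt(
        _normalize(email).encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )
    return salt, digest


def email_matches(candidate: str, salt: bytes, stored: bytes) -> bool:
    """Check the address typed into the reset form against the fingerprint."""
    _, digest = fingerprint_email(candidate, salt)
    return hmac.compare_digest(digest, stored)
```

Since the username is provided too, the server can look up that user's salt and fingerprint, and if email_matches returns True, send the reset mail to the address the user just typed in.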

When the user resets, a token shall be generated. This token is used in a link, sent to the user through an E-mail. It is important that this token is sufficiently long, random, and furthermore, that it expires (ideally within only a few hours).

Note that upon resetting, the token should be removed from the map of pending tokens, so that it cannot be used twice.
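
Here is a minimal sketch of such a token map, with expiry and single use. The class name, the two-hour lifetime, and the plain dictionary standing in for the map are assumptions; a real system would likely keep the pending tokens in the database.

```python
import secrets
import time

RESET_TTL = 2 * 3600  # assumed lifetime: two hours, in seconds


class ResetTokens:
    """Map of pending reset tokens to (username, expiry time)."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.pending = {}  # token -> (username, expires_at)

    def issue(self, username: str) -> str:
        """Create a long random token to embed in the reset link."""
        token = secrets.token_urlsafe(32)
        self.pending[token] = (username, self.clock() + RESET_TTL)
        return token

    def redeem(self, token: str):
        """Return the username for a valid token, or None.

        The token is removed on the first attempt, so it works only once.
        """
        entry = self.pending.pop(token, None)
        if entry is None:
            return None
        username, expires_at = entry
        if self.clock() >= expires_at:
            return None  # expired
        return username
```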

Other tips

Consider implementing a common two-factor key-sharing algorithm, such as TOTP.
I recommend that changing E-mail and password is done through the same flow as "reset password", such that E-mail confirmation is required.
It is extremely important that the connection over which the login happens is secure, for example HTTPS.
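
For the TOTP tip above, the core of the algorithm (RFC 6238, with HMAC-SHA1 and 30-second steps) is short enough to sketch. The function names and the one-step tolerance window in verify_totp are assumptions of this example; for production, an audited library is probably the safer choice.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at=None, step=30, digits=6) -> str:
    """Compute the time-based one-time password for the given moment."""
    at = time.time() if at is None else at
    counter = int(at) // step                       # number of elapsed steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                         # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at=None, step=30, window=1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    at = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret, at + i * step, step), submitted)
        for i in range(-window, window + 1)
    )
```

With the RFC 6238 test secret b"12345678901234567890" and at=59, totp(..., digits=8) gives "94287082", matching the published test vector.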

Conclusion

Well, that's it.

You have to be careful about what you do, though. This is a minefield of vulnerabilities, and you have to be very careful not to introduce subtle bugs in your code. I recommend that you let somebody other than yourself review your code, to give you another's perspective on the code.