As the world moves towards cloud-based storage and computing, storing our data on a PC hard disk is giving way to cloud-based storage providers. This includes our emails, social data, professional data, and financial data. Accessing this data requires authentication and, despite its various limitations, a username and password is still the standard way to authenticate [though OpenID is slowly becoming popular]. What is crucial in this case is how web services store users’ passwords.

The password is more important than data

  1. Access to the password implies access to the user’s data (but not vice versa), unless of course the web service enforces at least one other authentication factor.
  2. Password reuse is extremely common: if someone gets hold of a user’s password on a social networking site, it can be used to compromise that user’s data on (say) an email service.
  3. Unethical employees: an unethical employee who has (legitimate) access to user data [for technical reasons] might find it tough to read or manipulate that data in a corporate environment, since [due to network monitoring] the chances of getting caught are pretty high. If he is able to get the user’s password, he can access the user’s data from the comfort of home [and from behind anonymous proxies].

If a user is accessing a web service over SSL, only the user [or the user’s browser, to be precise] and the web service get to see the password. While spear phishing and DNS poisoning (with fake certificates) are popular ways for attackers to get the password from the user, password leaks resulting from attacks on servers are not that uncommon anymore. In particular, it is easy for an attacker to target small companies, harvest passwords [and reuse them to gain access to users’ data on other sites].

How web services store passwords
There are three primary ways for a web service to store passwords:

  1. Plaintext
    The password is stored “as-is”, so an attacker who has gained control of the server can read the passwords. An in-house encryption technique is no better than plaintext, since it has not been proven to be unbreakable.
  2. Encrypted
    The password can be recovered using a key whose access is presumably more limited than access to the encrypted data itself. This can still freak out tech-savvy people, and it is bad because any employee who gets to see the password even once can use it to access the data later on (even from an external network).
  3. One-way hash (best choice)
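    A one-way hash lets the service verify whether the user knows the password without storing the password itself. Computing the hash from the password is fast, while there is no known way to reverse it other than brute force, which is extremely slow for strong passwords. A properly chosen hash is the best choice. Note that fast general-purpose hashes such as MD5 and SHA-1 are cheap to brute-force on modern hardware, so a salted, deliberately slow scheme (e.g. bcrypt or PBKDF2) is a safer pick than a bare SHA-1 hash. Gawker used MD5, which can be cracked using brute force.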
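To make the one-way hash idea concrete, here is a minimal sketch in Python. It assumes PBKDF2-HMAC-SHA256 from the standard library as the hash function (my pick for illustration, not something any service mentioned here is known to use), and the names `hash_password`, `verify_password`, `ITERATIONS` and `SALT_BYTES` are hypothetical.

```python
import hashlib
import hmac
import os

# Illustrative parameters only; these are assumptions, not recommendations.
ITERATIONS = 100_000
SALT_BYTES = 16


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key). Only these two values are stored."""
    salt = os.urandom(SALT_BYTES)  # fresh random salt per user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the key from the candidate password and compare it."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(candidate, stored_key)
```

The service keeps only the salt and the derived key. At login it re-derives the key from whatever the user typed and compares the two, so the password itself never needs to sit in the database.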

Need for disclosure
Given that passwords are crucial, what we need is a self-disclosure policy under which web services tell us how they store our passwords. And yeah, this should be a one-line disclosure, not pages of terms of service which only lawyers understand. The disclosure should also state whether the password gets logged as part of regular network activity. Business Insider alleged that passwords saved as part of normal network logging have been misused in the past.
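On the logging point, one way a service could keep passwords out of its request logs is to scrub them before anything is written. The sketch below is only an assumption about how that might look: the `redact` helper and the field names in `SENSITIVE_FIELDS` are hypothetical, and a real service would need to cover its own parameter names.

```python
# Hypothetical names of form fields that may carry a password.
SENSITIVE_FIELDS = {"password", "passwd", "pwd"}


def redact(params: dict[str, str]) -> dict[str, str]:
    """Return a copy of the request parameters that is safe to log."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_FIELDS else value
        for name, value in params.items()
    }


# Example: log the redacted copy instead of the raw login form.
safe_to_log = redact({"user": "alice", "password": "hunter2"})
```

A service that does something like this could honestly put "passwords are never written to our logs" in its one-line disclosure.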