HOME > RSS > TECHNOLOGY > Coding Horror

R S S : Coding Horror


PageRank : 8 %

VoteRank :
(0 - 0 vote)





tagsTags: , , , , , , , , , , , , ,


English

RSS FEED READER



Hacker, Hack Thyself

2 June, by Jeff Atwood[ —]

We've read so many sad stories about communities that were fatally compromised or destroyed due to security exploits. We took that lesson to heart when we founded the Discourse project; we endeavor to build open source software that is secure and safe for communities by default, even if there are thousands, or millions, of them out there.

However, we also value portability, the ability to get your data into and out of Discourse at will. This is why Discourse, unlike other forum software, defaults to a Creative Commons license. As a basic user on any Discourse you can easily export and download all your posts right from your user page.

Discourse Download All Posts

As a site owner, you can easily back up and restore your entire site database from the admin panel, right in your web browser. Automated weekly backups are set up for you out of the box, too. I'm not the world's foremost expert on backups for nothing, man!

Discourse database backup download

Over the years, we've learned that balancing security and data portability can be tricky. You bet your sweet ASCII a full database download is what hackers start working toward the minute they gain any kind of foothold in your system. It's the ultimate prize.

To mitigate this threat, we've slowly tightened restrictions around Discourse backups in various ways:

  • Administrators have a minimum password length of 15 characters.

  • Both backup creation and backup download administrator actions are formally logged.

  • Backup download tokens are single use and emailed to the address of the administrator, to confirm that user has full control over the email address.

The name of the security game is defense in depth, so all these hardening steps help … but we still need to assume that Internet Bad Guys will somehow get a copy of your database. And then what? Well, what's in the database?

  • Identity cookies

    Cookies are, of course, how the browser can tell who you are. Cookies are usually stored as hashes, rather than the actual cookie value, so having the hash doesn't let you impersonate the target user. Furthermore, most modern web frameworks rapidly cycle cookies, so they are only valid for a brief 10 to 15 minute window anyway.

  • Email addresses

    Although users have reason to be concerned about their emails being exposed, very few people treat their email address as anything particularly precious these days.

  • All posts and topic content

    Let's assume for the sake of argument that this is a fully public site and nobody was posting anything particularly sensitive there. So we're not worried, at least for now, about trade secrets or other privileged information being revealed, since they were all public posts anyway. If we were, that's a whole other blog post I can write at a later date.

  • Password hashes

    What's left is the password hashes. And that's … a serious problem indeed.

Now that the attacker has your database, they can crack your password hashes with large scale offline attacks, using the full resources of any cloud they can afford. And once they've cracked a particular password hash, they can log in as that user … forever. Or at least until that user changes their password.

⚠️ That's why, if you know (or even suspect!) your database was exposed, the very first thing you should do is reset everyone's password.

Discourse database password hashes

But what if you don't know? Should you preemptively reset everyone's password every 30 days, like the world's worst bigco IT departments? That's downright user hostile, and leads to serious pathologies of its own. The reality is that you probably won't know when your database has been exposed, at least not until it's too late to do anything about it. So it's crucial to slow the attackers down, to give yourself time to deal with it and respond.

Thus, the only real protection you can offer your users is just how resistant to attack your stored password hashes are. There are two factors that go into password hash strength:

  1. The hashing algorithm. As slow as possible, and ideally designed to be especially slow on GPUs for reasons that will become painfully obvious about 5 paragraphs from now.

  2. The work factor or number of iterations. Set this as high as possible, without opening yourself up to a possible denial of service attack.

I've seen guidance that said you should set the overall work factor high enough that hashing a password takes at least 8ms on the target platform. It turns out Sam Saffron, one of my Discourse co-founders, made a good call back in 2013 when he selected the NIST recommendation of PBKDF2-HMAC-SHA256 and 64k iterations. We measured, and that indeed takes roughly 8ms using our existing Ruby login code on our current (fairly high end, Skylake 4.0 Ghz) servers.

But that was 4 years ago. Exactly how secure are our password hashes in the database today? Or 4 years from now, or 10 years from now? We're building open source software for the long haul, and we need to be sure we are making reasonable decisions that protect everyone. So in the spirit of designing for evil, it's time to put on our Darth Helmet and play the bad guy – let's crack our own hashes!

We're gonna use the biggest, baddest single GPU out there at the moment, the GTX 1080 Ti. As a point of reference, for PBKDF2-HMAC-SHA256 the 1080 achieves 1180 kH/s, whereas the 1080 Ti achieves 1640 kH/s. In a single video card generation the attack hash rate has increased nearly 40 percent. Ponder that.

First, a tiny hello world test to see if things are working. I downloaded hashcat. I logged into our demo at try.discourse.org and created a new account with the password 0234567890; I checked the database, and this generated the following values in the hash and salt database columns for that new user:

hash
93LlpbKZKficWfV9jjQNOSp39MT0pDPtYx7/gBLl5jw=
salt
ZWVhZWQ4YjZmODU4Mzc0M2E2ZDRlNjBkNjY3YzE2ODA=

Hashcat requires the following input file format: one line per hash, with the hash type, number of iterations, salt and hash (base64 encoded) separated by colons:

type   iter  salt                                         hash
sha256:64000:ZWVhZWQ4YjZmODU4Mzc0M2E2ZDRlNjBkNjY3YzE2ODA=:93LlpbKZKficWfV9jjQNOSp39MT0pDPtYx7/gBLl5jw=

Let's hashcat it up and see if it works:

./h64 -a 3 -m 10900 .\one-hash.txt 0234567?d?d?d

Note that this is an intentionally tiny amount of work, it's only guessing three digits. And sure enough, we cracked it fast! See the password there on the end? We got it.

sha256:64000:ZWVhZWQ4YjZmODU4Mzc0M2E2ZDRlNjBkNjY3YzE2ODA=:93LlpbKZKficWfV9jjQNOSp39MT0pDPtYx7/gBLl5jw=:0234567890

Now that we know it works, let's get down to business. But we'll start easy. How long does it take to brute force attack the easiest possible Discourse password, 8 numbers – that's "only" 108 combinations, a little over one hundred million.

Hash.Type........: PBKDF2-HMAC-SHA256
Time.Estimated...: Fri Jun 02 00:15:37 2017 (1 hour, 0 mins)
Guess.Mask.......: ?d?d?d?d?d?d?d?d [8]

Even with a top of the line GPU that's … OK, I guess. Remember this is just one hash we're testing against, so you'd need one hour per row (user) in the table. And I have more bad news for you: Discourse hasn't allowed 8 character passwords for quite some time now. How long does it take if we try longer numeric passwords?

?d?d?d?d?d?d?d?d?d [9]
Fri Jun 02 10:34:42 2017 (11 hours, 18 mins)

?d?d?d?d?d?d?d?d?d?d [10]
Tue Jun 06 17:25:19 2017 (4 days, 18 hours)

?d?d?d?d?d?d?d?d?d?d?d [11]
Mon Jul 17 23:26:06 2017 (46 days, 0 hours)

?d?d?d?d?d?d?d?d?d?d?d?d [12]
Tue Jul 31 23:58:30 2018 (1 year, 60 days)

But all digit passwords are easy mode, for babies! How about some real passwords that use at least lowercase letters, or lowercase + uppercase + digits?

Guess.Mask.......: ?l?l?l?l?l?l?l?l [8]
Time.Estimated...: Mon Sep 04 10:06:00 2017 (94 days, 10 hours)

Guess.Mask.......: ?1?1?1?1?1?1?1?1 [8] (-1 = ?l?u?d)
Time.Estimated...: Sun Aug 02 09:29:48 2020 (3 years, 61 days)

A brute force try-every-single-letter-and-number attack is not looking so hot for us at this point, even with a high end GPU. But what if we divided the number by eightby putting eight video cards in a single machine? That's well within the reach of a small business budget or a wealthy individual. Unfortunately, dividing 38 months by 8 isn't such a dramatic reduction in the time to attack. Instead, let's talk about nation state attacks where they have the budget to throw thousands of these GPUs at the problem (1.1 days), maybe even tens of thousands (2.7 hours), then … yes. Even allowing for 10 character password minimums, you are in serious trouble at that point.

If we want Discourse to be nation state attack resistant, clearly we'll need to do better. Hashcat has a handy benchmark mode, and here's a sorted list of the strongest (slowest) hashes that Hashcat knows about benchmarked on a rig with 8 Nvidia GTX 1080 GPUs. Of the things I recognize on that list, bcrypt, scrypt and PBKDF2-HMAC-SHA512 stand out.

My quick hashcat results gave me some confidence that we weren't doing anything terribly wrong with the Discourse password hashes stored in the database. But I wanted to be completely sure, so I hired someone with a background in security and penetration testing to, under a signed NDA, try cracking the password hashes of two live and very popular Discourse sites we currently host.

I was provided two sets of password hashes from two different Discourse communities, containing 5,909 and 6,088 hashes respectively. Both used the PBKDF2-HMAC-SHA256 algorithm with a work factor of 64k. Using hashcat, my Nvidia GTX 1080 Ti GPU generated these hashes at a rate of ~27,000/sec.

Common to all discourse communities are various password requirements:

  • All users must have a minimum password length of 10 characters.
  • All administrators must have a minimum password length of 15 characters.
  • Users cannot use any password matching a blacklist of the 10,000 most commonly used passwords.
  • Users can choose to create a username and password or use various third party authentication mechanisms (Google, Facebook, Twitter, etc). If this option is selected, a secure random 32 character password is autogenerated. It is not possible to know whether any given password is human entered, or autogenerated.

Using common password lists and masks, I cracked 39 of the 11,997 hashes in about three weeks, 25 from the ████████ community and 14 from the ████████ community.

This is a security researcher who commonly runs these kinds of audits, so all of the attacks used wordlists, along with known effective patterns and masks derived from the researcher's previous password cracking experience, instead of raw brute force. That recovered the following passwords (and one duplicate):

007007bond
123password
1qaz2wsx3e
A3eilm2s2y
Alexander12
alexander18
belladonna2
Charlie123
Chocolate1
christopher8
Elizabeth1
Enterprise01
Freedom123
greengrass123
hellothere01
I123456789
Iamawesome
khristopher
l1ghthouse
l3tm3innow
Neversaynever
password1235
pittsburgh1
Playstation2
Playstation3
Qwerty1234
Qwertyuiop1
qwertyuiop1234567890
Spartan117
springfield0
Starcraft2
strawberry1
Summertime
Testing123
testing1234
thecakeisalie02
Thirteen13
Welcome123

If we multiply this effort by 8, and double the amount of time allowed, it's conceivable that a very motivated attacker, or one with a sophisticated set of wordlists and masks, could eventually recover 39 × 16 = 624 passwords, or about five percent of the total users. That's reasonable, but higher than I would like. We absolutely plan to add a hash type table in future versions of Discourse, so we can switch to an even more secure (read: much slower) password hashing scheme in the next year or two.

bcrypt $2*$, Blowfish (Unix)
  20273 H/s

scrypt
  886.5 kH/s

PBKDF2-HMAC-SHA512
  542.6 kH/s 

PBKDF2-HMAC-SHA256
 1646.7 kH/s 

After this exercise, I now have a much deeper understanding of our worst case security scenario, a database compromise combined with a professional offline password hashing attack. I can also more confidently recommend and stand behind our engineering work in making Discourse secure for everyone. So if, like me, you're not entirely sure you are doing things securely, it's time to put those assumptions to the test. Don't wait around for hackers to attack you — hacker, hack thyself!

[advertisement] At Stack Overflow, we put developers first. We already help you find answers to your tough coding questions; now let us help you find your next job.

Thunderbolting Your Video Card

24 March, by Jeff Atwood[ —]

When I wrote about The Golden Age of x86 Gaming, I implied that, in the future, it might be an interesting, albeit expensive, idea to upgrade your video card via an external Thunderbolt 3 enclosure.

I'm here to report that the future is now.

Yes, that's right, I paid $500 for an external Thunderbolt 3 enclosure to fit a $600 video card, all to enable a plug-in upgrade of a GPU on a Skull Canyon NUC that itself cost around $1000 fully built. I know, it sounds crazy, and … OK fine, I won't argue with you. It's crazy.

This matters mostly because of 4k, aka 2160p, aka 3840 × 2160, aka Ultra HD.

4k compared to 1080p

Plain old regular HD, aka 1080p, aka 1920 × 1080, is one quarter the size of 4k, and ¼ the work. By today's GPU standards HD is pretty much easy mode these days. It's not even interesting. No offense to console fans, or anything.

Late in 2016, I got a 4k OLED display and it … kind of blew my mind. I have never seen blacks so black, colors so vivid, on a display so thin. It made my previous 2008 era Panasonic plasma set look lame. It's so good that I'm now a little angry that every display that my eyes touch isn't OLED already. I even got into nerd fights over it, and to be honest, I'd still throw down for OLED. It is legitimately that good. Come at me, bro.

Don't believe me? Well, guess which display in the below picture is OLED? Go on, guess:

Guess which screen is OLED?

There's a reason every site that reviews TVs had to recalibrate their results when they reviewed the 2016 OLED sets.

In my extended review at Reference Home Theater, I call it “the best looking TV I’ve ever reviewed.” But we aren’t alone in loving the E6. Vincent Teoh at HDTVtest writes, “We’re not even going to qualify the following endorsement: if you can afford it, this is the TV to buy.” Rtings.com gave the E6 OLED the highest score of any TV the site has ever tested. Reviewed.com awarded it a 9.9 out of 10, with only the LG G6 OLED (which offers the same image but better styling and sound for $2,000 more) coming out ahead.

But I digress.

Playing games at 1080p in my living room was already possible. But now that I have an incredible 4k display in the living room, it's a whole other level of difficulty. Not just twice as hard – and remember current consoles barely manage to eke out 1080p at 30fps in most games – but four times as hard. That's where external GPU power comes in.

The cool technology underpinning all of this is Thunderbolt 3. The thunderbolt cable bundled with the Razer Core is rather … diminutive. There's a reason for this.

Is there a maximum cable length for Thunderbolt 3 technology?

Thunderbolt 3 passive cables have maximum lengths.

  • 0.5m TB 3 (40Gbps)
  • 1.0m TB 3 (20Gbps)
  • 2.0m TB 3 (20Gbps)

In the future we will offer active cables which will provide 40Gbps of bandwidth at longer lengths.

40Gbps is, for the record, an insane amount of bandwidth. Let's use our rule of thumb based on ultra common gigabit ethernet, that 1 gigabit = 120 megabytes/second, and we arrive at 4.8 gigabytes/second. Zow.

That's more than enough bandwidth to run even the highest of high end video cards, but it is not without overhead. There's a mild performance hit for running the card externally, on the order of 15%. There's also a further performance hit of 10% if you are in "loopback" mode on a laptop where you don't have an external display, so the video frames have to be shuttled back from the GPU to the internal laptop display.

This may look like a gamer-only thing, but surprisingly, it isn't. What you get is the general purpose ability to attach any PCI express card to any computer with a Thunderbolt 3 port and, for the most part, it just works!

Linus breaks it down and answers all your most difficult questions:

Please watch the above video closely if you're actually interested in this stuff; it is essential. I'll add some caveats of my own after working with the Razer Core for a while:

  • Make sure the video card you plan to put into the Razer Core is not too tall, or too wide. You can tell if a card is going to be too tall by looking at pictures of the mounting rear bracket. If the card extends significantly above the standard rear mounting bracket, it won't fit. If the card takes more than 2 slots in width, it also won't fit, but this is more rare. Depth (length) is rarely an issue.

  • There are four fans in the Razer Core and although it is reasonably quiet, it's not super silent or anything. You may want to mod the fans. The Razer Core is a remarkably simple device, internally, it's really just a power supply, some Thunderbolt 3 bridge logic, and a PCI express slot. I agree with Linus that the #1 area Razer could improve in the future, beyond generally getting the price down, is to use fewer and larger fans that run quieter.

  • If you're putting a heavy hitter GPU in the Razer Core, I'd try to avoid blower style cards (the ones that exhaust heat from the rear) in favor of those that cool with large fans blowing down and around the card. Dissipating 150w+ is no mean feat and you'll definitely need to keep the enclosure in open air … and of course within 0.5 meters of the computer it's connected to.

  • There is no visible external power switch on the Razer Core. It doesn't power on until you connect a TB3 cable to it. I was totally not expecting that. But once connected, it powers up and the Windows 10 Thunderbolt 3 drivers kick in and ask you to authorize the device, which I did (always authorize). Then it spun a bit, detected the new GPU, and suddenly I had multiple graphics card active on the same computer. I also installed the latest Nvidia drivers just to make sure everything was ship shape.

  • It's kinda ... weird having multiple GPUs simultaneously active. I wanted to make the Razer Core display the only display, but you can't really turn off the built in GPU – you can select "only use display 2", that's all. I got into several weird states where windows were opening on the other display and I had to mess around a fair bit to get things locked down to just one display. You may want to consider whether you have both "displays" connected for troubleshooting, or not.

And then, there I am, playing Lego Marvel in splitscreen co-op at glorious 3840 × 2160 UltraHD resolution on an amazing OLED display with my son. It is incredible.

Beyond the technical "because I could", I am wildly optimistic about the future of external Thunderbolt 3 expansion boxes, and here's why:

  • The main expense and bottleneck in any stonking gaming rig is, by far, the GPU. It's also the item you are most likely to need to replace a year or two from now.

  • The CPU and memory speeds available today are so comically fast that any device with a low-end i3-7100 for $120 will make zero difference in real world gaming at 1080p or higher … if you're OK with 30fps minimum. If you bump up to $200, you can get a quad-core i5-7500 that guarantees you 60fps minimum everywhere.

  • If you prefer a small system or a laptop, an external GPU makes it so much more flexible. Because CPU and memory speeds are already so fast, 99.9% of the time your bottleneck is the GPU, and almost any small device you can buy with a Thunderbolt 3 port can now magically transform into a potent gaming rig with a single plug. Thunderbolt 3 may be a bit cutting edge today, but more and more devices are shipping with Thunderbolt 3. Within a few years, I predict TB3 ports will be as common as USB3 ports.

  • A general purpose external PCI express enclosure will be usable for a very long time. My last seven video card upgrades were plug and play PCI Express cards that would have worked fine in any computer I've built in the last ten years.

  • External GPUs are not meaningfully bottlenecked by Thunderbolt 3 bandwidth; the impact is 15% to 25%, and perhaps even less over time as drivers and implementations mature. While Thunderbolt 3 has "only" PCI Express x4 bandwidth, many benchmarkers have noted that GPUs moving from PCI Express x16 to x8 has almost no effect on performance. And there's always Thunderbolt 4 on the horizon.

The future, as they say, is already here – it's just not evenly distributed.

I am painfully aware that costs need to come down. Way, way down. The $499 Razer Core is well made, on the vanguard of what's possible, a harbinger of the future, and fantastically enough, it does even more than what it says on the tin. But it's not exactly affordable.

I would absolutely love to see a modest, dedicated $200 external Thunderbolt 3 box that included an inexpensive current-gen GPU. This would clobber any onboard GPU on the planet. Let's compare my Skull Canyon NUC, which has Intel's fastest ever, PS4 class embedded GPU, with the modest $150 GeForce GTX 1050 Ti:

1920 × 1080 high detail
Bioshock Infinite15 → 79 fps
Rise of the Tomb Raider12 → 49 fps
Overwatch43 → 114 fps

As predicted, that's a 3x-5x stompdown. Mac users lamenting their general lack of upgradeability, hear me: this sort of box is exactly what you want and need. Imagine if Apple was to embrace upgrading their laptops and all-in-one systems via Thunderbolt 3.

I know, I know. It's a stretch. But a man can dream … of externally upgradeable GPUs. That are too expensive, sure, but they are here, right now, today. They'll only get cheaper over time.

[advertisement] Find a better job the Stack Overflow way - what you need when you need it, no spam, and no scams.

Password Rules Are Bullshit

10 March, by Jeff Atwood[ —]

Of the many, many, many bad things about passwords, you know what the worst is? Password rules.

Let this pledge be duly noted on the permanent record of the Internet. I don't know if there's an afterlife, but I'll be finding out soon enough, and I plan to go out mad as hell.

The world is absolutely awash in terrible password rules:

But I don't need to tell you this. The more likely you are to use a truly random password generation tool, like us über-geeks are supposed to, the more likely you have suffered mightily – and daily – under this regime.

Have you seen the classic XKCD about passwords?

To anyone who understands information theory and security and is in an infuriating argument with someone who does not (possibly involving mixed case), I sincerely apologize.

We can certainly debate whether "correct horse battery staple" is a viable password strategy or not, but the argument here is mostly that length matters.

That's What She Said

No, seriously, it does. I'll go so far as to say your password is too damn short. These days, given the state of cloud computing and GPU password hash cracking, any password of 8 characters or less is perilously close to no password at all.

So then perhaps we have one rule, that passwords must not be short. A long password is much more likely to be secure than a short one … right?

What about this four character password?

✅B a️

What about this eight character password?

正确马电池订书钉

Or this (hypothetical, but all too real) seven character password?

ش导พิ한✌︎

You may also be surprised, if you paste the above four Unicode emojis into your favorite login dialog (go ahead – try it), to discover that it … isn't in fact four characters.

Oh dear.

"*".length === 2

Our old pal Unicode strikes again.

As it turns out, even the simple rule that "your password must be of reasonable length" … ain't necessarily so. Particularly if we stop thinking like Ugly ASCII Americans.

And what of those nice, long passwords? Are they always secure?

aaaaaaaaaaaaaaaaaaa
0123456789012345689
passwordpassword
usernamepassword

Of course not, because have you met any users lately?

I changed all my passwords to

They consistently ruin every piece of software I've ever written. Yes, yes, I know you, Mr. or Ms. über-geek, know all about the concept of entropy. But expressing your love of entropy as terrible, idiosyncratic password rules …

  • must contain uppercase
  • must contain lowercase
  • must contain a number
  • must contain a special character

… is a spectacular failure of imagination in a world of Unicode and Emoji.

As we built Discourse, I discovered that the login dialog was a remarkably complex piece of software, despite its surface simplicity. The primary password rule we used was also the simplest one: length. Since I wrote that, we've already increased our minimum password default length from 8 to 10 characters. And if you happen to be an admin or moderator, we decided the minimum has to be even more, 15 characters.

I also advocated checking passwords against the 100,000 most common passwords. If you look at 10 million passwords from data breaches in 2016, you'll find the top 25 most used passwords are:

123456
123456789
qwerty
12345678
111111
1234567890
1234567
password
123123
987654321
qwertyuiop
mynoob
123321
666666
18atcskd2w
7777777
1q2w3e4r
654321
555555
3rjs1la7qe
google
1q2w3e4r5t
123qwe
zxcvbnm
1q2w3e

Even this data betrays some ASCII-centrism. The numbers are the same in any culture I suppose, but I find it hard to believe the average Chinese person will ever choose the passwords "password", "quertyuiop", or "mynoob". So this list has to be customizable, localizable.

(One interesting idea is to search for common shorter password matches inside longer passwords, but I think this would cause too many false positives.)

If you examine the data, this also turns into an argument in favor of password length. Note that only 5 of the top 25 passwords are 10 characters, so if we require 10 character passwords, we've already reduced our exposure to the most common passwords by 80%. I saw this originally when I gathered millions and millions of leaked passwords for Discourse research, then filtered the list down to just those passwords reflecting our new minimum requirement of 10 characters or more.

It suddenly became a tiny list. (If you've done similar common password research, please do share your results in the comments.)

I'd like to offer the following common sense advice to my fellow developers:

1. Password rules are bullshit

  • They don't work.
  • They heavily penalize your ideal audience, people that use real random password generators. Hey guess what, that password randomly didn't have a number or symbol in it. I just double checked my math textbook, and yep, it's possible. I'm pretty sure.
  • They frustrate average users, who then become uncooperative and use "creative" workarounds that make their passwords less secure.
  • They are often wrong, in the sense that the rules chosen are grossly incomplete and/or insane, per the many shaming links I've shared above.
  • Seriously, for the love of God, stop with this arbitrary password rule nonsense already. If you won't take my word for it, read this 2016 NIST password rules recommendation. It's right there, "no composition rules". However, I do see one error, it should have said "no bullshit composition rules".

2. Enforce a minimum Unicode password length

One rule is at least easy to remember, understand, and enforce. This is the proverbial one rule to bring them all, and in the darkness bind them.

  • It's simple. Users can count. Most of them, anyway.
  • It works. The data shows us it works; just download any common password list of your choice and group by password length.
  • The math doesn't lie. All other things being equal, a longer password will be more random – and thus more secure – than a short password.
  • Accept that even this one rule isn't inviolate. A minimum password length of 6 on a Chinese site might be perfectly reasonable. A 20 character password can be ridiculously insecure.
  • If you don't allow (almost) every single unicode character in the password input field, you are probably doing it wrong.
  • It's a bit of an implementation detail, but make sure maximum password length is reasonable as well.

3. Check for common passwords

As I've already noted, the definition of "common" depends on your audience, and language, but it is a terrible disservice to users when you let them choose passwords that exist in the list of 10k, 100k, or million most common known passwords from data breaches. There's no question that a hacker will submit these common passwords in a hack attempt – and it's shocking how far you can get, even with aggressive password attempt rate limiting, using just the 1,000 most common passwords.

  • 1.6% have a password from the top 10 passwords
  • 4.4% have a password from the top 100 passwords
  • 9.7% have a password from the top 500 passwords
  • 13.2% have a password from the top 1,000 passwords
  • 30% have a password from the top 10,000 passwords

Lucky you, there are millions and millions of real breached password lists out there to sift through. It is sort of fun to do data forensics, because these aren't hypothetical synthetic Jack the Ripper password rules some bored programmer dreamed up, these are real passwords used by real users.

Do the research. Collect the data. Protect your users from themselves.

4. Check for basic entropy

No need to get fancy here; pick the measure of entropy that satisfies you deep in the truthiness of your gut. But remember you have to be able to explain it to users when they fail the check, too.

entropy visualized

I had a bit of a sad when I realized that we were perfectly fine with users selecting a 10 character password that was literally "aaaaaaaaaa". In my opinion, the simplest way to do this is to ensure that there are at least (x) unique characters out of (y) total characters. And that's what we do as of the current beta version of Discourse. But I'd love your ideas in the comments, too. The simpler and clearer the better!

5. Check for special case passwords

I'm embarrassed to admit that when building the Discourse login, as I discussed in The God Login, we missed two common cases that you really have to block:

  • password equal to username
  • password equal to email address

I& If you are using Discourse versions earlier than 1.4, I'm so sorry and please upgrade immediately.

Similarly, you might also want to block other special cases like

  • password equal to URL or domain of website
  • password equal to app name

In short, try to think outside the password input box, like a user would.

E Clarification

A few people have interpreted this post as "all the other password rules are bullshit, except these four I will now list." That's not what I'm trying to say here.

The idea is to focus on the one understandable, simple, practical, works-in-real-life-in-every-situation rule: length. Users can enter (almost) anything, in proper Unicode, provided it's long enough. That's the one rule to bind them all that we need to teach users: length!

Items #3 through #5 are more like genie-special-exception checks, a you can't wish for infinite wishes kind of thing. It doesn't need to be discussed up front because it should be really rare. Yes, you must stop users from having comically bad passwords that equal their username, or aaaaaaaaaaa or 0123456789, but only as post-entry checks, not as rules that need to be explained in advance.

So TL;DR: one rule. Length. Enter whatever you want, just make sure it's long enough to be a reasonable password.

[advertisement] Building out your tech team? Stack Overflow Careers helps you hire from the largest community for programmers on the planet. We built our site with developers like you in mind.










mirPod.com is the best way to tune in to the Web.

Search, discover, enjoy, news, english podcast, radios, webtv, videos. You can find content from the World & USA & UK. Make your own content and share it with your friends.


HOME add podcastADD PODCAST FORUM By Jordi Mir & mirPod since April 2005....
ABOUT US SUPPORT MIRPOD TERMS OF USE BLOG OnlyFamousPeople MIRTWITTER