A while back, when I was writing Hummingbird, I needed to look for Twitter usernames in various strings. More recently, I’m doing some work that involves Twitter at my new job. Once again, I need to find and match on Twitter usernames.
Luckily, this time, Twitter seems to have updated its signup page with some nice AJAX that constrains the user’s options, and provides helpful feedback. So, for anyone else who needs this information in the future, here’s the scoop:
- Letters, numbers, and underscores only. It’s case-blind, so you can enter hi_there, Hi_There, or HI_THERE and they’ll all work the same (and be treated as a single account).
- There is apparently no minimum-length requirement; the user a exists on Twitter. Maximum length is 15 characters.
- There is also no requirement that the name contain letters at all; the user 69 exists, as does a user whose name I can’t pronounce.
If you want a regex to match on this, /[a-zA-Z0-9_]{1,15}/ would be nice and safe for use in both POSIX and Perl-style regex syntax. (If you’ve got Perl-compatible regexes, /\w{1,15}/ is quick and easy. Update: And it’s wrong; see comment 6 below, by Mark Fowler.)
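To pull usernames out of longer strings, the character class needs guarding on both sides. Otherwise the first 15 characters of a longer word would match. Here is a minimal sketch in Python. It assumes usernames show up in text with a leading @, which is a convention not covered by the rules above, and the name find_usernames is made up for illustration.

```python
import re

# Assumption: usernames are written as "@name" in the text being searched.
# The lookbehind skips things like email addresses (me@example.com); the
# lookahead rejects runs longer than 15 characters instead of truncating them.
MENTION = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])")


def find_usernames(text):
    """Return mentioned usernames, lowercased, in order of first appearance."""
    seen = []
    for name in MENTION.findall(text):
        key = name.lower()  # names are case-blind, so compare in one case
        if key not in seen:
            seen.append(key)
    return seen


if __name__ == "__main__":
    print(find_usernames("Thanks @Hi_There and @hi_there, also @a"))
```

Lowercasing before comparing follows the case-blind rule, so Hi_There and hi_there count as one account.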
7 Comments
Hi Kai,
Thanks for an excellent example of why I’ve always thought of search as a programmer’s most powerful tool 🙂
And after wrestling with trying to create the perfect valid email regex, it’s refreshing to have something as simple as \w{1,15}
— Ken
Thanks. I had this in a script somewhere but often searching Google is faster 😉
Thanks for this. I’ve changed it slightly so it works better in PHP (and presumably in other languages).
Full regex – /^[a-zA-Z0-9_]{1,15}$/
Perl-compatible regex – /^\w{1,15}$/
The only difference is that I’ve started it with ^ (start-of-string anchor) and ended it with $ (end-of-string anchor). This ensures that you don’t get any illegal characters on either side of the valid characters.
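One caveat with the anchored version: in many regex flavours, Python among them, $ also matches just before a trailing newline, so a string like "hi_there\n" would still pass. Below is a small sketch in Python that uses re.fullmatch instead. The function name is_valid_username is hypothetical.

```python
import re

USERNAME = re.compile(r"[A-Za-z0-9_]{1,15}")


def is_valid_username(candidate):
    """True only if the entire string is 1 to 15 letters, digits, or underscores."""
    # fullmatch must consume the whole string, so no ^ or $ is needed
    return USERNAME.fullmatch(candidate) is not None


if __name__ == "__main__":
    for candidate in ["hi_there", "69", "", "a" * 16, "hi there", "hi_there\n"]:
        print(repr(candidate), is_valid_username(candidate))
```

Only the first two candidates pass. The empty string, the 16-character name, the name with a space, and the name with a trailing newline are all rejected.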
Great! Just what I needed!
This is a really helpful article. Just what I was looking for. You saved me a lot of research 😉
The Perl-compatible regexp you listed won’t actually do the right thing in Perl. “\w+” will match e-acute and other accented characters (unless you use the /a modifier, available in 5.14 and later). You’re probably better off just writing [A-Za-z0-9_] if that’s what you really mean.
Woah… I did not know that. Thank you!
Yes, it sounds like [A-Za-z0-9_] would be better. I’ll update the post to mention that.
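The same \w pitfall can be seen outside Perl. As a rough illustration in Python 3 (a different engine from the Perl one discussed above), \w on a str pattern matches Unicode word characters by default, and the re.ASCII flag plays a role similar to Perl’s /a modifier:

```python
import re

EXPLICIT = re.compile(r"[A-Za-z0-9_]{1,15}")
UNICODE_W = re.compile(r"\w{1,15}")          # Unicode-aware by default for str patterns
ASCII_W = re.compile(r"\w{1,15}", re.ASCII)  # restricts \w to [a-zA-Z0-9_]

name = "caf\u00e9"  # "café" with a precomposed e-acute

print(bool(UNICODE_W.fullmatch(name)))  # True: the accented letter counts as \w
print(bool(ASCII_W.fullmatch(name)))    # False
print(bool(EXPLICIT.fullmatch(name)))   # False
```

Writing the character class out explicitly avoids having to remember which mode the engine is in.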