Extracting Twitter Usertags using Regex

Take the following tweet as an example:

@shahmirj can be found on hello@example.com and @r2d2 but @000 dosent exist as a user

We need to be sure to match only @shahmirj and @r2d2, and leave out anything that starts with a number or is part of an email address. To do this we use the following regex:

(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9]+)

The best way to understand the regex above is to start to the right of the @. Let's first look at the meaning of the following: @([A-Za-z]+[A-Za-z0-9]+)

We have to make sure that anything we match starts with a letter, hence the [A-Za-z]+, which can then be followed by letters or digits, hence the following expression [A-Za-z0-9]+. This makes sure we match usernames such as @r2d2. We can't leave things there, because if you run this regex as it is, you will also catch @example from the email address, which is not what we want. This is where the part before the @ sign comes in.
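To see the problem for yourself, here is a small sketch that runs only the right-hand part against the example tweet. It uses PHP's preg_match_all, and the variable names are only for illustration.

```php
<?php
// The example tweet from above.
$string = "@shahmirj can be found on hello@example.com and @r2d2 but @000 dosent exist as a user";

// Only the part to the right of the @, with no check on what comes before it.
$regex = '/@([A-Za-z]+[A-Za-z0-9]+)/';

// $matches[0] holds the full matches, $matches[1] the captured names.
preg_match_all($regex, $string, $matches);

print_r($matches[0]);
// @shahmirj, @example and @r2d2 are all caught; @000 is not.
```

Note that because both character classes use +, a name needs at least two characters to match.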

Let's break (?<=^|(?<=[^a-zA-Z0-9-_\.])) down. Start with the inner right-hand part, (?<=[^a-zA-Z0-9-_\.]). This is a lookbehind which makes sure that the character before the @ sign is not a letter, digit, hyphen, underscore or dot, so emails or tags such as aaa@bbb are ignored. However, if we only use this part then our first match, @shahmirj, disappears, as it has no character before it at all. Therefore we add the alternative ^ in front and combine it all together as (?<=^|(?<=[^a-zA-Z0-9-_\.])), which in plain English translates to: match only if the @ is either at the start of the string or comes after a character that cannot be part of a name or email address, such as a space.
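The difference between the two lookbehinds can be checked with a short sketch like the one below. It runs the inner lookbehind on its own and then the combined version against the example tweet; the variable names are only for illustration.

```php
<?php
// The example tweet from above.
$string = "@shahmirj can be found on hello@example.com and @r2d2 but @000 dosent exist as a user";

// Inner lookbehind only: there must be a character before the @, and it must
// not be a letter, digit, hyphen, underscore or dot.
$inner = '/(?<=[^a-zA-Z0-9-_\.])@([A-Za-z]+[A-Za-z0-9]+)/';

// Full lookbehind: start of the string OR such a character before the @.
$full = '/(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9]+)/';

preg_match_all($inner, $string, $innerMatches);
preg_match_all($full, $string, $fullMatches);

print_r($innerMatches[0]); // only @r2d2: @shahmirj at the start is lost
print_r($fullMatches[0]);  // @shahmirj and @r2d2
```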

We can now combine this into our PHP and change the Twitter text to highlight the usernames by converting them into <a> tags. An example of this can be seen at http://www.shahmirj.com/twitter

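$string = "@shahmirj can be found on hello@example.com and @r2d2 but @000 dosent exist as a user";
$regex = "/(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9]+)/i";

// preg_replace returns the new string, so print (or store) the result.
// $1 is the captured username without the @.
echo preg_replace($regex, "<a href='http://twitter.com/$1'>@$1</a>", $string);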

Just a point of note: this can also be used for hashtags, just change the @ symbol to #.
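As a rough sketch of the hashtag variant, the snippet below swaps @ for # and pulls the tags out with preg_match_all. The tweet text here is made up for the example and is not from a real account.

```php
<?php
// Hypothetical tweet, made up for this example.
$string = "Learning #regex in #php5 today, see http://example.com/page#top and #123";

// Same pattern as above with @ swapped for #.
$regex = '/(?<=^|(?<=[^a-zA-Z0-9-_\.]))#([A-Za-z]+[A-Za-z0-9]+)/i';

// $matches[1] holds the tag names without the leading #.
preg_match_all($regex, $string, $matches);

print_r($matches[1]);
// regex and php5 are captured; #top (inside the URL) and #123 are skipped.
```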


If anyone has a better suggestion or spots something I missed, please leave a comment (now working!)


Fixed the issue where it wasn't picking up tags such as @shahmirj_. I needed to add _ at the end of the matching group.
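The updated pattern itself is not shown above, so the following is only a guess at what the fix looks like: it assumes the underscore was added to the second character class, giving [A-Za-z0-9_]+. The tweet text is made up for the example.

```php
<?php
// Hypothetical tweet, made up for this example.
$string = "@shahmirj_ and @r2_d2 say hi to a_b@example.com";

// Assumed update: underscore added to the second character class.
$regex = '/(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9_]+)/i';

// $matches[1] holds the usernames without the leading @.
preg_match_all($regex, $string, $matches);

print_r($matches[1]);
// shahmirj_ and r2_d2 are captured in full; the email address is still skipped.
```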


18th Jun 2011
© 2011 Shahmir Javaid - http://shahmirj.com/blog/17


2nd Feb 2012

Hey man, this is awesome. Wonderful explanation!

BTW your website rocks as well.

Shahmir Javaid

5th Feb 2012

Thanks @Slavisha, The site is simple and no Photoshop used :D


15th Nov 2012

Hi Shahmir,

Also, can you add how to use regex for getting the Twitter username from any URL?



Shahmir Javaid

15th Nov 2012

It should be easy to create your own by using the above and prepending some of the examples from the following: http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url

Mike Stoddart

12th Feb 2014

Any chance you could port this to Python? I can't seem to get the right syntax..

Arash Vahabpour

11th Jan 2016

There are some replies like .@arash; you might consider them as well.

Shahmir Javaid

11th Jan 2016

@Mike It is pretty standard; there should be no difference in Python. Let me know otherwise.

Sreekanth Palaparty

5th Oct 2016

@Shahmir lookbehind is not supported in JavaScript, any alternative?

Shahmir Javaid

5th Oct 2016

@Sreekanth, not sure. Please post if you find a solution.
