
Slugging Titles using Regex and PHP

Slugs work well for blog posts: they let us replace a boring URL such as http://shahmirj.com/blog/27 with http://shahmirj.com/blog/generating-title-slugs. Slugs also help with SEO, since they give search crawlers descriptive words with which to categorise your URL. So let's get straight to it and look at our function:

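function slug($value)
{
    // Replace common symbols with words, padded so they stay separate words.
    $value = str_replace("@", ' at ', $value);
    $value = str_replace("&", ' and ', $value);
    $value = str_replace("£", ' pound ', $value);
    $value = str_replace("#", ' hash ', $value);

    // Turn every dash or plus sign into a space.
    $value = preg_replace("/[-+]/", ' ', $value);
    // Collapse runs of whitespace into a single space.
    $value = preg_replace("/\s+/", ' ', $value);
    // Collapse runs of periods into a single period.
    $value = preg_replace("/\.+/", '.', $value);
    // Drop everything except letters, digits, periods and whitespace.
    $value = preg_replace("/[^A-Za-z0-9\.\s]/", '', $value);
    // Turn each remaining space into a dash, then collapse repeated dashes.
    $value = preg_replace("/\s/", '-', $value);
    $value = preg_replace("/--+/", '-', $value);

    $value = strtolower($value);

    // Strip a single trailing and a single leading dash.
    if (substr($value, -1) === "-") { $value = substr($value, 0, -1); }
    if (substr($value, 0, 1) === "-") { $value = substr($value, 1); }

    return $value;
}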

The first four lines replace symbols with words, so a title such as @-my-friends-house turns into at-my-friends-house. The same step also deals with &, £ and #.
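As a rough sketch, the same symbol replacements can also be written as a single str_replace call by passing arrays of search and replacement strings. The helper name replace_symbols is hypothetical and used here only for illustration; it is not part of the function above.

```php
<?php
// Hypothetical helper: maps each symbol to a spaced word in a single call.
function replace_symbols($value)
{
    $symbols = array('@', '&', '£', '#');
    $words   = array(' at ', ' and ', ' pound ', ' hash ');

    // str_replace pairs $symbols[i] with $words[i] and applies them in order.
    return str_replace($symbols, $words, $value);
}

echo replace_symbols('@-my-friends-house'), "\n"; // " at -my-friends-house"
```

The leading space and the dash that remain are cleaned up by the later steps of the function.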

Next we replace every dash with a space, so that runs of spaces can then be collapsed into a single space. In other words, a string such as hello--mike is first converted to hello  mike (double space), which is in turn converted to hello mike (single space). The single spaces are later converted back to -, so the final result is hello-mike.
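To make the intermediate values visible, here is a minimal sketch that walks hello--mike through these three steps in isolation. The patterns are simplified for the illustration and the variable names are assumptions, not taken from the function.

```php
<?php
// Walk 'hello--mike' through the dash and space steps, one at a time.
$value = 'hello--mike';

$step1 = preg_replace('/-/', ' ', $value);    // each dash becomes a space
$step2 = preg_replace('/\s+/', ' ', $step1);  // runs of whitespace collapse to one
$step3 = preg_replace('/\s/', '-', $step2);   // each space becomes a dash

var_dump($step1); // string(11) "hello  mike"
var_dump($step2); // string(10) "hello mike"
var_dump($step3); // string(10) "hello-mike"
```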

The next step collapses multiple periods into a single period. The string then runs through the main regex, which removes anything other than letters, digits, periods and whitespace ([A-Za-z0-9\.\s]). Note that I originally removed numbers from my slugs as well. However, a title such as R2D2 & C3P0 the best robots around! is converted to r2d2-and-c3p0-the-best-robots-around, which makes far more sense than rd-and-cp-the-best-robots-around.
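The difference between keeping and dropping digits can be sketched as follows. The helper filter_title and its $allowed parameter are hypothetical, introduced only to compare the two character sets side by side; the title spells out "and" so the symbol step is not needed here.

```php
<?php
// Hypothetical helper for illustration: strip everything outside the allowed
// set, then dash the spaces and lowercase. $allowed is a character-class body.
function filter_title($title, $allowed)
{
    $title = preg_replace('/[^' . $allowed . ']/', '', $title);
    $title = preg_replace('/\s+/', '-', trim($title));
    return strtolower($title);
}

$title = 'R2D2 and C3P0 the best robots around!';

echo filter_title($title, 'A-Za-z0-9\.\s'), "\n"; // r2d2-and-c3p0-the-best-robots-around
echo filter_title($title, 'A-Za-z\.\s'), "\n";    // rd-and-cp-the-best-robots-around

// Collapsing a run of periods into one, as described above.
echo preg_replace('/\.+/', '.', 'wait...what'), "\n"; // wait.what
```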

The remaining if checks before the return remove a trailing or leading - from the slug, so a slug such as -hello-mike- is converted to hello-mike.
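As an alternative sketch, PHP's built-in trim accepts a list of characters to strip, which removes any number of leading and trailing dashes in one call. The wrapper name trim_dashes is hypothetical.

```php
<?php
// Hypothetical wrapper: trim() with a character list strips every leading
// and trailing dash, not just one.
function trim_dashes($value)
{
    return trim($value, '-');
}

echo trim_dashes('-hello-mike-'), "\n";   // hello-mike
echo trim_dashes('--hello-mike--'), "\n"; // hello-mike
```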

Note one drawback of generated slugs: they need to be unique, otherwise your website will not work, since two posts would end up sharing the same URL. Always remember to check whether the slug is unique before inserting it into your database.
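One possible way to handle a clash is to append a numeric suffix until a free slug is found. This is only a sketch under assumptions: the function unique_slug and the $exists callback are hypothetical, and the in-memory array stands in for whatever database lookup your site uses.

```php
<?php
// Hypothetical helper: $exists is any callable that returns true when the
// given slug is already taken (for example, a database lookup).
function unique_slug($slug, callable $exists)
{
    $candidate = $slug;
    $suffix = 2;

    // Append -2, -3, ... until a free slug is found.
    while ($exists($candidate)) {
        $candidate = $slug . '-' . $suffix;
        $suffix++;
    }

    return $candidate;
}

// Usage with an in-memory list standing in for the database.
$taken = array('hello-mike', 'hello-mike-2');
$exists = function ($slug) use ($taken) {
    return in_array($slug, $taken, true);
};

echo unique_slug('hello-mike', $exists), "\n"; // hello-mike-3
echo unique_slug('new-post', $exists), "\n";   // new-post
```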

By Shahmir Javaid

14th May 2012
© 2011 Shahmir Javaid - http://shahmirj.com/blog/27

Marcel

1st Apr 2013

Shahmir,

From a speed standpoint, do you not think it would be wiser to use str_replace for some of your replace calls? Of course, where you need to use regex str_replace will have to be replaced by preg_replace, however you could easily replace your @/#/£/&/etc. symbols using even a single str_replace call (as we can use arrays to loop over our string as the params).

What do you think?

Shahmir Javaid

3rd Apr 2013

Marcel,

You are absolutely correct, it would make things faster. I have edited the post.

Thanks


