posted 05/27/2008 by James

Regular expressions can help clean up input text for more aesthetic purposes. Let's assume we want to convert a blog title into URL-friendly characters.

 

//the title that we are going to turn into a url

$title = 'php help: regular expression';

 

First, filter out all unwanted text for compatibility and security reasons, as shown here. Unlike those examples, though, we can't simply delete every invalid character, or the alphanumeric characters from the original title would end up smashed together. Since the words are going to be separated by hyphens, we change the replacement text from the empty string to a hyphen.

 

$almost_url = preg_replace('/[^a-z0-9.]/', '-', $title);

//yields 'php-help--regular-expression' (the colon and the space after it each become a hyphen)

 

Finally, let's clean up the text so it contains only one hyphen between each word. This time we want to match text that fits a certain pattern (one or more consecutive hyphens), so we remove the negation character (^) from inside the brackets.

 

$url = preg_replace('/-[-]*/', '-', $almost_url);

//yields 'php-help-regular-expression'
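
Putting the two steps together, a minimal sketch might look like the following. The function name title_to_url is hypothetical and not part of the original example. The strtolower() and trim() calls are also added assumptions: the first handles mixed-case titles (the pattern only allows a-z), and the second drops hyphens left at either end by leading or trailing punctuation.

```php
<?php
// Hypothetical helper (not from the original post) combining both steps.
function title_to_url($title)
{
    // Assumption: lowercase first, since the pattern below only allows a-z.
    $title = strtolower($title);

    // Step 1: replace every character that is not a-z, 0-9 or a period with a hyphen.
    $almost_url = preg_replace('/[^a-z0-9.]/', '-', $title);

    // Step 2: collapse each run of hyphens into a single hyphen.
    $url = preg_replace('/-[-]*/', '-', $almost_url);

    // Assumption: strip hyphens left at either end of the string.
    return trim($url, '-');
}

echo title_to_url('php help: regular expression'), "\n"; // php-help-regular-expression
```

The second pattern could equally be written as '/-+/', which matches the same runs of one or more hyphens.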

 

The graphic below shows how this works. To get out of the start state and into the end state (where the pattern is accepted) we need at least one hyphen. Once we have one hyphen, the matcher keeps looping back into the end state until it reads a different character. Then it quits, replaces all those hyphens with a single one, and begins the process all over again until it reaches the end of the string.
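
To make the two states concrete, here is a rough hand-written version of the same loop. It is only an illustration of the idea, not how preg_replace is implemented internally; the function name collapse_hyphens and the $in_run flag are hypothetical.

```php
<?php
// Hypothetical illustration of the two-state matcher for /-[-]*/ (not from the original post).
function collapse_hyphens($input)
{
    $output = '';
    $in_run = false; // false = start state, true = end (accepting) state

    for ($i = 0, $n = strlen($input); $i < $n; $i++) {
        $char = $input[$i];
        if ($char === '-') {
            if (!$in_run) {
                // First hyphen: move from the start state to the end state
                // and emit the single replacement hyphen.
                $output .= '-';
                $in_run = true;
            }
            // Further hyphens loop back into the end state and emit nothing.
        } else {
            // Any other character ends the match; copy it and start over.
            $output .= $char;
            $in_run = false;
        }
    }

    return $output;
}

echo collapse_hyphens('php-help--regular-expression'), "\n"; // php-help-regular-expression
```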
