21 May 2006
RegEx for URI (non IP) (website) matching

I’ve been trying to find a relatively good regular expression, but there wasn’t one that did the trick for me.

Two hours of work led me here (join lines before use):

Let’s take it step by step:

$1 - tells whether the address is secure (HTTPS) or not
$2 - gives “www.” or null

$3 - gives the domain name, sometimes without “www.”

((?:/(?:(?:(?:(?:\w|-|_)+\.)*(?:\w|-|_)+/?)*))*)
$4 - gives the path & filename

$5 - gives parameters, preceded by “?”
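The path part captured as $4 is the one piece of the expression that can be tried on its own. Here is a minimal Python sketch, assuming the backslashes in front of `w` and `.` were lost when the post was published, so `\w` and `\.` are restored below. The helper name `path_of` is made up for this illustration.

```python
import re

# Path sub-pattern quoted in the post. ASSUMPTION: "w" was originally "\w"
# and "." was originally "\." (the backslashes look lost in publishing).
PATH = re.compile(r"((?:/(?:(?:(?:(?:\w|-|_)+\.)*(?:\w|-|_)+/?)*))*)")


def path_of(text):
    """Return the path matched at the very start of text (possibly empty)."""
    # The whole pattern is optional, so match() always succeeds here.
    return PATH.match(text).group(1)


if __name__ == "__main__":
    for sample in ("/blog/2006/index.html?x=1", "/a-b/c_d/", "/index.html. Next"):
        print(repr(sample), "->", repr(path_of(sample)))
```

Each path segment must end in a word character or a slash, so the match stops before the "?" that starts the parameters and before a dot that merely ends a sentence.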


  • Yes, I know: it only works for HTTP. That is easy to change, but I needed it exactly like this in order to replace each match with an HTML link. Adapt it to your needs.
  • The domain name tries to be restrictive: the last part of the domain can be between 2 and 6 characters long (“museum” is the longest one)
  • The address cannot end in a dot “.” unless it is the address for a script that takes parameters
  • Parameters are only restricted to non-whitespace characters
  • The address must start with “http://” or with “www.”. Any other address does not match.
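The rules above can be sketched in Python to show how the five groups and the link replacement fit together. This is not the original expression. It is a hypothetical reconstruction written from the description only, and `URL` and `linkify` are names invented for the example. It assumes the last domain label may have up to 6 letters so that "museum" fits, and that the text around each address is already safe to emit as HTML.

```python
import html
import re

# HYPOTHETICAL reconstruction from the rules listed in the post,
# not the author's original regular expression.
URL = re.compile(r"""
    \b(?=https?://|www\.)                 # must start with http(s):// or www.
    (?:http(s)?://)?                      # $1: "s" for HTTPS, else None
    (www\.)?                              # $2: "www." or None
    ((?:[\w-]+\.)+[a-z]{2,6})(?![\w-])    # $3: domain, last label 2-6 letters
    ((?:/(?:[\w-]+(?:\.[\w-]+)*)?)*)      # $4: path and filename, no trailing dot
    (\?\S*)?                              # $5: "?" plus non-space characters
""", re.VERBOSE | re.IGNORECASE)


def linkify(text):
    """Wrap every matched address in an HTML anchor."""
    def repl(match):
        url = match.group(0)
        # Addresses that start with "www." need a scheme to be clickable.
        if url.lower().startswith(("http://", "https://")):
            href = url
        else:
            href = "http://" + url
        return '<a href="{}">{}</a>'.format(html.escape(href), html.escape(url))
    return URL.sub(repl, text)


if __name__ == "__main__":
    m = URL.search("see https://www.example.com/a/b.php?x=1&y=2 now")
    print(m.groups())
    print(linkify("Visit www.example.com."))
```

In this sketch a sentence-ending dot stays outside the link, while a dot inside the parameters is kept because $5 accepts any non-space character.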

