Re: Extracting hostname from URI column
От | Paul Lambert |
---|---|
Тема | Re: Extracting hostname from URI column |
Дата | |
Msg-id | 46EDB43D.9060405@autoledgers.com.au обсуждение исходный текст |
Ответ на | Re: Extracting hostname from URI column (Paul Lambert <paul.lambert@autoledgers.com.au>) |
Ответы |
Re: Extracting hostname from URI column
Re: Extracting hostname from URI column |
Список | pgsql-sql |
Paul Lambert wrote: > chester c young wrote: >>> I'm trying to use substr() and position() functions to extract the >>> full host name (and later a domain) from a column that holds URLs. >> >> substring( href from '.*://\([^/]*)' ); >> > > Ok, your solution looks better than mine... but I have no idea how to > interpret that, time to consult some manuals. > OK - following on from this, I'm doing a similar thing to the OP to analyze my proxy's traffic (never occured to me to do it in a db until that post) I've done the above regex to pull out the domain part of the URL and am left with results such as: "acvs.mediaonenetwork.net" "profile.ak.facebook.com" "www.bankwest.com.au" What I want to do next is pull out the major domain part of the URL I.e. for the above three records I should end up with "mediaonenetwork.net" "facebook.com" "bankwest.com.au" What would be the best way to do something like that? I assume it won't be a simple regex like the above due to the country codes on the end of some domains. My thought is look at the last portion of the domain, if it's 2 characters long then assume it's a country code and grab the last three sections, if it's not three characters long then assume it's an international domain and grab the last two... but that sounds a bit dodgy. -- Paul Lambert Database Administrator AutoLedgers
В списке pgsql-sql по дате отправления: