Python’s urlparse module breaks down URLs in to components. It supports all the URL schemes specified in RFC 3986 1.

>>> import urlparse
>>> urlparse.urlparse("")
ParseResult(scheme='https', netloc='', path='/foo', params='', query='param=bar', fragment='')
>>> urlparse.urlparse("file://")
ParseResult(scheme='file', netloc='', path='/etc/fstab', params='', query='', fragment='')
>>> urlparse.urlparse("news:comp.infosystems.www.misc")
ParseResult(scheme='news', netloc='', path='comp.infosystems.www.misc', params='', query='', fragment='')

If you have questions or comments about this blog post, you can get in touch with me on Twitter @sdqali.

If you liked this post, you'll also like...