Wednesday, July 8, 2009

Parse the Domain from a URL String with a C# Extension Method

I often need to parse the domain, including the http[s]:// prefix, from a URL, yet the Uri class has no single property that returns it.  This comes up a lot when writing custom web parts for SharePoint.  I like to avoid the Uri class anyway because it tends to be a headache (it doesn’t serialize, you need to check for a null or empty string first, there’s object overhead, and what’s-the-point-anyway).  So I wrote two C# extension methods that use Regex to parse the domain from a URL string or a Uri object.

Usage

"http://bing.com/hello".AsDomain();      // => "http://bing.com"
"https://bing.com/hello".AsDomain();     // => "https://bing.com"
"http://bing.com:1234/hello".AsDomain(); // => "http://bing.com:1234"
"/hello".AsDomain();                     // => "/hello" 

I like to write methods that are forgiving, in that they don’t complain by throwing exceptions when inputs aren’t quite right.  You can call the method on an empty string, a null string, a relative URL, or a string that isn’t a URL at all.  In those cases, the method just returns the input unchanged.  This makes it easy to use, and you don’t need to check string.IsNullOrEmpty() every time.
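The forgiving behavior can be sketched as a small console program. This is only an illustration: the Program class and the sample inputs are made up for the demo, and the compact AsDomain copy uses a slightly stricter pattern than strictly necessary (an assumption of this sketch) so that the block compiles on its own.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    // Compact stand-in for the string overload; the pattern is an assumption of this sketch.
    public static string AsDomain(this string url)
    {
        if (string.IsNullOrEmpty(url))
            return url;

        var match = Regex.Match(url, @"^https?://[^/?#]+", RegexOptions.IgnoreCase);
        return match.Success ? match.Value : url;
    }
}

public static class Program
{
    public static void Main()
    {
        string missing = null;

        // An extension method compiles to a static call, so calling it on null does not throw.
        Console.WriteLine(missing.AsDomain() == null);   // True
        Console.WriteLine("".AsDomain() == "");          // True
        Console.WriteLine("not a url".AsDomain());       // not a url
        Console.WriteLine("/hello".AsDomain());          // /hello
    }
}
```

Because the null check lives inside the method, callers can chain AsDomain() onto any string without guarding it first.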

Parse the domain from a URL string, C# extension method

using System.Text.RegularExpressions;
namespace System
{
    public static class StringExtensions
    {
        /// <summary>
        /// Parses the domain from a URL string or returns the string if no URL was found
        /// </summary>
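        /// <param name="url">The URL string to parse; may be null, empty, or relative</param>
        /// <returns>The scheme and authority (e.g. "http://bing.com:1234"), or the input unchanged if no URL was found</returns>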
        public static string AsDomain(this string url)
        {
            if (string.IsNullOrEmpty(url))
                return url;

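            // Scheme, "://", then everything up to the first '/', '?' or '#'.
            // Requiring "://" keeps relative paths such as "http/foo" from matching.
            var match = Regex.Match(url, @"^https?://[^/?#]+", RegexOptions.IgnoreCase);
            if (match.Success)
                return match.Value;
            else
                return url;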
        }

        /// <summary>
        /// Parses the domain from a URL
        /// </summary>
        /// <param name="url">The Uri to parse; may be null</param>
        /// <returns>The scheme and authority as a string, or null if the Uri is null</returns>
        public static string AsDomain(this Uri url)
        {
            if (url == null)
                return null;

            return url.ToString().AsDomain();
        }
    }
}
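For comparison, a minimal sketch of what the built-in System.Uri members return for the same kind of input. It uses only Host, Authority, GetLeftPart and TryCreate; the Program class and the sample URLs are illustrative assumptions, not part of the extension methods above.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        var uri = new Uri("http://bing.com:1234/hello");

        Console.WriteLine(uri.Host);                               // bing.com
        Console.WriteLine(uri.Authority);                          // bing.com:1234
        Console.WriteLine(uri.GetLeftPart(UriPartial.Authority));  // http://bing.com:1234

        // Input without a scheme cannot become an absolute Uri:
        // TryCreate reports False, and the Uri constructor would throw UriFormatException.
        Uri ignored;
        Console.WriteLine(Uri.TryCreate("not a url", UriKind.Absolute, out ignored)); // False
    }
}
```

Host drops both the scheme and the port, and Authority drops the scheme, so neither matches what AsDomain() returns. GetLeftPart(UriPartial.Authority) does, but it needs a valid absolute Uri first, whereas AsDomain() hands back anything it cannot parse.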

Complete with Unit Tests

Just so you know it’s been tested at least a small bit.  I combine most tests into a single method because I’m lazy.  Deal with it.

using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace StringExtensionsTests
{
    /// <summary>
    ///This is a test class for StringExtensionsTest and is intended
    ///to contain all StringExtensionsTest Unit Tests
    ///</summary>
    [TestClass()]
    public class StringExtensionsTest
    {
        private TestContext testContextInstance;

        /// <summary>
        ///Gets or sets the test context which provides
        ///information about and functionality for the current test run.
        ///</summary>
        public TestContext TestContext
        {
            get
            {
                return testContextInstance;
            }
            set
            {
                testContextInstance = value;
            }
        }

        /// <summary>
        ///A test for AsDomain
        ///</summary>
        [TestMethod()]
        public void AsDomainTest1()
        {
            Assert.IsNull(((string)null).AsDomain());
            Assert.AreEqual(string.Empty, string.Empty.AsDomain());
            Assert.AreEqual("http://www.bing.com", "http://www.bing.com/hello".AsDomain());
            Assert.AreEqual("http://localhost:1234", "http://localhost:1234/hello".AsDomain());
            Assert.AreEqual("http://www.bing.com", "http://www.bing.com".AsDomain());
            Assert.AreEqual("https://www.bing.com", "https://www.bing.com/hello".AsDomain());

            Assert.AreEqual("/relative", "/relative".AsDomain());
        }

        /// <summary>
        ///A test for AsDomain
        ///</summary>
        [TestMethod()]
        public void AsDomainTest()
        {
            Assert.AreEqual("http://bing.com", new Uri("http://bing.com/hello").AsDomain());
        }
    }
}

I used this amazing online app called RegExr to build the regular expression used here.  It’s a great replacement for Expresso (my trial ran out!).  Check it out, it’s awesome.

3 comments:

  1. How does this perform compared to using the native System.Uri object?

    Personally, I find this more maintainable than RX strings:

    // Authority returns everything up-to-and-including the port (if specified)
    public static string AsDomain(this string url) {
        if (string.IsNullOrEmpty(url) || url.StartsWith("/")) return url;

        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri)) return url;

        return uri.GetLeftPart(UriPartial.Authority);
    }

    (Additionally, this doesn't limit you to just the HTTP protocol: ftp://, file://, mailto:, etc. are supported by Uri)

  2. That's exactly what I was going to ask. As a general rule, I prefer to use built-in functionality of the .NET Framework over rolling my own -- unless there is a huge performance increase by writing it myself.

  3. What about new Uri( url ).Host? Could you comment on differences?
