Matching a valid date

We will create a regex to validate a date pattern of yyyy-mm-dd, yyyy/mm/dd, or yyyy.mm.dd. At first, the regex will look daunting, but bear with me. When you have completed the code and run the application, we will dissect the regex. Hopefully, the expression logic will become clear.

Getting ready

Ensure that you have imported the correct namespace in your code file. At the top of the file, add the following using directive if you haven't already done so:

using System.Text.RegularExpressions;

How to do it…

  1. Create a new method called ValidDate() that takes a string as the parameter. This string will be the date pattern we want to validate:
    public void ValidDate(string stringToMatch)
    {
    
    }
  2. Store the following regex pattern in a variable inside the method:
    string pattern = $@"^(19|20)\d\d[-./](0[1-9]|1[0-2])[-./](0[1-9]|[12][0-9]|3[01])$";
  3. Finally, add the regex to match the supplied string parameter:
    if (Regex.IsMatch(stringToMatch, pattern))
        Console.WriteLine($"The string {stringToMatch} contains a valid date.");
    else
        Console.WriteLine($"The string {stringToMatch} DOES NOT contain a valid date.");
  4. When you have done this, your method should look like this:
    public void ValidDate(string stringToMatch)
    {
        string pattern = $@"^(19|20)\d\d[-./](0[1-9]|1[0-2])[-./](0[1-9]|[12][0-9]|3[01])$";
    
        if (Regex.IsMatch(stringToMatch, pattern))
            Console.WriteLine($"The string {stringToMatch} contains a valid date.");
        else
            Console.WriteLine($"The string {stringToMatch} DOES NOT contain a valid date.");            
    }
  5. Going back to your console application, add the following code and debug your application by clicking on Start:
    Chapter9.Recipes oRecipe = new Chapter9.Recipes();
    oRecipe.ValidDate("1912-12-31");
    oRecipe.ValidDate("2018-01-01");
    oRecipe.ValidDate("1800-01-21");
    oRecipe.ValidDate($"{DateTime.Now.Year}.{DateTime.Now.Month}.{DateTime.Now.Day}");
    oRecipe.ValidDate("2016-21-12");
    Read();

    Note

    You will notice that Read() is used in the preceding code example instead of Console.Read(). This is because the directive using static System.Console; has been added to the console application's using statements. Doing this allows you to omit the Console class name.

  6. The date strings are passed to the method, and the pattern is matched against each one. In the run described here, the console output reports the first two dates as valid and the remaining three (1800-01-21, the string built from DateTime.Now, and 2016-21-12) as not valid.
  7. If you look at the output carefully, you will notice that there is a mistake. We are validating date strings in the format yyyy-mm-dd, yyyy/mm/dd, or yyyy.mm.dd, yet our regex has flagged a valid date as invalid. This is the string built from DateTime.Now, which was 2016.4.10 when this recipe was written. That is 10 April 2016, and it is in fact quite valid.

    Note

    We will explain shortly why the date 1800-01-21 is invalid.

  8. Go back to your ValidDate() method and change the regular expression to read as follows:
    string pattern = $@"^(19|20)\d\d[-./](0[1-9]|1[0-2]|[1-9])[-./](0[1-9]|[12][0-9]|3[01])$";
  9. Run the console application again and look at the output. The date 2016.4.10 is now reported as valid, while 1800-01-21 and 2016-21-12 are still rejected.

This time the regex worked for all the given date strings. But what exactly did we do? This is how it works.

How it works…

Let's take a closer look at the two expressions used in the previous code example. Comparing them with each other, the only difference is in the month group: (0[1-9]|1[0-2]) in the first expression becomes (0[1-9]|1[0-2]|[1-9]) in the second.

Before we get to what that change means, let's break up the expression and view the individual components. Our regex is basically saying that we must match all string dates that start with 19 or 20 and have the following separators:

  • Dash (-)
  • Period (.)
  • Forward slash (/)

To understand the expression better, we need to understand the following format of the expression <Valid Years><Valid Separators><Valid Months><Valid Separators><Valid Days>.
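One way to make that layout visible is to build the pattern from named pieces. The following is only a sketch: the class name DatePatternParts and its constant names are made up for this illustration, and the month piece uses the modified group from step 8.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DatePatternParts
{
    // Each constant is one block of the layout described above.
    public const string Years = @"(19|20)\d\d";
    public const string Separator = @"[-./]";
    public const string Months = @"(0[1-9]|1[0-2]|[1-9])";
    public const string Days = @"(0[1-9]|[12][0-9]|3[01])";

    // <Valid Years><Valid Separators><Valid Months><Valid Separators><Valid Days>,
    // wrapped in the start and end anchors.
    public const string Pattern =
        "^" + Years + Separator + Months + Separator + Days + "$";

    public static bool IsMatch(string input) => Regex.IsMatch(input, Pattern);

    public static void Main()
    {
        Console.WriteLine(Pattern);
        Console.WriteLine(IsMatch("2018-01-01"));
        Console.WriteLine(IsMatch("1800-01-21"));
    }
}
```

The assembled string is identical to the single-literal pattern used in the recipe, so either form can be passed to Regex.IsMatch().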

We also need to be able to tell the regex engine to consider one pattern OR another. OR is expressed with the | metacharacter. To limit the OR to just the alternatives we intend, rather than letting it split the whole expression in two, we wrap those alternatives in parentheses ().
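To see why the parentheses matter, compare a grouped and an ungrouped alternation. This is a small sketch; the class name AlternationDemo and the sample strings are made up for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Grouped: the anchors apply to both alternatives,
    // so only the exact strings "19" and "20" match.
    public static bool Grouped(string input) =>
        Regex.IsMatch(input, @"^(19|20)$");

    // Ungrouped: | splits the whole expression, so this reads as
    // "starts with 19" OR "ends with 20".
    public static bool Ungrouped(string input) =>
        Regex.IsMatch(input, @"^19|20$");

    public static void Main()
    {
        foreach (string s in new[] { "19", "20", "1999", "3020" })
        {
            Console.WriteLine($"{s}: grouped={Grouped(s)}, ungrouped={Ungrouped(s)}");
        }
    }
}
```

The ungrouped version accepts 1999 and 3020, which is why the year alternatives in the date pattern are written as (19|20).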

Here are the symbols used in the regex:

The conditional OR

|

This denotes the OR metacharacter.

The year portion

(19|20)

Only allow 19 or 20.

\d\d

Match two digits, each between 0 and 9. To match only one digit between 0 and 9, you would use \d.

The valid separator character set

[-./]

Match any one of the characters in the character set. These are our valid separators. To also allow a space as a date separator, you would change this to [- ./]. The space can go anywhere in the set as long as the dash stays first, so that the dash is not read as a range. Here the space sits between the dash and the period.

Valid digits for months and days

0[1-9]

Match any part starting with zero followed by any digit between 1 and 9. This will match 01, 02, 03, 04, 05, 06, 07, 08, and 09.

1[0-2]

Match any part starting with 1 followed by any digit between 0 and 2. This will match 10, 11, or 12.

[1-9]

Match any digit between 1 and 9.

[12][0-9]

Match any part starting with 1 or 2, followed by any digit between 0 and 9. This will match all number strings between 10 and 29.

3[01]

Match any part starting with 3 and followed by 0 or 1. This will match 30 or 31.

Start and end of string

^

Tells the regex engine to start at the beginning of the given string to match.

$

Tells the regex engine to stop at the end of the given string to match.
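The effect of the two anchors is easiest to see by matching the same expression with and without them. The following is a sketch only; the class name AnchorDemo and the sample strings are invented for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // The first date expression from this recipe, without ^ and $.
    private const string Core =
        @"(19|20)\d\d[-./](0[1-9]|1[0-2])[-./](0[1-9]|[12][0-9]|3[01])";

    // With anchors: the whole string must be a date.
    public static bool Anchored(string input) =>
        Regex.IsMatch(input, "^" + Core + "$");

    // Without anchors: a date anywhere inside the string is enough.
    public static bool Unanchored(string input) =>
        Regex.IsMatch(input, Core);

    public static void Main()
    {
        foreach (string s in new[] { "2016-04-10", "Paid on 2016-04-10.", "12016-04-101" })
        {
            Console.WriteLine($"{s}: anchored={Anchored(s)}, unanchored={Unanchored(s)}");
        }
    }
}
```

Only the first string passes the anchored check, while all three pass the unanchored one. In .NET, $ also matches just before a final newline; \z is the stricter end-of-string anchor if that distinction matters.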

The first regex we created reads as follows:

  • ^: Start at the beginning of the string to match
  • (19|20): Check whether the string starts with 19 or 20
  • \d\d: After the check, two digits follow, each between 0 and 9
  • [-./]: The year portion ends followed by a date separator
  • (0[1-9]|1[0-2]): Find the month logic by looking for digits starting with 0, followed by any digit between 1 and 9, or digits starting with 1, followed by any digit between 0 and 2
  • [-./]: The month logic ends, followed by a date separator
  • (0[1-9]|[12][0-9]|3[01]): Then, find the day logic by looking for digits starting with 0, followed by a digit between 1 and 9, or digits starting with 1 or 2, followed by any digit between 0 and 9, or a digit matching 3, followed by any digit between 0 and 1
  • $: Do this until the end of the string

Our first regex was incorrect because our month logic was incorrect. The month group (0[1-9]|1[0-2]) looks for a 0 followed by any digit between 1 and 9, or a 1 followed by any digit between 0 and 2.

This will find 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, or 12. The date that it didn't match was 2016.4.10 (the date separators don't make a difference here). This is because the month came through as a single digit, and we were only accepting single-digit months written with a leading zero. To fix this, we modified the month logic to also accept a single digit between 1 and 9, by adding [1-9] as a third alternative at the end of the group. Note that the day group was not changed, so a single-digit day without a leading zero, as in 2016.4.5, is still rejected.
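The difference between the two expressions can be checked side by side. This is a minimal sketch; the class name MonthFixDemo and the extra sample dates are assumptions added for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MonthFixDemo
{
    // First version: months must have two digits.
    public const string Original =
        @"^(19|20)\d\d[-./](0[1-9]|1[0-2])[-./](0[1-9]|[12][0-9]|3[01])$";

    // Modified version: a single digit 1-9 is also accepted as a month.
    public const string Modified =
        @"^(19|20)\d\d[-./](0[1-9]|1[0-2]|[1-9])[-./](0[1-9]|[12][0-9]|3[01])$";

    public static void Main()
    {
        foreach (string date in new[] { "2016.4.10", "2016.04.10", "2016.4.5" })
        {
            bool before = Regex.IsMatch(date, Original);
            bool after = Regex.IsMatch(date, Modified);
            Console.WriteLine($"{date}: original={before}, modified={after}");
        }
    }
}
```

Only 2016.4.10 changes from rejected to accepted. The zero-padded 2016.04.10 matches both expressions, and 2016.4.5 matches neither because the day group still requires two digits.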

The modified regex then reads as follows:

  • ^: Start at the beginning of the string to match
  • (19|20): Check whether the string starts with 19 or 20
  • \d\d: After the check, two digits follow, each between 0 and 9
  • [-./]: The year portion ends, followed by a date separator
  • (0[1-9]|1[0-2]|[1-9]): Find the month logic by looking for digits starting with 0, followed by any digit between 1 and 9, or digits starting with 1, followed by any digit between 0 and 2, or any single digit between 1 and 9
  • [-./]: The month logic ends, followed by a date separator
  • (0[1-9]|[12][0-9]|3[01]): Then, find the day logic by looking for digits starting with 0, followed by a digit between 1 and 9, or digits starting with 1 or 2, followed by any digit between 0 and 9, or a digit matching 3, followed by any digit between 0 and 1
  • $: Do this until the end of the string

This is a basic regex, and we say basic because there is a lot more we can do to make the expression better. We can include logic to consider alternative date formats such as mm-dd-yyyy or dd-mm-yyyy. We can add logic to check February and validate that it contains only 28 days, unless it is a leap year, in which case we need to allow the twenty-ninth day of February. Furthermore, we can also extend the regex to check that January, March, May, July, August, October, and December have 31 days while April, June, September, and November contain only 30 days.
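Encoding leap years and month lengths in the expression itself is possible but quickly becomes hard to read. As an alternative technique, not the regex extension described above, the following sketch keeps the regex for the shape of the string and hands the calendar check to DateTime.TryParseExact(). The class name CalendarCheck and the method IsRealDate() are hypothetical names used only for this example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class CalendarCheck
{
    // The modified pattern from this recipe (single-digit months allowed).
    private const string Pattern =
        @"^(19|20)\d\d[-./](0[1-9]|1[0-2]|[1-9])[-./](0[1-9]|[12][0-9]|3[01])$";

    // Hypothetical helper: true only if the string has the expected shape
    // AND names a day that exists on the calendar.
    public static bool IsRealDate(string candidate)
    {
        if (!Regex.IsMatch(candidate, Pattern))
            return false;

        // Normalise the separators so that one format string covers all three.
        // Trim() drops a trailing newline, which $ tolerates in .NET.
        string normalised = candidate.Trim().Replace('.', '-').Replace('/', '-');

        DateTime parsed;
        return DateTime.TryParseExact(
            normalised, "yyyy-M-d",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
    }

    public static void Main()
    {
        foreach (string s in new[] { "2016-02-29", "2015-02-29", "2016/04/31", "2016.4.10" })
        {
            Console.WriteLine($"{s}: {IsRealDate(s)}");
        }
    }
}
```

With this approach, 29 February is accepted only in a leap year and 31 April is rejected, without any change to the regex itself.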
