Dynamic regex matching

What does dynamic regex matching even mean? It isn't an official term, but we use it to describe a regex that is built at runtime from variables to produce a specific expression. Assume for a minute that you are working on a document-management system for a company called Acme Corporation, and that the system needs to implement versioning of documents. As part of this, the system validates that each document has a valid file name.

A business rule states that the file name of any file uploaded on a specific day must be prefixed with acm (for Acme), followed by today's date in year, month, day order, with underscores as separators (for example, acm_2016_04_10.txt). Only text files (.txt), Word documents (only .docx), and Excel documents (only .xlsx) are allowed. Any documents not conforming to this file name format are processed by another method that takes care of archived and invalid documents.

The only task that your method needs to perform is to process fresh documents as version one documents.

Note

In a production system, further logic will probably be needed to determine whether the same document has been uploaded previously on the same day. This, however, is beyond the scope of this chapter. We are just trying to set the scene.

Getting ready

Ensure that you have imported the correct namespace in your code file. At the top of your code file, add the following using directive if you haven't already done so:

using System.Text.RegularExpressions;

How to do it…

  1. A really nice way to do this is to use an extension method. This way, you can call the extension method directly on the file name variable and have it validated. In your Recipes.cs file, start off by adding a new class called CustomRegexHelper with the public static modifier:
    public static class CustomRegexHelper
    {
        
    
    }
  2. Add the usual extension method code to the CustomRegexHelper class and name the method ValidAcmeCompanyFilename:
    public static bool ValidAcmeCompanyFilename(this String value)
    {
            
    }
  3. Inside your ValidAcmeCompanyFilename method, add the following regex. We will explain the makeup of this regex in the How it works… section of this recipe:
    return Regex.IsMatch(value, $@"^acm[_]{DateTime.Now.Year}[_]({DateTime.Now.Month}|0[{DateTime.Now.Month}])[_]({DateTime.Now.Day}|0[{DateTime.Now.Day}])(.txt|.docx|.xlsx)$");
  4. When you have completed this, your extension method should look like this:
    public static class CustomRegexHelper
    {
        public static bool ValidAcmeCompanyFilename(this String value)
        {
            return Regex.IsMatch(value, $@"^acm[_]{DateTime.Now.Year}[_]({DateTime.Now.Month}|0[{DateTime.Now.Month}])[_]({DateTime.Now.Day}|0[{DateTime.Now.Day}])(.txt|.docx|.xlsx)$");
        }
    }
  5. Back in the Recipes class, create a method with the void return type called DemoExtendionMethod():
    public void DemoExtendionMethod()
    {
        
    }
  6. Add some output text to show the current date and the valid file name types:
    Console.WriteLine($"Today's date is: {DateTime.Now.Year}-{DateTime.Now.Month}-{DateTime.Now.Day}");
    Console.WriteLine($"The file must match: acm_{DateTime.Now.Year}_{DateTime.Now.Month}_{DateTime.Now.Day}.txt including leading month and day zeros");
    Console.WriteLine($"The file must match: acm_{DateTime.Now.Year}_{DateTime.Now.Month}_{DateTime.Now.Day}.docx including leading month and day zeros");
    Console.WriteLine($"The file must match: acm_{DateTime.Now.Year}_{DateTime.Now.Month}_{DateTime.Now.Day}.xlsx including leading month and day zeros");
  7. Then, add the file name checking code:
    string filename = "acm_2016_04_10.txt";
    if (filename.ValidAcmeCompanyFilename())
        Console.WriteLine($"{filename} is a valid file name");
    else
        Console.WriteLine($"{filename} is not a valid file name");
    
    filename = "acm-2016_04_10.txt";
    if (filename.ValidAcmeCompanyFilename())
        Console.WriteLine($"{filename} is a valid file name");
    else
        Console.WriteLine($"{filename} is not a valid file name");
  8. You will notice that the if statement contains the call to the extension method on the variable that contains the file name:
    filename.ValidAcmeCompanyFilename()
  9. If you have completed this, your method should look like this:
    public void DemoExtendionMethod()
    {
        Console.WriteLine($"Today's date is: {DateTime.Now.Year}-{DateTime.Now.Month}-{DateTime.Now.Day}");
        Console.WriteLine($"The file must match: acm_{DateTime.Now.Year}_{DateTime.Now.Month}_{DateTime.Now.Day}.txt including leading month and day zeros");
        Console.WriteLine($"The file must match: acm_{DateTime.Now.Year}_{DateTime.Now.Month}_{DateTime.Now.Day}.docx including leading month and day zeros");
        Console.WriteLine($"The file must match: acm_{DateTime.Now.Year}_{DateTime.Now.Month}_{DateTime.Now.Day}.xlsx including leading month and day zeros");
    
        string filename = "acm_2016_04_10.txt";
        if (filename.ValidAcmeCompanyFilename())
            Console.WriteLine($"{filename} is a valid file name");
        else
            Console.WriteLine($"{filename} is not a valid file name");
    
        filename = "acm-2016_04_10.txt";
        if (filename.ValidAcmeCompanyFilename())
            Console.WriteLine($"{filename} is a valid file name");
        else
            Console.WriteLine($"{filename} is not a valid file name");
    }
  10. Going back to the console application, add the following code, which simply calls the void method. This simulates the versioning method discussed earlier:
    Chapter9.Recipes oRecipe = new Chapter9.Recipes();
    oRecipe.DemoExtendionMethod();
    Read();
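  11. When you are done, run your console application. The console shows today's date, the expected file names, and the validation result for each of the two test file names.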
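
Because the pattern is built from DateTime.Now, the sample file names in step 7 only validate when the code runs on 10 April 2016. A minimal sketch of one way to check the recipe on any day is to pass the date in as a parameter. The class AcmeFilenameForDate and the methods BuildPattern and IsValidAcmeFilenameFor are hypothetical names introduced here for illustration; the pattern itself is the one from step 3:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the recipe: it builds the same pattern
// as step 3, but from a supplied date instead of DateTime.Now.
public static class AcmeFilenameForDate
{
    // Returns the regex pattern the recipe would generate on the given date.
    public static string BuildPattern(DateTime date)
    {
        return $@"^acm[_]{date.Year}[_]({date.Month}|0[{date.Month}])[_]({date.Day}|0[{date.Day}])(.txt|.docx|.xlsx)$";
    }

    // Extension method mirroring ValidAcmeCompanyFilename with an explicit date.
    public static bool IsValidAcmeFilenameFor(this string value, DateTime date)
    {
        return Regex.IsMatch(value, BuildPattern(date));
    }
}
```

With new DateTime(2016, 4, 10) as the date, BuildPattern returns ^acm[_]2016[_](4|0[4])[_](10|0[10])(.txt|.docx|.xlsx)$, so acm_2016_04_10.txt is reported as valid and acm-2016_04_10.txt as invalid, whatever day the code runs on.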

How it works…

Let's have a closer look at the regex generated. The line of code we are looking at is the return statement in the extension method:

return Regex.IsMatch(value, $@"^acm[_]{DateTime.Now.Year}[_]({DateTime.Now.Month}|0[{DateTime.Now.Month}])[_]({DateTime.Now.Day}|0[{DateTime.Now.Day}])(.txt|.docx|.xlsx)$");

To appreciate what is happening, we need to break this expression up into the different components:

The conditional OR

|

This denotes the OR metacharacter.

The file prefix and separator

acm

The file must begin with the text acm.

[_]

The only valid separator between the date components and the prefix in the file name is an underscore.

The date parts

{DateTime.Now.Year}

The interpolated year part of the date for the file name.

{DateTime.Now.Month}

The interpolated month part of the date for the file name.

0[{DateTime.Now.Month}]

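The interpolated month part of the date with a leading zero for the file name. The square brackets form a character class, so this alternative is only exact for single-digit months; for month 12, 0[12] matches 01 or 02 rather than a zero-padded 12.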

{DateTime.Now.Day}

The interpolated day part of the date for the file name.

0[{DateTime.Now.Day}]

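The interpolated day part of the date with a leading zero for the file name. Because the square brackets form a character class, this alternative is only exact for single-digit days.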

Valid file formats

(.txt|.docx|.xlsx)

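Match any of these file extensions for text documents, Word documents, or Excel documents. The dot is not escaped here, so it matches any single character; write \. to require a literal dot.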

Start and end of string

^

Anchors the match to the beginning of the given string.

$

Anchors the match to the end of the given string.

Building the regex in this manner keeps it up to date automatically. Because the file being validated must always match the current date, a fixed pattern would not work; string interpolation, DateTime, and the regex OR metacharacter together overcome this challenge easily.
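
If you need the check to hold for two-digit months and days as well, a stricter pattern is one option. The following is only a sketch under stated assumptions, not the recipe's own code: StrictAcmeFilename, BuildPattern, and IsValid are hypothetical names, and the date is passed in as a parameter. It escapes the dot and produces the zero-padded alternative with the 00 format specifier instead of a character class:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stricter variant; the names are illustrative and not from the recipe.
public static class StrictAcmeFilename
{
    public static string BuildPattern(DateTime date)
    {
        // {date.Month:00} yields the zero-padded form (for example 04 or 12);
        // the second alternative allows the unpadded form (for example 4).
        // \. matches a literal dot, and (?:...) groups without capturing.
        return $@"^acm_{date.Year}_(?:{date.Month:00}|{date.Month})_(?:{date.Day:00}|{date.Day})\.(?:txt|docx|xlsx)$";
    }

    public static bool IsValid(string value, DateTime date)
    {
        return Regex.IsMatch(value, BuildPattern(date));
    }
}
```

For 5 December 2016, this sketch accepts acm_2016_12_05.txt and acm_2016_12_5.docx, and rejects acm_2016_01_05.txt and acm_2016_12_05Xtxt. Whether the unpadded form should be accepted at all depends on how strictly you read the business rule.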

This chapter has looked at some of the more useful bits of regex, but it has not even begun to scratch the surface of what can be accomplished. There is a whole lot more to explore and learn. Many resources are available on the Internet, along with free tools (some of them online) and commercial tools that will assist you in creating regex.
