Source-generated RegEx (C#)

This content originally appeared on DEV Community and was authored by Karen Payne

Introduction

Learn how source-generated regular expressions (.NET 7 and higher) improve performance and provide generated documentation.

  • This article is deliberately short and skips benchmarks; the explanations and accompanying source code provide plenty to learn from.

  • Creating a source-generated regular expression is a manual process, although JetBrains ReSharper, when installed, assists with the creation process and even recommends useful names.

Source code (.NET 8)

The source project contains two classes with several helpful examples to learn from.

  • Credit card masking.
  • Remove extra spaces from strings.
  • Conditionally extracting parts from hyperlinks.
  • Uncommon date extraction from a string with multiple dates.
  • Method to increment alphanumeric strings.
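
As a rough illustration of two items from the list, the sketch below collapses runs of whitespace and masks all but the last four digits of a card number. It is not taken from the article's source project: the class name TextCleanupSketch, the method names, and both patterns are assumptions chosen for illustration only.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; names and patterns are illustrative assumptions.
public static partial class TextCleanupSketch
{
    // One or more whitespace characters in a row.
    [GeneratedRegex(@"\s+")]
    private static partial Regex WhitespaceRunRegex();

    // A digit that is followed by at least four more digits,
    // allowing non-digit separators (spaces, dashes) in between.
    [GeneratedRegex(@"\d(?=(?:\D*\d){4})")]
    private static partial Regex MaskableDigitRegex();

    // Replaces each whitespace run with a single space and trims the ends.
    public static string CollapseSpaces(string source)
        => WhitespaceRunRegex().Replace(source, " ").Trim();

    // Replaces every digit except the last four with an asterisk.
    public static string MaskCardNumber(string cardNumber)
        => MaskableDigitRegex().Replace(cardNumber, "*");
}
```

Under these assumptions, TextCleanupSketch.MaskCardNumber("1234-5678-1234-5678") keeps the separators and leaves only the final group of digits readable.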

Performance

Rather than repeat a great resource, see Regular Expression Improvements in .NET 7 by Stephen Toub - MSFT.

Implementation

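To use source-generated regular expressions, create a partial class (static in this example, since it hosts an extension method) and apply the GeneratedRegex attribute to a partial method that returns Regex.

A simple example that capitalizes the first letter of each sentence in a string.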

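using System.Text.RegularExpressions;

namespace GeneratedRegexSamplesApp.Classes;

public static partial class Helpers
{
    // Lowercases the input, then uppercases the first letter of the string
    // and each match of a period, whitespace, and the character that follows.
    public static string ProperCased(this string source)
        => SentenceCaseRegex()
            .Replace(source.ToLower(), s => s.Value.ToUpper());

    // The body of this partial method is supplied by the source generator at build time.
    [GeneratedRegex(@"(^[a-z])|\.\s+(.)", RegexOptions.ExplicitCapture)]
    private static partial Regex SentenceCaseRegex();
}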

After adding the class to a project, SentenceCaseRegex() will show a red squiggly underline until the project is built.

Once the project is built, the generated source code can be viewed under Dependencies ➡️ Analyzers.

Shows RegexGenerator.g.cs
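
To see how the generated method relates to a conventional regular expression, the sketch below runs the same sentence-casing pattern both ways. The class name SentenceCaseDemo and its members are hypothetical, and the invariant-culture casing calls are a choice made here to keep the result independent of the machine's culture; the article's own sample uses ToLower and ToUpper.

```csharp
using System.Text.RegularExpressions;

// Hypothetical comparison class; not part of the article's source project.
public static partial class SentenceCaseDemo
{
    private const string Pattern = @"(^[a-z])|\.\s+(.)";

    // Source-generated: the implementation is emitted as C# at build time.
    [GeneratedRegex(Pattern, RegexOptions.ExplicitCapture)]
    private static partial Regex Generated();

    // Conventional: the pattern is parsed and compiled at run time.
    private static readonly Regex Conventional =
        new(Pattern, RegexOptions.ExplicitCapture | RegexOptions.Compiled);

    public static string WithGenerated(string source)
        => Generated().Replace(source.ToLowerInvariant(), m => m.Value.ToUpperInvariant());

    public static string WithConventional(string source)
        => Conventional.Replace(source.ToLowerInvariant(), m => m.Value.ToUpperInvariant());
}
```

Both methods should return the same text for the same input, for example "Hello world. This is a test." for "hello WORLD. this is a test."; the difference lies in when the matching code is produced, not in what it matches.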

Documentation

Source generation has a bonus: XML documentation of the regular expression pattern, which helps in two ways. First, when a developer did not write the pattern, the documentation explains it. Second, when the expression lives in a library, the documentation helps developers decide whether the method using that pattern fits their needs.

To see the documentation, hover over the implementation or the method as shown below.

Shows the explanation for the regular expression

Important: the generated documentation does not replace documentation of the method that uses the regular expression.

A good example is a method that determines whether a Social Security number is valid, where the number is passed with dashes.

SSN validation

Hovering over the method shows the following, which is correct but does not explain the why.

Shows XML documentation

In this case, the developer needs to explain the why (here, to prevent fraud), as shown below.

/// <summary>
/// Is a valid SSN
/// </summary>
/// <returns>True if valid, false if invalid SSN</returns>
/// <remarks>
/// 
/// Guaranteed to never be an empty string or null, client code handles this. 
/// 
/// ^                                       #Start of expression
/// (?!\b(\d)\1+-(\d)\1+-(\d)\1+\b)         #Don't allow all matching digits for every field
/// (?!123-45-6789|219-09-9999|078-05-1120) #Don't allow "123-45-6789", "219-09-9999" or "078-05-1120"
/// (?!666|000|9\d{2})\d{3}                 #Don't allow the SSN to begin with 666, 000 or anything between 900-999
/// -                                       #A dash (separating Area and Group numbers)
/// (?!00)\d{2}                             #Don't allow the Group Number to be "00"
/// -                                       #Another dash (separating Group and Serial numbers)
/// (?!0{4})\d{4}                           #Don't allow last four digits to be "0000"
/// $                                       #End of expression
/// </remarks>
public static bool IsValidSocialSecurityNumber(string value) => SSNValidationRegex().IsMatch(value.Replace("-", ""));
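
The definition of SSNValidationRegex is not shown in the article. As a sketch only, the annotated pattern from the remarks can be assembled into a single generated expression. Because that pattern contains literal dashes, this version matches the input as passed instead of stripping the dashes first. The class and method names below are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical class; the pattern is assembled from the remarks above and
// is an assumption about what the article's SSNValidationRegex contains.
public static partial class SsnSketch
{
    [GeneratedRegex(
        @"^(?!\b(\d)\1+-(\d)\1+-(\d)\1+\b)" +               // reject one digit repeated throughout
        @"(?!123-45-6789|219-09-9999|078-05-1120)" +        // reject well-known invalid numbers
        @"(?!666|000|9\d{2})\d{3}" +                        // area: not 666, 000, or 900-999
        @"-(?!00)\d{2}" +                                   // group: not 00
        @"-(?!0{4})\d{4}$")]                                // serial: not 0000
    private static partial Regex DashedSsnRegex();

    // Expects the dashed form, for example "457-55-5462".
    public static bool IsValidDashedSsn(string value)
        => DashedSsnRegex().IsMatch(value);
}
```

With this pattern, an input without dashes does not match, so any dash stripping would need a pattern written without the literal dashes.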

Summary

This article provided information and code samples showing how to implement source generation for regular expressions, which can perform better than conventionally constructed regular expressions and comes with the bonus of generated XML documentation.

