print all html elements of a web page that match a regular expression in csharp

main.cs
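using System;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

class Program {
    static void Main(string[] args) {
        string url = "https://example.com";
        string pattern = "<p>"; // or any other regex pattern you want to match

        // Download the page and parse it into a DOM.
        HtmlWeb web = new HtmlWeb();
        HtmlDocument doc = web.Load(url);

        // Select every element inside <body> except <script> elements.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//body//*[not(self::script)]");
        if (nodes == null) {
            Console.WriteLine("No elements found.");
            return;
        }

        foreach (HtmlNode node in nodes) {
            // OuterHtml includes descendants, so ancestors of a matching element also match.
            if (Regex.IsMatch(node.OuterHtml, pattern)) {
                Console.WriteLine(node.OuterHtml);
            }
        }
    }
}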

The code above uses the HtmlAgilityPack library to download a web page and parse its HTML. It then iterates over every element in the body of the page, skipping script elements. For each element, it tests the regex pattern against the element's OuterHtml property and, on a match, prints that element's HTML to the console. Because OuterHtml includes an element's descendants, a container element is also printed whenever something nested inside it matches the pattern.
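If the goal is to print only the matching elements themselves rather than every container around them, one option is to test the regex against each element's tag name instead of its OuterHtml. The following is a minimal sketch of that variation, not part of the original snippet: the `ElementFilter.ByTagName` helper, the inline HTML string, and the `^h[1-6]$` pattern are all hypothetical names and values chosen for illustration, and the HTML is parsed from a string with `HtmlDocument.LoadHtml` so the example does not depend on a network request.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

static class ElementFilter
{
    // Hypothetical helper: returns the elements whose tag name matches tagPattern.
    public static List<HtmlNode> ByTagName(string html, string tagPattern)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of downloading a page

        var regex = new Regex(tagPattern, RegexOptions.IgnoreCase);

        // Descendants() walks the tree in document order and also yields
        // text and comment nodes, so keep element nodes only.
        return doc.DocumentNode
            .Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element && regex.IsMatch(n.Name))
            .ToList();
    }
}

class Demo
{
    static void Main()
    {
        // Assumed sample markup, for illustration only.
        string html = "<body><div><h1>Title</h1><p>Text</p><h2>Sub</h2></div></body>";

        // Match heading elements h1 through h6 by tag name.
        foreach (HtmlNode node in ElementFilter.ByTagName(html, "^h[1-6]$"))
        {
            Console.WriteLine(node.OuterHtml);
        }
    }
}
```

With this approach the surrounding div and body are not reported, because the pattern is never applied to the markup of nested elements. To match on something other than the tag name, such as an attribute value or the element's text, the same filter could test `n.GetAttributeValue("class", "")` or `n.InnerText` instead.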
