print all html elements of a web page that contain a phrase in csharp

One possible way to print all HTML elements of a web page that contain a phrase in C# is by using the HtmlAgilityPack library to perform web scraping.

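First, we need to fetch the HTML content of the web page using the WebClient class from the System.Net namespace. (WebClient is marked obsolete in .NET 6 and later, where HttpClient is the recommended replacement, but it still works for a short example.) We can then parse the HTML content using the HtmlDocument class from the HtmlAgilityPack library.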

Once we have the parsed HTML document, we can walk every node in it and use a case-insensitive regular expression, built from the escaped phrase, to check whether the phrase appears in the InnerText of each HTML element. We can then print the outer HTML of any element that matches.
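The matching step can be tried on its own, without any HTML. The following is a minimal sketch, assuming a hypothetical helper named ContainsPhrase (not part of any library) that wraps the same Regex.Escape and RegexOptions.IgnoreCase calls. It shows why escaping matters when the phrase contains characters such as `.` or `$`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhraseMatchDemo
{
    // Hypothetical helper: true when `text` contains `phrase` literally, ignoring case.
    public static bool ContainsPhrase(string text, string phrase)
    {
        // Regex.Escape turns metacharacters such as '.' and '$' into literals.
        var regex = new Regex(Regex.Escape(phrase), RegexOptions.IgnoreCase);
        return regex.IsMatch(text);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsPhrase("Say Hello World!", "hello world")); // True
        Console.WriteLine(ContainsPhrase("Price: $5.00", "$5.00"));           // True
        Console.WriteLine(ContainsPhrase("Price: 5x00", "5.00"));             // False

        // Without escaping, '.' matches any character, so this is a false positive.
        Console.WriteLine(Regex.IsMatch("Price: 5x00", "5.00"));              // True
    }
}
```

If the phrase is always plain text, a case-insensitive string search would do the same job; the regular expression is kept here only to mirror the approach described above.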

Here's some example code:

main.cs
using System;
using System.Net;
using HtmlAgilityPack;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string url = "https://example.com";
        string phrase = "hello world";

        // Fetch the HTML content of the web page
        string html;
        using (var client = new WebClient())
        {
            html = client.DownloadString(url);
        }

        // Parse the HTML content using HtmlAgilityPack
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Use a regular expression to search for the phrase in each HTML element
        var regex = new Regex(Regex.Escape(phrase), RegexOptions.IgnoreCase);
        foreach (var node in doc.DocumentNode.DescendantsAndSelf())
        {
            if (node.NodeType == HtmlNodeType.Element && regex.IsMatch(node.InnerText))
            {
                Console.WriteLine(node.OuterHtml);
            }
        }
    }
}

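This code prints the outer HTML of every element on the page at the given URL whose text contains the specified phrase. Because an element's InnerText includes the text of all its descendants, every ancestor of a matching element (such as body and html) is printed as well.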
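If only the innermost matches are wanted (an element's InnerText also covers its descendants' text, so outer elements match too), one possible variation is to skip any element that has a child element which itself contains the phrase. The sketch below assumes a hypothetical helper named InnermostMatches, parses an inline HTML string instead of downloading a page, and swaps the regular expression for a case-insensitive IndexOf search to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class InnermostMatchDemo
{
    // Hypothetical helper: elements that contain the phrase but have
    // no child element that also contains it.
    public static List<HtmlNode> InnermostMatches(HtmlDocument doc, string phrase)
    {
        bool Contains(HtmlNode n) =>
            n.InnerText.IndexOf(phrase, StringComparison.OrdinalIgnoreCase) >= 0;

        return doc.DocumentNode.Descendants()
            .Where(n => n.NodeType == HtmlNodeType.Element && Contains(n))
            // Checking direct children is enough: if a deeper descendant matched,
            // the child that holds it would match too.
            .Where(n => !n.ChildNodes.Any(c =>
                c.NodeType == HtmlNodeType.Element && Contains(c)))
            .ToList();
    }

    public static void Main()
    {
        // Inline sample markup (an assumption for illustration, not a real page).
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><div><h1>Hello world</h1><span>Bye</span></div></body></html>");

        foreach (var node in InnermostMatches(doc, "hello world"))
        {
            Console.WriteLine(node.OuterHtml);
        }
    }
}
```

For the sample markup this should report only the h1 element rather than html, body, div, and h1. A phrase that spans several child elements, as in `<p>hello <b>world</b></p>`, is still attributed to the enclosing p, since no single child contains the whole phrase.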
