regex for html attribute value in python

To extract an HTML attribute value with a regular expression in Python, you can use the built-in re module.

Here's an example code snippet:

main.py
import re

# Sample HTML string
html = '<a href="https://www.example.com/">Example</a>'

# Regex pattern to capture the value of the href attribute
pattern = r'href="([^"]*)"'

# Use re.search to find the pattern in the HTML string
match = re.search(pattern, html)

# The captured attribute value is available in group(1)
if match:
    href_value = match.group(1)
    print(href_value)

Output: https://www.example.com/

The pattern matches the literal text href=" followed by zero or more characters that are not ", which the parentheses () capture as group 1. The final " marks the end of the attribute value. re.search scans the HTML string for the first occurrence of this pattern, and match.group(1) returns the captured value. Note that this pattern only handles values wrapped in double quotes and only returns the first match.
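The same idea can be extended to any attribute name and to both quote styles. The following is a minimal sketch, not a full HTML parser: the helper name get_attribute_values is hypothetical, and it assumes attribute values are always quoted (unquoted values such as href=/about are not handled).

```python
import re


def get_attribute_values(html, attribute):
    """Return every quoted value of `attribute` in `html` (hypothetical helper)."""
    pattern = (
        r'(?<![\w-])'            # do not match inside longer names such as data-href
        + re.escape(attribute)   # treat the attribute name as literal text
        + r'\s*=\s*'             # optional whitespace around the equals sign
        + r'(["\'])(.*?)\1'      # opening quote, lazy value, same closing quote
    )
    flags = re.IGNORECASE | re.DOTALL
    # group(1) is the quote character, group(2) is the attribute value
    return [m.group(2) for m in re.finditer(pattern, html, flags)]


html = '<a href="https://www.example.com/">Example</a> <a href=\'/about\'>About</a>'
print(get_attribute_values(html, "href"))
# ['https://www.example.com/', '/about']
```

The backreference \1 forces the closing quote to match the opening one, so a value like "it's" is captured whole. For messy or untrusted HTML, a real parser such as the standard library's html.parser is generally a safer choice than a regex.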
