To find emojis in a string using regex in Python, you need to use the re
module for regular expression matching and the emoji
module for handling emojis as Unicode characters. Here's an example code snippet that demonstrates how to do it:
main.py173 chars9 lines
Output:
main.py13 chars2 lines
In the above code snippet, we first import the necessary modules - re
and emoji
. We then define a test string containing some emojis.
To find all the emojis in the string, we use the re.findall
method with the Unicode regex pattern \X
. This regex pattern matches any Unicode character, including emojis.
However, this will include some non-emoji Unicode characters that we need to filter out. For that, we create a list comprehension that filters out only those Unicode characters that are present in the emoji.UNICODE_EMOJI['en']
dictionary. This dictionary contains all the official Unicode emojis for the 'en' (English) locale.
Finally, we print the list of emojis found in the string.
Note that this method may not work for all emojis, as some emojis are composed of multiple Unicode characters, and the \X
pattern will only match one character at a time. In such cases, you may need to use more complex regex patterns to match the entire emoji sequence.
gistlibby LogSnag