Compare substrings and find the best match from a dictionary in Python

To find the entry in a dictionary (here, a plain list of words) that best matches a given substring in Python, one approach is to score every candidate with a string similarity metric such as Levenshtein distance or Jaccard similarity. Here's an example that uses the Levenshtein distance metric:

main.py
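import Levenshtein  # third-party package: pip install Levenshtein

def find_best_match(substring, dictionary):
    """Return the word in dictionary closest to substring, or None if it is empty."""
    best_match = None
    best_distance = float('inf')
    for word in dictionary:
        # Number of single-character edits needed to turn substring into word
        distance = Levenshtein.distance(substring, word)
        if distance < best_distance:  # strict '<' keeps the first word on ties
            best_distance = distance
            best_match = word
    return best_match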

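Here, Levenshtein.distance computes the Levenshtein distance between the substring and each word in the dictionary, that is, the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The function returns the word with the lowest distance, which is the best match for the substring. If several words tie, the first one in the dictionary wins, and an empty dictionary yields None.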
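
To make the metric itself concrete, here is a minimal pure-Python sketch of the standard dynamic-programming computation of Levenshtein distance. The function name levenshtein_distance is chosen for illustration and is not part of the Levenshtein package; the package's own implementation is compiled and will be much faster, so treat this as a way to see what is being measured, or as a fallback when installing a dependency is not an option.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # Only the previous row of the dynamic-programming table is kept.
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance('appl', 'apple'))  # 1: one insertion ('e')
```

Under that assumption, swapping levenshtein_distance in for Levenshtein.distance inside find_best_match should give the same ranking, only more slowly.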

To use this function, simply pass in the substring and the dictionary:

main.py
dictionary = ['apple', 'banana', 'orange', 'peach', 'pear']
substring = 'appl'
best_match = find_best_match(substring, dictionary)
print(best_match)  # output: apple

In this example, the substring is 'appl', and the best match in the dictionary is 'apple' (which has a Levenshtein distance of 1 from the substring).
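
Jaccard similarity, the other metric mentioned at the start, can be sketched in a similar way. The version below is one possible formulation, not the only one: it assumes each string is represented by its set of character bigrams (adjacent pairs), and the helper names bigrams, jaccard_similarity, and find_best_match_jaccard are hypothetical. Unlike a distance, a higher score means a closer match.

```python
def bigrams(text: str) -> set:
    """Return the set of adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard_similarity(a: str, b: str) -> float:
    """Return the size of the bigram intersection divided by the size of the union."""
    set_a, set_b = bigrams(a), bigrams(b)
    union = set_a | set_b
    if not union:
        return 0.0  # both strings are shorter than two characters
    return len(set_a & set_b) / len(union)


def find_best_match_jaccard(substring, dictionary):
    # Highest similarity wins. max() keeps the first word on ties,
    # and default=None covers an empty dictionary.
    return max(
        dictionary,
        key=lambda word: jaccard_similarity(substring, word),
        default=None,
    )


dictionary = ['apple', 'banana', 'orange', 'peach', 'pear']
print(find_best_match_jaccard('appl', dictionary))  # apple (similarity 0.75)
```

Because this sketch compares sets of bigrams, it ignores character order beyond adjacent pairs and ignores repeats, so it can rank candidates differently from Levenshtein distance on other inputs.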
