pandas read csv error unicodedecodeerror: 'utf-8' codec can't decode byte 0xc7 in position 10: invalid continuation byte in matlab

A UnicodeDecodeError: 'utf-8' codec can't decode byte ... raised while reading a CSV with pandas means the file contains bytes that are not valid UTF-8, which is the encoding read_csv assumes by default. One way to solve this is to pass the file's actual encoding when reading it, like so:

main.py
import pandas as pd

df = pd.read_csv('file.csv', encoding='latin1')

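In the above example, latin1 is given as the encoding of the CSV file. Because latin1 maps every possible byte to a character, it never raises a decode error, but the text will come out wrong if the file was actually written in a different encoding. If the result looks garbled, try other encodings depending on how the file was created; cp1252 is a common choice for files exported on Windows.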
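Trying several encodings by hand can be automated. The following is a minimal sketch, not a pandas feature: the helper name read_csv_with_fallback and the default list of candidate encodings are assumptions chosen for illustration. It tries each encoding in order and returns the first one that decodes without error.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "cp1252", "latin1")):
    """Hypothetical helper: return (DataFrame, encoding) for the first encoding that decodes."""
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc), enc
        except UnicodeDecodeError as err:
            # Remember the failure and move on to the next candidate.
            last_error = err
    # Every candidate failed; re-raise the most recent decode error.
    raise last_error
```

Since latin1 accepts any byte sequence, keeping it last makes it a catch-all. A successful decode only shows that the bytes are valid in that encoding, not that the characters are the intended ones, so it is still worth checking the resulting text by eye.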

If none of the common encodings produces correct text, check how the file was exported or ask the original creator which encoding was used.
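Looking at the raw bytes around the failure can also give a hint about the encoding. The sketch below uses only the standard library; the function name first_bad_byte and the context parameter are hypothetical. It decodes the whole file itself, so the offset it returns is an absolute position in the file.

```python
def first_bad_byte(path, encoding="utf-8", context=10):
    """Hypothetical helper: locate the first byte that fails to decode.

    Returns (offset, surrounding_bytes), or None if the file decodes cleanly.
    """
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        start = max(0, err.start - context)
        return err.start, data[start:err.end + context]
    return None
```

For example, a lone 0xc7 byte sitting between plain ASCII letters suggests a single-byte encoding such as cp1252 or latin1, where that byte is the letter Ç.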
