select dataframe by columns in python

To select a subset of columns from a pandas DataFrame, pass a column name or a list of column names to the indexing operator [], or use .loc. The example below shows several ways to do this:

main.py
# import pandas library
import pandas as pd

# create a sample dataframe
data = {'Name': ['John', 'Emily', 'Kate'], 'Age': [25, 30, 35], 'Gender': ['M', 'F', 'F']}
df = pd.DataFrame(data)

# select single column
df['Name']

# select multiple columns
df[['Name', 'Age']]

# filter rows using boolean indexing (all columns are kept)
df[df['Gender'] == 'F']

# select all rows and specific columns using .loc
df.loc[:, ['Name', 'Gender']]

In the first example, df['Name'] selects the single column 'Name' and returns it as a Series. In the second example, passing a list of column names, df[['Name', 'Age']], returns a DataFrame containing those two columns. The third example filters rows rather than columns: the boolean mask df['Gender'] == 'F' keeps the rows where Gender is 'F' and returns all columns. Finally, df.loc[:, ['Name', 'Gender']] selects all rows and the columns 'Name' and 'Gender'.
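As a short sketch building on the same sample data, the snippet below shows the difference between selecting with a single label and with a one-element list, and how .loc can combine a row filter with a column selection in one step. The variable names (name_series, name_frame, females) are chosen here for illustration only.

```python
import pandas as pd

# same sample dataframe as above
data = {'Name': ['John', 'Emily', 'Kate'], 'Age': [25, 30, 35], 'Gender': ['M', 'F', 'F']}
df = pd.DataFrame(data)

# a single label returns a Series; a list of labels returns a DataFrame
name_series = df['Name']
name_frame = df[['Name']]

# filter rows and select columns in a single .loc call
females = df.loc[df['Gender'] == 'F', ['Name', 'Age']]

print(type(name_series).__name__)
print(type(name_frame).__name__)
print(females)
```

Using one .loc call for both the row mask and the column list avoids chaining two indexing operations, which is usually the safer pattern if the result will be modified later.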
