Make a scatterplot in matplotlib, allow the user to make a lasso selection, and save the selection as a DataFrame in Python

To create a scatterplot with lasso selection in matplotlib, you can first import the necessary libraries:

main.py
import matplotlib.pyplot as plt
from matplotlib.widgets import LassoSelector
from matplotlib import path
import pandas as pd

Next, you can create your scatterplot by calling scatter on the axes (ax.scatter):

main.py
fig, ax = plt.subplots()
x_values = [1, 3, 5, 7, 9]
y_values = [2, 4, 6, 8, 10]
scatter = ax.scatter(x_values, y_values)

To allow the user to make a lasso selection on the scatterplot, you can create a LassoSelector object on the axes (ax) that hold the scatterplot, together with a callback that receives the vertices of the lasso:

main.py
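def on_select(verts):
    # verts is the list of (x, y) vertices of the lasso the user drew
    lasso_path = path.Path(verts)
    points = scatter.get_offsets()
    # Boolean mask: True for every point that lies inside the lasso
    selected_points = points[lasso_path.contains_points(points)]
    df = pd.DataFrame(selected_points, columns=['x', 'y'])
    print(df)

# Pass the callback through onselect; assigning lasso.on_select afterwards
# has no effect. Keep a reference to the selector so it stays active.
lasso = LassoSelector(ax, onselect=on_select)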

In the on_select function, we first create a path object from the selected vertices using path.Path(verts). Then, we get the offsets of the scatterplot (i.e. the x and y values) using scatter.get_offsets(). We use the contains_points method of the path object to find the points in the scatterplot that fall within the lasso selection. Finally, we create a pandas DataFrame from the selected points and print it to the console.
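
The selection logic described above can be checked without opening a plot window. The following is a minimal sketch, not part of the original snippet: the helper name points_inside and the hand-written square standing in for a drawn lasso are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from matplotlib.path import Path


def points_inside(verts, points):
    """Return the rows of `points` that fall inside the polygon `verts`.

    Hypothetical helper: it mirrors the body of on_select, but takes the
    points as an argument instead of reading them from the scatterplot.
    """
    lasso_path = Path(verts)
    points = np.asarray(points, dtype=float)
    # contains_points returns one boolean per point
    mask = lasso_path.contains_points(points)
    return pd.DataFrame(points[mask], columns=["x", "y"])


# Same sample data as the scatterplot in this example.
points = list(zip([1, 3, 5, 7, 9], [2, 4, 6, 8, 10]))

# A square standing in for a lasso drawn around the three middle points.
verts = [(2, 3), (8, 3), (8, 9), (2, 9)]

selected = points_inside(verts, points)
print(selected)
```

The path is treated as closed, so the last vertex is joined back to the first, which is also how a freehand lasso outline is interpreted.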

Note that you can replace the print statement with any code that processes the selected data.
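
For instance, to keep the selection around instead of printing it, the callback can store the DataFrame on an object and the result can be written out as CSV afterwards. This is a sketch under assumptions: the class name SelectionStore is hypothetical, and the callback is invoked by hand here to simulate a lasso so the code runs without a GUI.

```python
import io

import numpy as np
import pandas as pd
from matplotlib.path import Path


class SelectionStore:
    """Hypothetical holder for the most recent lasso selection."""

    def __init__(self, points):
        self.points = np.asarray(points, dtype=float)
        # Empty until the first selection is made
        self.df = pd.DataFrame(columns=["x", "y"])

    def on_select(self, verts):
        # Same logic as the on_select callback, but the result is kept
        mask = Path(verts).contains_points(self.points)
        self.df = pd.DataFrame(self.points[mask], columns=["x", "y"])


store = SelectionStore(list(zip([1, 3, 5, 7, 9], [2, 4, 6, 8, 10])))

# Simulate a lasso drawn around the three middle points.
store.on_select([(2, 3), (8, 3), (8, 9), (2, 9)])

# Write the saved selection as CSV (an in-memory buffer here; a file path
# such as "selection.csv" could be passed to to_csv instead).
buffer = io.StringIO()
store.df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

In the interactive script, the bound method would be handed to the selector, for example LassoSelector(ax, onselect=store.on_select), and store.df would hold the latest selection after each lasso is released.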

Here's the full code:

main.py
import matplotlib.pyplot as plt
from matplotlib.widgets import LassoSelector
from matplotlib import path
import pandas as pd

fig, ax = plt.subplots()
x_values = [1, 3, 5, 7, 9]
y_values = [2, 4, 6, 8, 10]
scatter = ax.scatter(x_values, y_values)

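def on_select(verts):
    # verts is the list of (x, y) vertices of the lasso the user drew
    lasso_path = path.Path(verts)
    points = scatter.get_offsets()
    # Boolean mask: True for every point that lies inside the lasso
    selected_points = points[lasso_path.contains_points(points)]
    df = pd.DataFrame(selected_points, columns=['x', 'y'])
    print(df)

# Pass the callback through onselect; assigning lasso.on_select afterwards
# has no effect. Keep a reference to the selector so it stays active.
lasso = LassoSelector(ax, onselect=on_select)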

plt.show()