It's a pretty common need: a scatter plot of data points with a 'best-fit line' drawn through them.
#!/usr/bin/python
import random
import matplotlib.pyplot as plt
import numpy as np

def run():
    # Build 500 points along y=3x, each shifted up by a random offset of 0..200.
    L = []
    for x in range(0, 500):
        y = random.randint(0, 200) + x * 3
        L.append((x, y))
    out = [(float(x), float(y)) for x, y in L]
    # Draw each point on the scatter plot.
    for i in out:
        plt.scatter(i[0], i[1])
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.title('My Title')
    plt.show()

#---main---
run()
The Python snippet above generates a scatter plot of 500 data points along the line y=3x, each shifted up by a random dY between 0 and 200. Because that offset averages 100, the points actually cluster around y=3x+100. Take a peek at the scatter plot and the linear trend is clear.
Generating a best-fit line is done by:
1) splitting the (x,y) tuples into a list of X values and a list of Y values.
2) plotting a best-fit line using the list of X and list of Y values.
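Step 1 might look like the sketch below. The names `points`, `xs`, and `ys` are illustrative and do not come from the script, and the data values are made up. The `zip(*points)` form is just an alternative to the list comprehensions; both give the same two lists.

```python
# Illustrative data: a few (x, y) tuples shaped like the ones the script builds.
points = [(0, 57), (1, 120), (2, 9), (3, 201)]

# Option A: one list comprehension per coordinate.
xs = [p[0] for p in points]
ys = [p[1] for p in points]

# Option B: unzip in one step; zip(*points) yields tuples, so convert to lists.
xs_alt, ys_alt = (list(t) for t in zip(*points))

print(xs)  # [0, 1, 2, 3]
print(ys)  # [57, 120, 9, 201]
```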
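The fit and the line are drawn with a single plot command. Here Lx and Ly are the lists of X and Y values: np.polyfit fits a degree-1 polynomial (a straight line) to them, np.poly1d turns the fitted coefficients into a function, and that function is evaluated at the sorted X values. The result is plotted as a red ('r') line with a line width ('lw') of 5:
plt.plot(np.unique(Lx), np.poly1d(np.polyfit(Lx, Ly, 1))(np.unique(Lx)), lw=5, color='r')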
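To see what that one-liner does, here is the same chain broken into separate steps, without the plotting. This is only a sketch: the noise-free data on y=3x+100 and the names `xs`, `ys`, `fit_line`, `x_sorted`, and `y_fitted` are assumptions chosen so the result is easy to check by eye.

```python
import numpy as np

# Illustrative, noise-free data on the line y = 3x + 100 (assumed values).
xs = [0, 1, 2, 3, 4]
ys = [100, 103, 106, 109, 112]

# polyfit with degree 1 returns the coefficients highest power first:
# [slope, intercept].
slope, intercept = np.polyfit(xs, ys, 1)

# poly1d wraps the coefficients in a callable: fit_line(x) = slope*x + intercept.
fit_line = np.poly1d([slope, intercept])

# np.unique returns the sorted distinct x values, so the line is evaluated
# left to right with one point per x.
x_sorted = np.unique(xs)
y_fitted = fit_line(x_sorted)

# For this data the slope comes out very close to 3 and the intercept to 100.
print(slope, intercept)
```

With the random data from the script, the slope should still land near 3 and the intercept near 100, but the exact values will differ on every run.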
The final script looks like this:
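#!/usr/bin/python
import random
import matplotlib.pyplot as plt
import numpy as np

def run():
    # Build 500 points along y=3x, each shifted up by a random offset of 0..200.
    L = []
    for x in range(0, 500):
        y = random.randint(0, 200) + x * 3
        L.append((x, y))
    out = [(float(x), float(y)) for x, y in L]
    # Draw each point on the scatter plot.
    for i in out:
        plt.scatter(i[0], i[1])
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.title('My Title')
    # Step 1: split the (x,y) tuples into a list of X values and a list of Y values.
    Lx = [p[0] for p in L]
    Ly = [p[1] for p in L]
    # Step 2: fit a straight line and plot it in red over the scatter plot.
    plt.plot(np.unique(Lx), np.poly1d(np.polyfit(Lx, Ly, 1))(np.unique(Lx)), lw=5, color='r')
    plt.show()

#---main---
run()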
Cheers.