Best Fit Line in 4 Lines of Code — Linear Regression with Python and SciKit-Learn
Hello everyone! My name is Andrew Fung. In this video, I show you how to generate a line of best fit for a dataset in two ways: by defining the function yourself, and by using the LinearRegression class from the scikit-learn (sklearn) library in Python. Hope you enjoy this tutorial ;)
#python #bestfitline #lineofbestfit #linearregression #machinelearning
Kaggle’s Weight and Height dataset: https://www.kaggle.com/mustafaali96/weight-height
Installation and Setup!
Installing Jupyter Notebook: https://jupyter.readthedocs.io/en/lat...
Sklearn linear regression doc: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
Check out my Github!
https://github.com/Andrew-FungKinHo
Timestamps
0:00 | Introduction
1:19 | Data cleaning
5:49 | Self-defined function method
16:12 | SciKit-Learn method
20:40 | Outro
Full code:
———————————————————————————————
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style
from statistics import mean
from sklearn import linear_model
def best_fit_line(xs, ys):
    # least-squares slope and y-intercept for 1-D numpy arrays xs and ys
    slope = (((mean(xs) * mean(ys)) - mean(xs * ys)) /
             ((mean(xs) * mean(xs)) - mean(xs * xs)))
    y_intercept = mean(ys) - slope * mean(xs)
    return slope, y_intercept
# load in dataframe and select a portion
df = pd.read_csv('weight-height.csv')
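# .copy() makes an independent frame, so the assignments below do not trigger SettingWithCopyWarning
male_df = df[df['Gender'] == 'Male'][:200].copy()
# data cleaning: convert height from inches to cm and weight from pounds to kg
male_df['Height'] = male_df['Height'] * 2.54
male_df['Weight'] = male_df['Weight'] * 0.45359237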
# convert height and weight columns to lists
height_list = male_df['Height'].tolist()
weight_list = male_df['Weight'].tolist()
# convert lists to numpy arrays
xs = np.array(height_list, dtype=np.float64)
ys = np.array(weight_list, dtype=np.float64)
# 1st method: using our own function
# calculate the slope and y-intercept from the arrays
slope, y_intercept = best_fit_line(xs,ys)
# get the regression line from the calculated slope and y-intercept
regression_line = [(slope * x) + y_intercept for x in xs]
# Making predictions
average_man_height = 175.26
average_man_weight = (slope * average_man_height) + y_intercept
# 2nd method: using the scikit-learn library
# Create linear regression object
height_weight = linear_model.LinearRegression()
# Train the model using the training sets
height_weight.fit(xs.reshape(-1,1),ys)
# get the regression line using the model
regression_line = height_weight.predict(xs.reshape(-1,1))
# Making predictions
KSI_height = 180
KSI_weight = height_weight.predict(np.array([[KSI_height]]))[0]
# Plot outputs and plot customization
style.use('seaborn')  # on Matplotlib 3.6 and later this style is named 'seaborn-v0_8'
plt.scatter(xs,ys,label='Data Points', alpha=0.6,color='green',s=75)
plt.scatter(KSI_height,KSI_weight, label='KSI prediction',color='red',s=100)
plt.plot(xs,regression_line,label='Best Fit Line', color='orange',linewidth=4)
plt.title('Height and Weight linear regression')
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.legend()
plt.show()
———————————————————————————————
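A quick sanity check for the self-defined method: the slope is (mean(x) * mean(y) - mean(x*y)) / (mean(x)^2 - mean(x^2)) and the y-intercept is mean(y) - slope * mean(x). The sketch below is not from the video. It re-implements that formula with numpy's own mean, runs it on synthetic stand-in data (an assumption, so that it works without the Kaggle CSV), and cross-checks the result against numpy's np.polyfit, used here as an independent reference in place of scikit-learn. The names demo_xs, demo_ys, ref_slope and ref_intercept are made up for illustration.

```python
import numpy as np


def best_fit_line(xs, ys):
    """Closed-form least-squares slope and y-intercept for 1-D arrays."""
    slope = ((xs.mean() * ys.mean() - (xs * ys).mean())
             / (xs.mean() ** 2 - (xs * xs).mean()))
    y_intercept = ys.mean() - slope * xs.mean()
    return slope, y_intercept


# Synthetic stand-in data (assumption): heights in cm and noisy weights in kg.
rng = np.random.default_rng(0)
demo_xs = rng.uniform(160.0, 195.0, size=200)
demo_ys = 0.9 * demo_xs - 80.0 + rng.normal(0.0, 5.0, size=200)

slope, y_intercept = best_fit_line(demo_xs, demo_ys)

# Degree-1 polynomial fit; coefficients come back highest power first.
ref_slope, ref_intercept = np.polyfit(demo_xs, demo_ys, 1)

print("slopes agree:", np.isclose(slope, ref_slope))
print("intercepts agree:", np.isclose(y_intercept, ref_intercept))
```

If the two fits disagree on your own data, the usual culprits are a typo in the formula or passing plain Python lists, for which xs * ys is not an element-wise product.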
Feel free to drop a like and a comment if you enjoyed the video, and let me know if you want me to do other types of programming videos ;) !!!