
Best Fit Line in 4 Lines of Code — Linear Regression with Python and SciKit-Learn

Hello everyone! My name is Andrew Fung. In this video, I will show you how to generate a line of best fit for a dataset, first by defining the function yourself and then by using the LinearRegression class from Python's scikit-learn (sklearn) library. Hope you enjoy this tutorial ;)

#python #bestfitline #lineofbestfit #linearregression #machinelearning

Kaggle’s Weight and Height dataset: https://www.kaggle.com/mustafaali96/weight-height

Installation and Setup!
Installing Jupyter Notebook: https://jupyter.readthedocs.io/en/lat​...
Sklearn linear regression doc: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

Check out my Github!
https://github.com/Andrew-FungKinHo

Timestamps
0:00 | Introduction
1:19 | Data cleaning
5:49 | Self-defined function method
16:12 | SciKit-Learn method
20:40 | Outro

Full code:
———————————————————————————————
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style
from statistics import mean
from sklearn import linear_model

def best_fit_line(xs,ys):
    # least-squares slope and y-intercept for one-dimensional data
    slope = (((mean(xs) * mean(ys)) - mean(xs * ys)) / ((mean(xs) * mean(xs)) - mean(xs * xs)))
    y_intercept = mean(ys) - slope * mean(xs)
    return slope, y_intercept

# load the dataframe and select the first 200 male rows
df = pd.read_csv('weight-height.csv')
male_df = df[df['Gender'] == 'Male'][:200].copy()  # copy so the column assignments below do not trigger SettingWithCopyWarning

# data cleaning: convert height from inches to cm and weight from pounds to kg
male_df['Height'] = male_df['Height'] * 2.54
male_df['Weight'] = male_df['Weight'] * 0.45359237

# convert height and weight columns to lists
height_list = male_df['Height'].tolist()
weight_list = male_df['Weight'].tolist()

# convert lists to numpy arrays
xs = np.array(height_list, dtype=np.float64)
ys = np.array(weight_list, dtype=np.float64)

# 1st method: using our own function

# calculate the slope and y-intercept from the arrays
slope, y_intercept = best_fit_line(xs,ys)

# get the regression line from the calculated slope and y-intercept
regression_line = [(slope * x) + y_intercept for x in xs]

# Making predictions
average_man_height = 175.26
average_man_weight = (slope * average_man_height) + y_intercept

# 2nd method: using Python's scikit-learn library

# Create linear regression object
height_weight = linear_model.LinearRegression()

# Train the model using the training sets
height_weight.fit(xs.reshape(-1,1),ys)

# get the regression line using the model
regression_line = height_weight.predict(xs.reshape(-1,1))

# Making predictions
KSI_height = 180
KSI_weight = height_weight.predict(np.array([[KSI_height]]))[0]

# Plot outputs and plot customization
# note: Matplotlib 3.6 and later name this style 'seaborn-v0_8'
style.use('seaborn')
plt.scatter(xs,ys,label='Data Points', alpha=0.6,color='green',s=75)
plt.scatter(KSI_height,KSI_weight, label='KSI prediction',color='red',s=100)
plt.plot(xs,regression_line,label='Best Fit Line', color='orange',linewidth=4)
plt.title('Height and Weight linear regression')
plt.xlabel('Height (cm)')
plt.ylabel('Weight (kg)')
plt.legend()
plt.show()
———————————————————————————————
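
As a quick sanity check, the two methods above should give the same line. The sketch below is not part of the video's code: it assumes synthetic stand-in data (the coefficients 0.9 and -80 and the noise level are made up for illustration, not taken from the Kaggle dataset) and uses NumPy's own mean instead of statistics.mean. It compares the hand-written slope and intercept with the coef_ and intercept_ attributes of a fitted LinearRegression.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def best_fit_line(xs, ys):
    """Closed-form least-squares slope and y-intercept for 1-D arrays."""
    slope = (xs.mean() * ys.mean() - (xs * ys).mean()) / (
        xs.mean() ** 2 - (xs ** 2).mean()
    )
    y_intercept = ys.mean() - slope * xs.mean()
    return slope, y_intercept


# Hypothetical stand-in data: heights in cm, weights in kg with random noise.
rng = np.random.default_rng(0)
xs = rng.uniform(150.0, 200.0, size=200)
ys = 0.9 * xs - 80.0 + rng.normal(0.0, 5.0, size=200)

# 1st method: our own function
slope, y_intercept = best_fit_line(xs, ys)

# 2nd method: scikit-learn (expects a 2-D feature array, hence the reshape)
model = LinearRegression().fit(xs.reshape(-1, 1), ys)

# Both pairs should agree up to floating-point rounding.
print("slope:    ", slope, model.coef_[0])
print("intercept:", y_intercept, model.intercept_)
```

If the two printed pairs differ by more than rounding error on your own data, the usual culprit is a typo in the hand-written formula rather than in scikit-learn.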

Feel free to drop a like and a comment if you enjoyed the video, and let me know if you want me to do other types of programming videos ;) !!!

Video "Best Fit Line in 4 Lines of Code — Linear Regression with Python and SciKit-Learn" from the channel Andrew Fung
Video information
February 24, 2021, 19:29:27
00:22:29