Linear Regression - Statsmodels

Scenario :

You are working with a car insurance firm in Sweden and want to gain insight into insurance payouts. You have obtained a publicly available dataset about a competitor, which covers the number of claims in each geographical area and the total payout in that area.

Task :

Build a linear regression model that predicts the total payout for a given number of claims. This will allow us to compare our own payouts with our competitor's.
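
The model fitted here is simple linear regression, Payment ≈ intercept + slope × Claims. As a minimal sketch of what the least-squares fit computes, the closed-form estimates can be worked out by hand. The numbers below are made up for illustration and are not taken from the insurance dataset.

```python
# Toy data, invented for illustration only (not from the insurance dataset)
claims = [10, 20, 30, 40, 50]
payment = [55, 88, 125, 160, 190]

n = len(claims)
mean_x = sum(claims) / n
mean_y = sum(payment) / n

# slope = sum of cross-deviations / sum of squared x-deviations
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(claims, payment))
sxx = sum((x - mean_x) ** 2 for x in claims)
slope = sxy / sxx

# The fitted line always passes through the point (mean_x, mean_y)
intercept = mean_y - slope * mean_x

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

Statsmodels performs the same estimation and also reports standard errors, p-values and R-squared.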

###Importing Required Packages and Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plot
import statsmodels.api as stats

#Reading the datafile
insurance_df = pd.read_csv("/Users/maheshg/Downloads/SwedishMotorInsurance.csv")

print(insurance_df.shape)
#Assessing the data:
insurance_df.head()
print(insurance_df.head(n= 10))
print(insurance_df['Claims'])

# Keep only the two columns used in the model
insurance_df_new = insurance_df[['Claims', 'Payment']]
print(insurance_df_new)
print(insurance_df_new.head())
###Visualizing the data
plot.scatter(insurance_df.Claims,insurance_df.Payment)
plot.xlabel('Claims')
plot.ylabel('Payment [100k Kroner]')
plot.show()
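
A scatter plot shows the relationship visually. The Pearson correlation puts a number on how linear it is. The sketch below uses a small made-up frame (`toy_df` is a hypothetical name) so that it runs without the CSV file; the same call should work on `insurance_df`, assuming its columns are named `Claims` and `Payment` as in the code above.

```python
import pandas as pd

# Toy data, invented for illustration only (not from the insurance dataset)
toy_df = pd.DataFrame({
    "Claims": [10, 20, 30, 40, 50],
    "Payment": [55, 88, 125, 160, 190],
})

# Series.corr computes the Pearson correlation by default
r = toy_df["Claims"].corr(toy_df["Payment"])
print(f"Pearson correlation: {r:.4f}")
```

A value close to 1 suggests that a straight line is a reasonable first model.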
###Fitting the regression model :
y_insurance = insurance_df.Payment
x_insurance = stats.add_constant(insurance_df['Claims'])

model_insurance = stats.OLS(y_insurance,x_insurance)
result_insurance = model_insurance.fit()

print(result_insurance.summary())

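###Plotting the results:
# Observed data points
plot.scatter(insurance_df.Claims, insurance_df.Payment, label='Observed')
# Fitted regression line from the OLS result
plot.plot(insurance_df.Claims, result_insurance.fittedvalues, color='red', label='Fitted line')

# Combined chart: observed points and fitted line on the same axes
plot.xlabel("Claims")
plot.ylabel("Payment [100k Kroner]")
plot.legend()
plot.show()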