Linear Regression: Linear and Non-Linear Models Using Mastercard Stock Data

Mastercard Stock Data Performance and Linear Regression

Problem Statement: Mastercard's stock price fluctuates due to various market forces, including macroeconomic indicators, investor sentiment, and company performance metrics. Understanding these movements is crucial for investors and analysts aiming to make informed decisions.

This study seeks to explore and compare linear regression and non-linear regression techniques in analyzing Mastercard's stock trends. Without fitting a model, the focus will be on assessing whether a linear relationship sufficiently explains stock price variations or whether a non-linear approach offers a more precise representation of underlying patterns.

Step 1 : Import the required libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

Step 2 : Read the dataset and perform Exploratory Data Analysis

df_mastercard = pd.read_csv('/Users/Mastercard_stock_dividends.csv')
df_mastercard_new = pd.DataFrame([df_mastercard.Dividends])
df_mastercard_new.head()
df_mastercard_new.T
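
Before using the Dividends column, it can help to check its size, missing values, and range. The sketch below is one way to do that. The helper summarize_column and the small df_example frame are hypothetical, and the values in it are made up for illustration. They are not real Mastercard data.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; values are illustrative only, not real data.
df_example = pd.DataFrame({
    "Date": ["2020-01-08", "2020-04-08", "2020-07-08", "2020-10-08"],
    "Dividends": [0.40, 0.40, None, 0.44],
})

def summarize_column(df, column):
    """Return basic EDA facts (row count, missing count, range, mean) for one numeric column."""
    series = df[column]
    return {
        "rows": len(series),
        "missing": int(series.isna().sum()),
        "min": float(series.min()),   # NaN values are skipped by default
        "max": float(series.max()),
        "mean": float(series.mean()),
    }

summary = summarize_column(df_example, "Dividends")
print(summary)
```

The same helper could be called on df_mastercard once the CSV is loaded, assuming it has a numeric Dividends column.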

Step 3 : Assume slope and intercept values based on the market index

my_slope = 0.10
my_intercept = 0.5

Step 4 : Calculate the trend (average output) for the above dataframe:

df_mastercard_new.head()
# Transpose so that 'Dividends' becomes a column
df_mastercard_new = df_mastercard_new.T
# Linear trend: intercept + slope * x
df_mastercard_new['trend'] = my_intercept + my_slope * df_mastercard_new['Dividends']
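
The trend in this step is a straight line: the intercept plus the slope times the dividend value. A minimal sketch of that calculation as a reusable function follows. The function name linear_trend and the three input values are assumptions made for illustration only.

```python
import numpy as np

def linear_trend(x, intercept, slope):
    """Evaluate trend = intercept + slope * x elementwise."""
    return intercept + slope * np.asarray(x, dtype=float)

# Hypothetical dividend values, for illustration only
dividends = np.array([0.0, 0.4, 1.0])
trend = linear_trend(dividends, intercept=0.5, slope=0.10)
print(trend)
```

With an intercept of 0.5 and a slope of 0.10, a dividend of 0 maps to the intercept itself, and each unit increase in the dividend raises the trend by 0.10.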

Step 5 : Visualize the data and regression line

sns.set_style('whitegrid')
sns.relplot(data=df_mastercard_new, x = 'Dividends', y = 'trend', kind='line')
plt.show()
fig, ax = plt.subplots()

ax.plot(df_mastercard_new.Dividends, df_mastercard_new.trend, color = 'r')
ax.set_xlabel('Dividend')
ax.set_ylabel('Stocks')
plt.show()

Step 6 : Apply the 1-sigma and 2-sigma rules

my_sigma = 0.152
df_mastercard_new['lwr_obs_68'] = df_mastercard_new.trend - my_sigma
df_mastercard_new['upr_obs_68'] = df_mastercard_new.trend + my_sigma
 
df_mastercard_new['lwr_obs_95'] = df_mastercard_new.trend - 2 * my_sigma
df_mastercard_new['upr_obs_95'] = df_mastercard_new.trend + 2 * my_sigma
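
The column suffixes 68 and 95 come from the empirical rule for a normal distribution: about 68% of observations fall within one sigma of the trend, and about 95% within two. The sketch below checks this by simulation, assuming normally distributed noise. The seed and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
sigma = 0.152                   # same value as my_sigma in this step
noise = rng.normal(loc=0.0, scale=sigma, size=100_000)

# Fraction of simulated deviations that fall inside each band
within_1_sigma = np.mean(np.abs(noise) <= sigma)
within_2_sigma = np.mean(np.abs(noise) <= 2 * sigma)
print(f"within 1 sigma: {within_1_sigma:.3f}")
print(f"within 2 sigma: {within_2_sigma:.3f}")
```

If the real deviations from the trend are not roughly normal, the 68% and 95% labels are only approximate.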

Step 7 : Visualize the data and the uncertainty around the trend line

fig, ax = plt.subplots()

### True Trend 
ax.plot(df_mastercard_new.Dividends, df_mastercard_new.trend, color = 'crimson', linewidth = 1.5)

### Variation around the trend line using 2 sigma interval 
ax.fill_between(df_mastercard_new.Dividends, df_mastercard_new.lwr_obs_95, df_mastercard_new.upr_obs_95, facecolor = 'crimson', alpha = 0.35)


### Variation around the trend line using 1 sigma interval 
ax.fill_between(df_mastercard_new.Dividends, df_mastercard_new.lwr_obs_68, df_mastercard_new.upr_obs_68, facecolor = 'crimson', alpha = 0.35)

### Set Labels
ax.set_xlabel('Dividend Value')
ax.set_ylabel('Stocks')

### Show the Plot 
plt.show()

Next, consider a non-linear relationship. Here the trend is a non-linear (sine) function of x, yet it is still linear in its two parameters, the intercept and the slope.
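
The two trends compared in this study can be written side by side. The notation is an assumption introduced here for illustration: \beta_0 stands for my_intercept, \beta_1 for my_slope, \mu(x) for the trend, and \sigma for my_sigma.

```latex
\begin{align}
\text{linear trend:} \quad & \mu(x) = \beta_0 + \beta_1 x \\
\text{sine trend:} \quad & \mu(x) = \beta_0 + \beta_1 \sin(x) \\
\text{observations:} \quad & y \mid x \sim \mathcal{N}\big(\mu(x), \sigma^2\big)
\end{align}
```

Both trends are linear in \beta_0 and \beta_1. Only the feature changes, from x to \sin(x).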

Repeat the same steps as in the linear case, changing only the trend formula:
my_intercept = 0.25
my_slope = -2.25
# Random generator for the simulated observations; the seed is arbitrary
rg = np.random.default_rng(2024)
df_mastercard_new = pd.DataFrame({'x': np.linspace(-np.pi, np.pi, num=101)})
# Non-linear trend: sine of x, still linear in the intercept and slope
df_mastercard_new['trend'] = my_intercept + my_slope * np.sin(df_mastercard_new.x)
# Simulated noisy observations around the trend
df_mastercard_new['y'] = rg.normal(loc=df_mastercard_new.trend, scale=my_sigma, size=df_mastercard_new.shape[0])
sns.set_style('whitegrid')
fig, ax = plt.subplots()

ax.plot(df_mastercard_new.x, df_mastercard_new.trend, color = 'crimson',linewidth = 1.5)
ax.set_xlabel('x')
ax.set_ylabel('Trend')
plt.show()
sns.relplot(data=df_mastercard_new, x = 'x', y= 'trend')
plt.show()

Use Case : Setting sigma to 0.55 gives the uncertainty bands around the trend line

my_sigma = 0.55
df_mastercard_new['obs_lwr_68'] = df_mastercard_new.trend - my_sigma
df_mastercard_new['obs_upr_68'] = df_mastercard_new.trend + my_sigma
df_mastercard_new['obs_lwr_95'] = df_mastercard_new.trend - 2* my_sigma
df_mastercard_new['obs_upr_95'] = df_mastercard_new.trend + 2* my_sigma 
df_mastercard_new
fig, ax = plt.subplots()

### True Trend 
ax.plot(df_mastercard_new.x, df_mastercard_new.trend, color = 'crimson', linewidth = 1.5)

### Variation around the trend - 2 sigma interval 
ax.fill_between(df_mastercard_new.x, df_mastercard_new.obs_lwr_95, df_mastercard_new.obs_upr_95, color = 'crimson', alpha = 0.2)

### Variation around the trend - 1 sigma interval
ax.fill_between(df_mastercard_new.x, df_mastercard_new.obs_lwr_68, df_mastercard_new.obs_upr_68, color = 'crimson', alpha = 0.5)

### Set labels and show the plot
ax.set_xlabel('X')
ax.set_ylabel('Y')
plt.show()

Conclusion :

The above plot visually represents the expected relationship between variables X and Y, with a sine wave-like curve and shaded uncertainty regions. In the context of the linear vs. non-linear regression analysis of Mastercard stock data trends, a complete comparison would include:

  • A graph comparing linear and non-linear fits to stock price trends.

  • Confidence intervals showing variability in predictions.

  • Residual analysis to assess model accuracy.

  • A conclusion on whether a non-linear approach provides a better fit compared to a linear model.

This kind of plot helps visualize whether stock price movements are best captured by a simple linear trend or whether a more complex, non-linear model better explains the fluctuations.
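
The study above assumes its parameters rather than fitting them. As a rough sketch of how the residual comparison listed in the conclusion could be carried out, the code below simulates observations around the sine trend and fits both a linear and a sine feature by least squares. The data is simulated, not real stock data. The helper fit_and_rmse and the seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
x = np.linspace(-np.pi, np.pi, num=101)
# Simulated observations around the sine trend used above (not real stock data)
y = 0.25 - 2.25 * np.sin(x) + rng.normal(scale=0.55, size=x.size)

def fit_and_rmse(feature, y):
    """Least-squares fit of y = b0 + b1 * feature; returns (b0, b1, rmse)."""
    design = np.column_stack([np.ones_like(feature), feature])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return coef[0], coef[1], float(np.sqrt(np.mean(residuals ** 2)))

b0_lin, b1_lin, rmse_lin = fit_and_rmse(x, y)
b0_sin, b1_sin, rmse_sin = fit_and_rmse(np.sin(x), y)
print(f"linear feature x:    rmse = {rmse_lin:.3f}")
print(f"sine feature sin(x): rmse = {rmse_sin:.3f}")
```

A lower residual error for the sine feature indicates that the non-linear trend describes this simulated data better. On real stock data the outcome could differ.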