
Machine Learning Linear Regression Model w/ Python 

This analysis aimed to help a fictitious ecommerce company decide whether to prioritise its mobile app or its website for customer engagement. The project began with data exploration and visualisation to gain insights into the dataset, which consisted of 500 customer records with features such as "Avg. Session Length," "Time on App," "Time on Website," "Length of Membership," and "Yearly Amount Spent."

 

Exploratory data analysis revealed correlations between variables. A jointplot comparing Time on Website and Time on App against Yearly Amount Spent showed a stronger correlation for the app, suggesting that it matters more than the website for generating revenue. Relationships were then examined across the entire dataset, which led to a linear regression model that predicts customer spending from the available features. The model was trained and then evaluated on a test set, giving a Mean Absolute Error (MAE) of 8.15, a Mean Squared Error (MSE) of 104.94, and a Root Mean Squared Error (RMSE) of 10.24. In other words, predictions of yearly spend were typically off by roughly $8 to $10, a relatively small error that supports the model's fit.
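
To make the three error metrics concrete, here is a minimal sketch that computes them by hand in plain Python. The function name `regression_errors` and the sample spend values are made up for illustration and are not taken from the project, which most likely relied on a library for this step. The last line checks that the reported figures are consistent with each other, since RMSE is simply the square root of MSE.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for paired actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    # Residual = actual value minus predicted value, one per record.
    residuals = [actual - predicted for actual, predicted in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    rmse = math.sqrt(mse)
    return mae, mse, rmse


# Hypothetical yearly spend values in dollars, for illustration only.
actual = [500.0, 450.0, 610.0, 520.0]
predicted = [490.0, 460.0, 600.0, 530.0]

mae, mse, rmse = regression_errors(actual, predicted)
print(f"MAE={mae:.2f} MSE={mse:.2f} RMSE={rmse:.2f}")

# Consistency check on the reported figures: sqrt(104.94) rounds to 10.24.
print(round(math.sqrt(104.94), 2))
```

MAE and RMSE are both expressed in the units of the target (dollars of yearly spend), which makes them easier to interpret than MSE. RMSE penalises large misses more heavily, so a gap between the two hints at a few larger errors.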

 

The fitted coefficients show how strongly each feature is associated with customer spending when the other features are held constant. Specifically, a unit increase in "Avg. Session Length" was associated with an increase of approximately $26.06 in spending, while a unit increase in "Time on App" was associated with an increase of about $38.61. By contrast, a unit increase in "Time on Website" was associated with an increase of only $0.22, and a unit increase in "Length of Membership" was associated with a substantial increase of $61.33.
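
The sketch below shows one way to read these coefficients: multiply each coefficient by the change in its feature and sum the results to get the predicted change in yearly spend. The coefficient values come from the text above. The `spending_change` helper and the dictionary layout are illustrative assumptions rather than the project's actual code, and the units of each feature are left abstract because the write-up does not state them.

```python
# Coefficients as reported in the text (dollars of yearly spend per unit).
COEFFICIENTS = {
    "Avg. Session Length": 26.06,
    "Time on App": 38.61,
    "Time on Website": 0.22,
    "Length of Membership": 61.33,
}


def spending_change(feature_deltas):
    """Predicted change in yearly spend for the given feature changes,
    holding every feature not listed in feature_deltas fixed."""
    return sum(COEFFICIENTS[name] * delta for name, delta in feature_deltas.items())


# Compare one extra unit of app time with one extra unit of website time.
app_gain = spending_change({"Time on App": 1.0})
web_gain = spending_change({"Time on Website": 1.0})

print(f"App: ${app_gain:.2f} per unit, Website: ${web_gain:.2f} per unit")
print(f"App effect is about {app_gain / web_gain:.0f} times the website effect")
```

Because the model is linear, the effects are additive: changing several features at once simply sums their individual contributions. The intercept drops out, since only the change in spend is computed.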

 

Given the coefficients and model performance, the analysis suggests that the company should focus on its mobile app ahead of the website, as time spent on the app has a far more pronounced positive association with customer spending. However, it's essential to consider other factors at play and potential qualitative insights from users before making a definitive decision. Continual monitoring and further analysis would provide a more comprehensive understanding.
