Michelle Aviles Final Project Essay

Multiple Regression Project

Michelle Aviles
MGSC 6200 – Information Analysis 50840 Section 4
Lead Faculty – Nizar Zaarour
July 5, 2015

Introduction – The purpose of this project is to use census data and individual store data to predict the sales potential of two possible new store locations, site A and site B.
This project explores an alternative to the method the real estate department currently uses to estimate sales: a multiple regression model.
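For reference, a multiple regression model of this kind is usually written in the general form below. The notation is generic and assumed for illustration, not taken from the project file: $y_i$ is the annual sales of store $i$ (in $1000's), $x_{i1}, \dots, x_{ik}$ are its predictor values, $\beta_0, \dots, \beta_k$ are the unknown coefficients, and $\varepsilon_i$ is the error term. Once estimates $b_0, \dots, b_k$ are obtained from the existing stores, predicted sales for a candidate site come from plugging that site's values into the fitted equation.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i,
\qquad
\hat{y}_{\text{site}} = b_0 + b_1 x_{\text{site},1} + b_2 x_{\text{site},2} + \cdots + b_k x_{\text{site},k}
```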
Data –
The data file provided for this project contains information on 250 Pam and Susan stores and is the basis for the multiple regression model.
The data for current stores include demographic data, economic data, store size, and sales. The information is collected from each store's trading zone, which is a 15-minute driving radius around the store.
Predicted sales are expressed in $1000's, and non-significant data was removed. Competitive types 1, 2 and 7 are statistically significant and were added to the existing 37 variables from the data provided. The data for both site A and site B will be reviewed when estimating potential sales.
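The essay does not say how competitive types 1, 2 and 7 were added as variables. One common approach, assumed here purely for illustration, is to code each of them as a 0/1 indicator (dummy) variable, leaving types 3-6 as the shared baseline. The function and column names below are hypothetical.

```python
# Assumption: competitive types 1, 2 and 7 are coded as 0/1 indicator
# variables. The essay does not confirm this encoding.
SIGNIFICANT_TYPES = (1, 2, 7)


def competitive_type_indicators(comp_type):
    """Return 0/1 indicator columns for competitive types 1, 2 and 7.

    Types 3-6 map to all zeros, so they act as the shared baseline.
    """
    if comp_type not in range(1, 8):
        raise ValueError(f"competitive type must be 1-7, got {comp_type}")
    return {f"comp_type_{t}": int(comp_type == t) for t in SIGNIFICANT_TYPES}


# Example: three hypothetical stores with competitive types 1, 4 and 7.
rows = [competitive_type_indicators(t) for t in (1, 4, 7)]
for row in rows:
    print(row)
```

With this coding, each indicator's regression coefficient would be read as the estimated difference in sales relative to a store of type 3-6, holding the other variables fixed.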
Results and discussion – My first step was to look at the data to see which variables were statistically significant and which could be left out. I used scatter plots and correlation tables to identify the variables that were necessary.
Question 1 (from textbook) – According to the table below, location sites that are likely to have higher sales have a high Spanish-speaking population, a high population with little competition, a low % of households with dryers, a low % with freezers, a low % with air conditioning, and a low % of home ownership. Table 1 lists each variable's correlation to sales.

Table 1: Correlation to Sales

Variable                                   Correlation to sales
% Black                                     .275
% Spanish speaking                          .547
Education (in years)
    0-8                                     .486
    9-11                                    .008
    12 years                               -.238
    12 + years                             -.218
Population                                  .600
Average family size                        -.280
Square feet (1000's)                        .349
Annual Sales (1000's)                      1.000
% good stocked (hard goods)                 .016
Competitive type category number           -.660
Median yearly family income (in $)         -.322
Median rent per month ($)                  -.394
Median home value ($)                       .030
% homeowners                               -.690
% no cars                                   .701
% 1 car                                     .010
% households with TV                       -.058
% households with washer                   -.562
% households with dryer                    -.657
% households with dishwasher               -.491
% households with air conditioning         -.290
% households with freezer                  -.639
% households with second home              -.287
Incomes (in 1000's)
    0-10                                    .615
    10-14                                   .614
    14-20                                   .265
    20-30                                   .310
    30-50                                  -.404
    50-100                                 -.107
    >100                                    .010
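A correlation table like Table 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the toy numbers and column names are hypothetical, not taken from the project file, and the essay does not say which tool produced the table. Each value is the Pearson correlation between one variable and sales, and ranking by absolute value shows which variables are most strongly related to sales.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data for five stores, NOT values from the project file.
stores = {
    "sales": [10.0, 14.0, 9.0, 20.0, 16.0],
    "pct_no_cars": [5.0, 8.0, 4.0, 15.0, 12.0],
    "pct_homeowners": [80.0, 65.0, 85.0, 40.0, 55.0],
}

# Correlation of every non-sales column with sales.
correlations = {
    name: pearson(values, stores["sales"])
    for name, values in stores.items()
    if name != "sales"
}

# Rank variables by the strength (absolute value) of their correlation.
ranked = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, r in ranked:
    print(f"{name:16s} {r:+.3f}")
```

A correlation only measures the pairwise linear relationship with sales, so a variable that looks strong here may still drop out of the regression once related variables are included.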

Question 2 – The 'competitive type' classification method is helpful. I believe it is just as strong as the original method. One of the first steps was to create a scatterplot of sales for the 7 competitive types. Types 1, 2 and 7 show significance, and types 3-6 can be omitted from the rest of the report. Chart 1 is a scatterplot of sales (in 1000's) for the 7 competitive types.
[Chart 1]

Question 3 – Based on the chart and table below (Chart 2 and Table 2), I would suggest going with site A as the new store location. I used the most statistically significant variables when running the regression model and when creating the table. I took the coefficients from Chart 2 and plugged them into Table 2 so I could compare site A and site B and see which site would potentially have higher sales. Site A had an estimated sales total of $22,995.46 and site B had an estimated sales total of $12,702.03. I would also recommend using the 2nd model based on the