The information that was provide for analysis consisted of sales per stores (which is the y variable), along with 33 x variables such as ethnicity, income, population, home owners, renters, etc. There were several observations done to analyze the data which consisted of using scatter plots, running correlation and residual models. By performing these models I was able to then identify the most significant of the remaining variables and then remove the variables that are insignificant.
The first steps of my analysis consisted of me creating a scatter plot for sales vs comtype. By doing this I was able to determine the variances between the different comtypes and identify which comtypes were significant and insignificant. It is clear based on the scatter plot below that the comtypes 1 and 2 have significantly higher sales than comtype 7 while the other comtypes are within the same range. Based on this information I know that comtypes 1, 2, and 7 are significant variables.