Aarian Charania

Professor Habibullah

Section: 8am

Introduction

The independent research company Consumer Research, Inc. collected data on annual income, household size, and annual credit card charges. In this report I plan to find how much our independent variables (household size and annual income) affect our dependent variable, annual credit card charges. I will first use descriptive statistics such as the mean and standard deviation to look at each variable separately. Afterward I will use linear regression with one or both independent variables to see how each one affects the amount charged. In the end I plan to decide which variable is the most effective in predicting annual credit card charges. Because the research is limited, the report also includes suggestions for improvement, such as adding new independent variables. These will hopefully lead to an improvement in the model we have already created from the data given.

Questions

For all three variables the data are widely spread: in each case the standard deviation is a large percentage of the mean. In terms of skewness, the three variables have different distributions. The income data appear to be roughly uniformly distributed. The household size data, however, are skewed to the right, which means that the mean is greater than the median. The amount charged seems to be approximately normally distributed. Amount charged has the greatest mean and household size has the smallest.
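The checks described above (standard deviation as a share of the mean, and mean versus median as a rough sign of skew) could be computed along these lines. This is only a minimal sketch: the `describe` helper and the `household_size` values are hypothetical illustrations, not the survey data from the appendix.

```python
from statistics import mean, median, stdev


def describe(values):
    """Return the mean, median, sample standard deviation, and their ratio."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return {
        "mean": m,
        "median": median(values),
        "stdev": s,
        "cv": s / m,  # standard deviation as a fraction of the mean
    }


# Hypothetical household sizes, NOT the survey data from the appendix.
household_size = [1, 2, 2, 2, 3, 3, 4, 7]
summary = describe(household_size)

# A mean above the median is a rough indication of right skew.
right_skewed = summary["mean"] > summary["median"]
print(summary, right_skewed)
```

A large `cv` value corresponds to the statement that the standard deviation is a large percentage of the mean.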

(All data used in this analysis can be found in the appendix)

Amount Charged vs. Annual Income

In the regression equation, the Y value is the amount charged (dependent variable) and the X value is the annual income (independent variable).

Looking at the regression results, we see that the model is significant at more than a 95% confidence level. According to the R-squared value in the regression output, about 40% of the variation in amount charged is explained by annual income.
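To make the idea of "variation explained" concrete, a simple least-squares fit and its R-squared could be computed roughly as follows. This is a sketch under stated assumptions: the function name `simple_regression` and the `income` and `charged` values are made up for illustration and are not the data or results of this report.

```python
def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    # R-squared = 1 - (unexplained variation / total variation)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, 1 - sse / sst


# Hypothetical values, NOT the survey data: income in $1000s, charges in dollars.
income = [20, 30, 40, 50, 60]
charged = [2500, 3100, 3000, 4200, 3900]
b0, b1, r2 = simple_regression(income, charged)
print(f"charged = {b0:.0f} + {b1:.1f} * income, R-squared = {r2:.2f}")
```

An R-squared of 0.40 from such a fit would mean that 40% of the total variation in the Y values is accounted for by the fitted line.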

Amount Charged vs. Household Size

Once again, in this equation the Y value is the amount charged (dependent variable) and the X value is the household size (independent variable). This model is also significant at more than a 95% confidence level. According to the R-squared value in the regression output, household size explains about 60% of the variation in amount charged.
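The link between R-squared and overall significance can be sketched with the standard F statistic, F = (R²/k) / ((1 - R²)/(n - k - 1)), where k is the number of independent variables and n is the number of observations. In the sketch below the R-squared of 0.60 is the approximate value quoted above, while `n = 50` is a hypothetical sample size chosen only for illustration, since the sample size is not stated in this section.

```python
def f_statistic(r_squared, n, k):
    """Overall F statistic for a regression with k predictors and n observations."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


# R-squared of about 0.60 is taken from the text above.
# n = 50 is a HYPOTHETICAL sample size used only for this illustration.
f_value = f_statistic(0.60, n=50, k=1)
print(round(f_value, 1))
```

The resulting value would be compared with the critical value from an F table with k and n - k - 1 degrees of freedom at the 5% level; a larger F value indicates a significant model.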

According to these results, household size is a better predictor of amount charged than annual income. This is because household size explains about 60% of the variation in amount charged, while annual income explains only about 40%.

When I developed a regression equation using both independent variables, my results changed as expected. The new equation has the form ŷ = b0 + b1X1 + b2X2, where b0, b1, and b2 are the estimated coefficients.

In this equation, ŷ is the predicted value of amount charged, X1 is the first independent variable, annual income, and X2 is the second independent variable, household size. Once again the model is significant at more than a 95% confidence level. Together, household size and annual income explain about 80% of the variation in amount charged.

To find the predicted annual credit card charge for a three-person household with an annual income of $40,000, we plug these values into the equation and solve for ŷ.

The predicted annual credit card charge for this situation would be about $3,700, as shown in the equation above.
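The plug-in step can be illustrated with a small two-variable least-squares sketch. Everything in it is an assumption made for illustration: the helper names `fit_two_predictors` and `predict` are hypothetical, and the data are made-up points generated from a known plane (so the fit can be checked by hand), not the survey data. Its coefficients and its prediction are therefore not the ones reported above.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]   # deviations from the means
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


def predict(coefs, x1, x2):
    """Plug values of x1 and x2 into the fitted equation."""
    b0, b1, b2 = coefs
    return b0 + b1 * x1 + b2 * x2


# HYPOTHETICAL data lying exactly on y = 1000 + 30*x1 + 400*x2,
# with income in $1000s. These are NOT the survey data or its coefficients.
income = [20, 30, 40, 50, 60, 35]
size = [1, 4, 2, 5, 3, 2]
charged = [2000, 3500, 3000, 4500, 4000, 2850]

coefs = fit_two_predictors(income, size, charged)
# Three-person household with a $40,000 income (entered as 40).
predicted = predict(coefs, 40, 3)
print(round(predicted))
```

With the report's own coefficients in place of these illustrative ones, the same `predict` step is the calculation described above.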

As mentioned in the introduction, there is always room for improvement in the model, and adding new independent variables to the regression is one way to achieve it. One independent variable that could be added is the average age of the household. This variable would not only be easy to collect, it would also add information to the model. A second independent variable that could be used is the number of credit cards. A higher number of credit cards should lead to a higher total amount charged. Not only will it impact the amount charged but it will also be important…