Nt1310 Unit 3 Assignment 1

For this assignment we are given a large dataset containing 29 variables that describe socio-economic and demographic information for 49 census subdivisions in Ontario. Researchers are interested in discovering trends and patterns within the data. A principal component analysis (PCA) is a way of exploring the possible relationships among all of the variables at once. With so many variables, PCA can identify groups of variables that are interrelated and separate them from the variables that are not related. Using PCA for this dataset would likely be confirmatory, because researchers would probably already have some idea of which variables are related to one another. In this case PCA is simply a method of assessing the validity of their preconceived ideas. A question such as "Which areas of Ontario have higher levels of employment or unemployment?" could be examined with a PCA of this data.
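As a rough illustration of what a PCA does with a table of this shape, the sketch below runs one on made-up numbers. It is not the analysis from the assignment: the data are randomly generated (49 rows by 29 columns, built from three hidden factors), the function name `pca` is only for illustration, and standardizing each variable first is an assumption about how such mixed-unit variables would usually be handled.

```python
import numpy as np

def pca(data):
    """Return (explained variance ratio, loadings, scores) for standardized data."""
    # Standardize each column so variables measured in different units are comparable
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Singular value decomposition; the rows of vt are the principal axes
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    # Eigenvalues of the correlation matrix, largest first
    eigenvalues = s**2 / (z.shape[0] - 1)
    ratio = eigenvalues / eigenvalues.sum()
    # Component scores: each row (place) projected onto the principal axes
    scores = z @ vt.T
    return ratio, vt.T, scores

# Synthetic stand-in for the real data: 49 "places" by 29 "variables"
rng = np.random.default_rng(0)
latent = rng.normal(size=(49, 3))    # three hidden factors (an assumption)
mixing = rng.normal(size=(3, 29))    # how each variable depends on the factors
data = latent @ mixing + 0.5 * rng.normal(size=(49, 29))

ratio, loadings, scores = pca(data)
print("Share of variance in the first three components:", ratio[:3].round(3))
```

Variables that load heavily on the same component are the interrelated groups described above, and the columns of `scores` are the component scores that can later be compared between places.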

Q1.2 Many of the variables in the dataset
Cluster 3 contains places that mainly scored relatively low in component 1. Clusters 2 and 3 overlap considerably: cluster 2 has an upper tail that reaches higher than the tail of cluster 3, but also a lower tail that reaches lower than that of cluster 3. Its mean is -0.2223433, which is higher than the mean of cluster 3, which is -0.5612450.
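The comparison of cluster means and tails on a component can be reproduced with a small summary like the sketch below. The scores and cluster labels here are invented for illustration and are not the values from the assignment; the helper name `summarize_by_cluster` is also hypothetical, and no particular clustering method is assumed.

```python
import numpy as np

def summarize_by_cluster(scores, labels):
    """Map each cluster label to the (mean, min, max) of its scores on one component."""
    summary = {}
    for label in np.unique(labels):
        members = scores[labels == label]
        summary[int(label)] = (
            float(members.mean()),
            float(members.min()),
            float(members.max()),
        )
    return summary

# Made-up component scores and cluster labels (NOT the assignment's data)
scores = np.array([1.4, 0.9, 1.1, -0.3, 0.2, -0.6, -0.9, -0.4, -0.5])
labels = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])

for cluster, (mean, low, high) in summarize_by_cluster(scores, labels).items():
    print(f"cluster {cluster}: mean={mean:.3f}, range=[{low:.1f}, {high:.1f}]")
```

Two clusters overlap on a component when their ranges intersect, which is why a cluster can have the higher mean and still share most of its range with its neighbour.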

Cluster 1 contains the places that scored in the middle range, neither high nor low. The mean is 0.1373814; however, it contains some places that scored up to about 1. Cluster 2 contains mainly places that scored lower in component 2. The mean is -0.5196499, but there is an outlier that scored below -2. Clusters 1 and 2 overlap significantly, and the lower tails of both overlap with cluster 3. Cluster 3 generally contains places that have higher scores in component 2, with a mean of 1.2737383. It does, however, overlap slightly with the lower tails of clusters 1 and