Nt1310 Unit 4 Data Analysis

Data Analysis
I ran a Cochran’s Q test on the classification data in an Excel spreadsheet. Cochran’s Q test assumes a nominal independent variable with three or more treatments and a dichotomous (binary) dependent variable. Here these corresponded to the four treatments (the KNN, SVM, CDA, and k-means pattern recognition algorithms) and the two outcomes (correctly classified and incorrectly classified) (Laerd Statistics, n.d.). The test also requires a sample large enough that statistically significant differences between proportions reflect actual differences between the treatments rather than random chance (Laerd Statistics, n.d.). I expected a classification rate of 75-95%, which prompted a sample size of about two hundred per experimental treatment group. The final assumption of Cochran’s Q test is a randomized sample of the population (Laerd Statistics, n.d.). Because I applied convenience sampling, this assumption could not be met; obtaining a randomized sample was impractical and nearly impossible. Though the sample was representative of the population as I used four …
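For readers who want to check the statistic outside a spreadsheet, the following is a minimal sketch of Cochran’s Q in plain Python. It is not the Excel workflow used in this study. The function names `cochrans_q` and `chi2_sf` and the `samples` data are hypothetical and only illustrate the layout: one row per sample, one column per algorithm, with 1 for correctly classified and 0 for incorrectly classified. The p-value comes from the chi-square distribution with k - 1 degrees of freedom, which is the usual large-sample approximation.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with h = x / 2.
    """
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # Q(1/2, h)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def cochrans_q(rows):
    """Return (Q, degrees of freedom, p-value) for a table of 0/1 outcomes.

    rows: one sequence per sample, one 0/1 entry per treatment.
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every row needs the same number (>= 2) of treatments")
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1, chi2_sf(q, k - 1)


# Hypothetical outcomes, NOT the study's data.
# Columns: KNN, SVM, CDA, k-means (1 = correctly classified).
samples = [
    (1, 1, 1, 0),
    (1, 1, 0, 0),
    (1, 1, 1, 1),
    (0, 1, 1, 0),
    (1, 1, 1, 0),
    (1, 0, 1, 0),
    (1, 1, 1, 1),
    (1, 1, 0, 0),
]

q_stat, dof, p_value = cochrans_q(samples)
print(f"Q = {q_stat:.4f}, df = {dof}, p = {p_value:.6f}")
```

Rows in which all four algorithms agree carry no information about differences between treatments, which is why the function raises an error when every row is uniform.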
Table 3 listed the results of each test, showing a statistically significant difference in detection accuracy between the SVM and CDA pattern recognition algorithms only for Alternaria (p = 0.003418). I therefore rejected the null hypothesis of no difference in detection accuracy in favor of the alternative hypothesis that the detection accuracy of the SVM and CDA pattern recognition algorithms differs for Alternaria. I failed to reject the null hypotheses of no difference in the detection accuracy between the SVM and CDA pattern recognition algorithms for Aspergillus, Cladosporium, and
