The distribution of a variable tells us what values the variable takes and how often it takes these values.
Frequency table – displays counts
Relative frequency table – shows percents
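A minimal sketch of both tables, using made-up data (the `colors` list is hypothetical, not from any real study). It counts each value, then converts the counts to percents of the total.

```python
from collections import Counter

# Hypothetical data: favorite color reported by 10 students.
colors = ["red", "blue", "blue", "green", "red",
          "blue", "red", "blue", "green", "blue"]

# Frequency table: the count for each value.
frequency = Counter(colors)

# Relative frequency table: each count as a percent of the total.
total = sum(frequency.values())
relative_frequency = {
    value: 100 * count / total for value, count in frequency.items()
}

for value in frequency:
    print(f"{value:<6} count={frequency[value]}  "
          f"percent={relative_frequency[value]:.0f}%")
```

The percents in a relative frequency table should add up to 100% (apart from rounding).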
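A histogram is different from a bar graph because its bars touch: a histogram shows a quantitative variable grouped into adjacent intervals, while a bar graph shows separate categories.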
Two way table – describes two categorical variables, organizing counts according to a row variable and a column variable.
The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table.
A conditional distribution of a variable describes the values of that variable among individuals who have a specific value of another variable.
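To see how the marginal and conditional distributions differ, here is a small sketch on an invented two-way table (the grade levels, snacks, and counts are all hypothetical). The marginal distribution divides by the grand total; the conditional distribution divides by the total for one specific row.

```python
# Hypothetical two-way table of counts:
# row variable = grade level, column variable = preferred snack.
table = {
    "junior": {"chips": 20, "fruit": 30},
    "senior": {"chips": 35, "fruit": 15},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of grade level:
# each row total as a share of everyone in the table.
marginal_grade = {
    grade: sum(row.values()) / grand_total for grade, row in table.items()
}

# Conditional distribution of snack among seniors only:
# each senior count as a share of the senior row total.
senior_total = sum(table["senior"].values())
snack_given_senior = {
    snack: count / senior_total for snack, count in table["senior"].items()
}

print("Marginal (grade):", marginal_grade)
print("Conditional (snack | senior):", snack_given_senior)
```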
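Association vs. causation: an association between two variables does not by itself show that one causes the other
Causation – can be established only by a well-designed experiment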
State: What's the question that you’re trying to answer?
Plan: How will you go about answering the question? What statistical techniques does this problem call for?
Do: Make graphs and carry out the needed calculations
Conclude: Give your practical conclusion in the setting of the real-world problem
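The standard deviation s_x measures the typical distance of the observations from their mean. It is calculated by finding an average of the squared distances from the mean (dividing by n - 1) and then taking the square root.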
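As a sketch of the sample standard deviation calculation, the steps can be written out by hand and checked against Python's built-in `statistics.stdev`. The `data` list is made up for illustration.

```python
import math
import statistics

# Hypothetical observations.
data = [2, 4, 4, 5, 7, 8]

# Step 1: find the mean.
mean = sum(data) / len(data)

# Step 2: find the squared distance of each observation from the mean.
squared_distances = [(x - mean) ** 2 for x in data]

# Step 3: "average" the squared distances, dividing by n - 1.
variance = sum(squared_distances) / (len(data) - 1)

# Step 4: take the square root to get the standard deviation.
s_x = math.sqrt(variance)

print("by hand:", s_x)
print("statistics.stdev:", statistics.stdev(data))
```

Both printed values should agree, since `statistics.stdev` also divides by n - 1.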
The normal curve is a bell curve: it is symmetric, and the mean is always in the middle. Distances from the mean are measured in standard deviations (sd), for example 1 sd below the mean (-1 sd) and 1 sd above the mean (+1 sd).
Approx. 68% of all observations fall within 1 standard deviation of the mean
Approx. 95% of all observations fall within 2 sd of the mean
Approx. 99.7% of all observations fall within 3 sd of the mean
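These three percents can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k sd of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd of the mean: {100 * share_within(k):.1f}%")
```

The same proportions hold for any normal distribution, whatever its mean and standard deviation, because distances are measured in sd units.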
The population in a statistical study is the entire group of individuals about which we want information.
A sample is the part of the population from which we actually collect info. We use information from a sample to draw conclusions about the entire population.
Step 1: Define the population we want to describe
Step 2: Say exactly what we want to measure
A “sample survey” is a study that uses an organized plan to choose a sample that represents some specific population
Step 3: Decide how to choose a sample from the population
Convenience sample: a sample made up of the individuals who are easiest to reach
The design of a statistical study shows bias if it systematically favors certain outcomes
Voluntary response sample consists of people who choose themselves by responding to a general appeal. Voluntary response samples show bias because people with strong opinions (often in the same direction) are most likely to respond.
Random sampling, the use of chance to select a sample, is the central principle of statistical sampling
A simple random sample (SRS) of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected
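A minimal sketch of drawing an SRS in software, assuming a made-up population of 100 labeled students. `random.sample` chooses without replacement, so every set of n individuals is equally likely to be selected.

```python
import random

# Hypothetical population of 100 individuals.
population = [f"student_{i}" for i in range(1, 101)]

# Seeded generator so the example is repeatable; drop the seed in real use.
rng = random.Random(42)

n = 10
srs = rng.sample(population, n)  # n distinct individuals chosen by chance

print(srs)
```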
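A table of random digits is a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties: each entry is equally likely to be any of the 10 digits, and the entries are independent of each other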
Stratified random sample: first classify the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample
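A sketch of stratified random sampling on invented data: the population, the two strata (juniors and seniors), and the choice of 5 individuals per stratum are all assumptions made for illustration.

```python
import random

# Hypothetical population: 30 juniors and 20 seniors, stored as (grade, id).
population = ([("junior", i) for i in range(30)]
              + [("senior", i) for i in range(20)])

# Classify the population into strata of similar individuals.
strata = {}
for person in population:
    strata.setdefault(person[0], []).append(person)

rng = random.Random(0)  # seeded only so the example is repeatable
per_stratum = 5

# Choose a separate SRS in each stratum, then combine them.
stratified_sample = []
for grade, members in strata.items():
    stratified_sample.extend(rng.sample(members, per_stratum))

print(stratified_sample)
```

Every stratum is guaranteed to appear in the sample, which a single SRS of the whole population does not guarantee.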
Cluster sample: first divide the population into smaller groups. Ideally, these clusters should mirror the characteristics of the population. Then choose an SRS of the clusters. All individuals in the chosen clusters are in the sample.
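A sketch of cluster sampling, assuming a made-up school with 8 homerooms of 5 students each. The SRS is taken of the clusters (homerooms), not of individual students, and everyone in a chosen cluster is included.

```python
import random

# Hypothetical clusters: 8 homerooms with 5 students each.
clusters = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(5)]
    for r in range(8)
}

rng = random.Random(1)  # seeded only so the example is repeatable

# Choose an SRS of the clusters themselves.
chosen_rooms = rng.sample(sorted(clusters), 2)

# All individuals in the chosen clusters are in the sample.
cluster_sample = [
    student for room in chosen_rooms for student in clusters[room]
]

print(chosen_rooms)
print(cluster_sample)
```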
Systematic sampling: Randomly pick a starting point and then pick every nth person
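A sketch of systematic sampling on a hypothetical numbered list of 100 people, taking every 10th person after a random starting point. The list and the step size are assumptions for illustration.

```python
import random

# Hypothetical ordered list of 100 people, numbered 1 to 100.
population = list(range(1, 101))

step = 10               # take every 10th person
rng = random.Random(7)  # seeded only so the example is repeatable

# Randomly pick a starting position among the first `step` people.
start = rng.randrange(step)

# Then take every `step`-th person from that starting point.
systematic_sample = population[start::step]

print(systematic_sample)
```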
Rely on random sampling:
To eliminate bias in selecting samples from the list of available individuals
The laws of probability allow trustworthy inference about the population
What can go wrong?
Undercoverage occurs when some groups in the population are left out of the process of choosing the sample
Nonresponse occurs when an individual chosen for the sample can’t be contacted or refuses to participate
A systematic pattern of incorrect responses in a sample survey leads to response bias
The wording of questions is the most important influence on the answers given to a sample survey