PRACTICE FINAL - SOLUTIONS

1. LaTonya is in the 25% tax bracket. Gail is in the 15% tax bracket. They each itemize their deductions and they each donate $2000 to charity. Compare their true costs for charitable donations. What is the difference between LaTonya's true costs and Gail's true costs?

A. No difference
B. $200
C. $300
D. $500

Solution: The true cost of a donation is the amount of the donation minus the tax benefit. Since LaTonya is in the 25% tax bracket, a tax-deductible donation of $2000 reduces her taxable income by $2000, and hence reduces her tax by 25% of $2000, which is $500. So her true cost is $2000 minus $500, or $1500. Likewise, Gail's true cost is $2000 minus 15% of $2000, i.e., $2000 minus $300, which is $1700. The difference between $1500 and $1700 is $200, so the answer is B.

2. In its first year in business, TechnoTech Inc. had outlays of $400,000 and receipts of $200,000. In its second year, the company had outlays of $200,000 and receipts of $300,000. Which of the following correctly describes TechnoTech's first two years in business?

A. Surplus in both years
B. Surplus in the first year, deficit in the second year
C. Deficit in the first year, surplus in the second year
D. Deficit in both years

Solution: In its first year, receipts minus outlays was 200K minus 400K, which is negative, so the company had a deficit. (By the way, don't confuse deficit with debt!) In the second year, receipts minus outlays was 300K minus 200K, which is positive, so the company had a surplus. So the answer is C.

3. A study is done to determine whether there is a correlation between people's weight and their income. 1000 people chosen randomly from the phone book are asked to report their weight and their income. How would you describe this study?

A. Single-blind experiment
B. Double-blind experiment
C. Case-control observational study
D. Non-case-control observational study

Solution: This is an observational study.
Since the subjects do not divide themselves naturally into different groups (both weight and income are continuously varying quantities, rather than discrete differences like gender), it is not a case-control study. So the answer is D.

4. A pollster stops random people at a busy intersection and asks them "Do you support the administration's policies?". What is likely to be the main problem with this survey?

A. Selection bias
B. Participation bias
C. Confounding variables
D. Setting and wording

Solution: The question is much too vague. Which policies are being referred to? Domestic policy, foreign policy, or what? Furthermore, even a question like "Do you support the administration's foreign policy?" is too broad; are we speaking of Iraq, or the Sudan, or what? And even a more specific question of the form "Do you support the administration's policy on X?" is too broad; people may say "no" for completely different reasons (some might be to the right of the administration and some might be to the left) and therefore should not be lumped together. Finally, people at a busy intersection are often in a rush, and may not have time to think clearly about such a complicated question. So the best answer is D.

5. A pollster conducted a single-question survey in which the question had five different possible responses. She summarized the results of her poll with a histogram, but two of the percentage-labels got erased:

    |    36%
    |    ___     28%
    |   |   |    ___
    |   |   |   |   |
    |   |   |   |   |    ___     12%
    |   |   |   |   |   |   |    ___
    |   |   |   |   |   |   |   |   |    ___
    |___|___|___|___|___|___|___|___|___|___|___

What are the two missing labels?

A. 18% and 6%
B. 12% and 12%
C. 8% and 4%
D. Any of the above could be correct; there is not enough information

Solution: The two missing numbers must add up to 24%, so that the five percentages will add up to 100 (check: 100%-(36+28+12)% = 100%-76% = 24%). This eliminates C (and also eliminates D).
Also, one of the missing numbers is about three times the other, since one of the two unlabeled bars is about three times the height of the other. This eliminates B. So the best answer is A. (Alternatively, note that the heights of the histogram bars tell us that one of the numbers should be between 12% and 28%, while the other should be smaller than 12%. Only answer A has this property.)

6. Examine the stack plot shown on page 345. [If this were a real exam, I'd include a photocopy in the exam booklet.] Compare the death rate (per 100,000 people) for cardiovascular disease in 1900 with the combined death rates for pneumonia, tuberculosis, and cancer in that same year. Which of the following best describes the comparison?

A. Significantly more people in 1900 died of cardiovascular disease than died of pneumonia, tuberculosis, and cancer combined.
B. Significantly fewer people in 1900 died of cardiovascular disease than died of pneumonia, tuberculosis, and cancer combined.
C. About the same number of people in 1900 died of cardiovascular disease as died of pneumonia, tuberculosis, and cancer combined.
D. One can't tell (there is not enough information).

Solution: The death rate from cardiovascular disease in 1900 was about 620 minus 250, or about 370. The combined death rate for all four diseases was about 770, so the combined death rate from pneumonia, tuberculosis, and cancer was about 770 minus 370, or about 400. (You could also compute this number by using the plot to estimate the separate death rates for pneumonia, tuberculosis, and cancer, and then adding those death rates, but the method I used here involves less calculation and is more accurate.) Since 400 and 370 are fairly close, the best answer is C.

7. Examine the following scatter diagram, in which the horizontal axis is IQ and the vertical axis is shoe-size:

[Scatter diagram: IQ on the horizontal axis (80 to 120), shoe-size on the vertical axis (7 to 11), ten plotted points.]

What sort of correlation does this diagram display?

A. Positive correlation
B. Negative correlation
C. No correlation
D. Question does not make sense because there is no cause-and-effect relationship between shoe-size and intelligence

Solution: There is no clear overall upward or downward trend; the line that best fits the data looks like it would be horizontal, and the fit is not very good (the data do not cluster tightly around a horizontal line). Note that answer D is wrong because a mathematical correlation can exist even when there is no cause-and-effect relationship. So the answer is C.

8. Consider a data set with a histogram of the following shape:

    7 |           ---
    6 |          |   |
    5 |          |   |    ---
    4 |          |   |   |   |    ---
    3 |          |   |   |   |   |   |    ---     ---
    2 |   ---    |   |   |   |   |   |   |   |   |   |
    1 |  |   |   |   |   |   |   |   |   |   |   |   |
      --------------------------------------------------
           1       2       3       4       5       6

What is the relationship between the median and the mode of the data set?

A. The median is greater than the mode
B. The median is smaller than the mode
C. The median equals the mode
D. It is impossible to tell from the information provided

Solution: This is a right-skewed distribution, so the median is greater than the mode. (More specifically, writing the data set out in full as 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 6, we see that the median is 3 while the mode is 2.) So the answer is A.

9. Compare the standard deviations of two data sets called "X" and "Y":

Data set X: 2 ft., 2 ft., 3 ft., 4 ft., 4 ft.
Data set Y: 28 in., 28 in., 30 in., 32 in., 32 in.

A. Data set X has larger standard deviation than data set Y
B. Data set Y has larger standard deviation than data set X
C. The two data sets have the same standard deviation
D. It is impossible to compare

Solution: For data set X, the mean is (2+2+3+4+4 ft.)/5 = 3 ft., so the standard deviation (measured in feet) is the square root of

    ((2-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (4-3)^2)/4
      = ((-1)^2 + (-1)^2 + (0)^2 + (1)^2 + (1)^2)/4
      = (1+1+0+1+1)/4
      = 1;

that is, the standard deviation for data set X is sqrt(1) ft. = 1 ft. For data set Y, the mean is (28+28+30+32+32 in.)/5 = 30 in., so the standard deviation (measured in inches) is the square root of

    ((28-30)^2 + (28-30)^2 + (30-30)^2 + (32-30)^2 + (32-30)^2)/4
      = ((-2)^2 + (-2)^2 + (0)^2 + (2)^2 + (2)^2)/4
      = (4+4+0+4+4)/4
      = 4;

that is, the standard deviation for data set Y is sqrt(4) in. = 2 in. Since 1 foot is greater than 2 inches, the standard deviation for data set X is greater than the standard deviation for data set Y. So the answer is A.

(Note that a good way to check your work is to make sure that the unsquared deviations add up to 0:

    (-1) + (-1) + (0) + (1) + (1) = 0
    (-2) + (-2) + (0) + (2) + (2) = 0

This is definitely a smart thing to do in an exam.)

10. Suppose that the scores of some students on some test are governed by a normal distribution. The mean score of the students was 85 points, and about 70% of the students scored between 80 points and 90 points. What is the best estimate of the standard deviation?

A. 2.5 points
B. 5 points
C. 70 points
D. 85 points

Solution: According to the 68%-95%-99.7% rule, about 68% (which is pretty close to 70%) of the data in a normally-distributed data set lie within 1 standard deviation of the mean. Since 70% of the students scored between 85-5 points and 85+5 points, where 85 points is the mean score, this matches up with the prediction of the 68%-95%-99.7% rule if the standard deviation is 5 points. So the answer is B.

11. A hospital administrator claims that the mean stay at her hospital is greater than the national average of 2.1 days.
She decides to do a study of 81 women at her hospital, and she finds that for the women in her sample, the mean hospital stay after childbirth is 2.3 days. A statistical table tells her that if the average at her hospital were the same as the national average, the probability of observing a sample with a mean of 2.3 days or more would be 0.17. What should she do?

A. She should reject the null hypothesis with a statistical significance at the 0.01 level.
B. She should reject the null hypothesis with a statistical significance at the 0.05 level (but not at the 0.01 level).
C. She should fail to reject the null hypothesis.
D. She should accept the null hypothesis.

Solution: 0.17 is greater than both 0.01 and 0.05. So the observed outcome of her experiment is consistent with the null hypothesis. Hence she cannot accept the alternative hypothesis. However, in hypothesis testing, we do not accept the null hypothesis; rather, we fail to reject it. So the answer is C.

12. A casino offers two games: "Chump Change" and "Dingbat Dollars". "Chump Change" costs $1 to play. 20% of the time, you double your money (profit = $1); 50% of the time, you just get a dollar back (profit = $0); and 30% of the time, you get nothing back (profit = -$1). "Dingbat Dollars" also costs $1 to play. 30% of the time, you get $3 back (profit = $2); the rest of the time, you get nothing (profit = -$1). Which game is more favorable to you in the long run? That is, which game will result in your winning money at a faster rate (or at least losing money at a slower rate) if you play it long enough and the law of large numbers applies?

A. Chump Change is the better game to play
B. Dingbat Dollars is the better game to play
C. Both games are equally good
D. There is not enough information to decide

Solution: With Chump Change, your expected profit is 20% * $1 + 50% * $0 - 30% * $1, or -$0.10 (i.e., you lose about ten cents per game on average).
With Dingbat Dollars, your expected profit is 30% * $2 - 70% * $1, or -$0.10 (i.e., you lose about ten cents per game on average). Both games lose you money at the same rate in the long run, so the answer is C.
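
The expected-profit arithmetic in Question 12 can be double-checked with a short Python sketch. This is only an illustration and not part of the exam: the function name `expected_profit` and the variable names are made up for this example, and exact fractions are used (an assumption of convenience) so that the two results can be compared without rounding error.

```python
from fractions import Fraction


def expected_profit(outcomes):
    """Return the probability-weighted average profit.

    `outcomes` is a list of (probability, profit) pairs; the
    probabilities must add up to exactly 1.
    """
    total_probability = sum(p for p, _ in outcomes)
    if total_probability != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * profit for p, profit in outcomes)


# Probabilities and profits (in dollars) as stated in Question 12.
chump_change = [
    (Fraction(20, 100), 1),   # double your money
    (Fraction(50, 100), 0),   # get your dollar back
    (Fraction(30, 100), -1),  # get nothing back
]
dingbat_dollars = [
    (Fraction(30, 100), 2),   # get $3 back
    (Fraction(70, 100), -1),  # get nothing back
]

chump_ev = expected_profit(chump_change)
dingbat_ev = expected_profit(dingbat_dollars)

# Both values are -1/10 of a dollar per game.
print(float(chump_ev), float(dingbat_ev))
```

Both games come out to a loss of one tenth of a dollar per play, which agrees with answer C.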