New Jersey Mathematics Curriculum Framework

## STANDARD 12 - PROBABILITY AND STATISTICS

 All students will develop an understanding of statistics and probability and will use them to describe sets of data, model situations, and support appropriate inferences and arguments.

## Standard 12 - Probability and Statistics - Grades 9-12

### Overview

Students can develop a strong understanding of probability and statistics from consistent experiences in classroom activities where a variety of manipulatives and technology are used. The key components of this understanding in probability for high school students, as identified in the K-12 Overview, are: probability terms, the concept of the probability of an event, predicting and determining probabilities, expected value, the relationship between theoretical and experimental probabilities, and compound events. In statistics, the key components are: data collection, organization, and representation; sampling; central tendency; variance and correlation; and analysis and inference.

The field of statistics is relatively new. Beyond the work of scientists, Florence Nightingale was the great pioneer in gathering and analyzing statistical data for public health questions. During the great cholera epidemic of 1854 in London, England, statistics on the prevalence of cholera cases in various London neighborhoods were used to deduce that the cholera originated with a single well. In our own century statistics touches all of us through such diverse means as statistical quality control in industry, advertising claims, pre-election polls, television show ratings, and weather forecasts. To be successful members of present day society, high-school graduates need an understanding of statistics and probability which formerly was rare even among college graduates.

By the time students enter high school, they should have mastered basic descriptive statistical methods. On the basis of their varied experience, they should be able to set up a study, gather the data, and appropriately analyze and report their findings. Throughout grades 9 to 12, students should have numerous opportunities to continue to practice these skills in a variety of ways, and also to extend these skills, in connection with their growth in other mathematical areas. As students learn new algebraic functions, they might revisit a problem they had previously modeled linearly and apply a different model. For example, they may have linearly modeled the series of winning times of the men's Olympic marathon but now understand that there would probably be a limiting time and so attempt to fit a quadratic or logarithmic curve instead. Where appropriate, the content should be developed through a problem-centered approach. For example, if students are required to generate a report on two sets of data which have the same measures of central tendency only to find later they have very different variance, they should recognize the need for some way to identify that difference.

John Allen Paulos, in his book, Innumeracy, cites numerous problems associated with a lack of understanding of probability. If people are to make appropriate decisions, then they must understand the relationship of probability to real situations and be able to weigh the consequences against the odds. As with statistics, probability needs to be experienced, not memorized. Work done at this level should provide insight into the use of probability and probability distributions in a variety of real-world situations. The normal curve presents interesting opportunities to examine uses and abuses of mathematics.

Students should have access to appropriate technology for their work in probability and statistics, not only to simplify calculation and display charts and graphs, but also to generate appropriate data for activities and projects. They should make use of data taken from the Internet and CD-ROMs, and simulate experiments with Calculator Based Laboratories. Whenever possible, real data gathered from school, the community, or cooperating businesses should be used.

Probability and statistics offers a rich opportunity to integrate with other mathematics content and other disciplines. This content provides the opportunity to generate the numbers and situations which should be used in other areas such as geometry, algebra, functions, and discrete mathematics. The goal to have students become effective members of a democratic society requires them to practice and participate in decision-making experiences. The ability to make intelligent decisions rests on an understanding of statistics and probability, and students should regularly integrate this content with their experiences in social studies, science, and other disciplines.

The topics that should comprise the probability and statistics focus of the mathematics program in grades 9 through 12 are:

• designing, conducting, and interpreting statistical work to solve problems
• analyzing data using range, measures of central tendency, and dispersion
• applying probability distributions in real situations
• evaluating arguments based upon their knowledge of sampling and data analysis
• interpolating and/or extrapolating from data using curve fitting
• using simulations to estimate probabilities
• determining expected values
• using the law of large numbers

### Indicators and Activities

The cumulative progress indicators for grade 12 appear below in boldface type. Each indicator is followed by activities which illustrate how it can be addressed in the classroom in grades 9, 10, 11, and 12.

Building upon knowledge and skills gained in the preceding grades, experiences in grades 9-12 will be such that all students:

17. Estimate probabilities and predict outcomes from actual data.

• In a standard class test, students are asked to compute the probability that a given raffle ticket for a senior class raffle to raise money for the senior trip will win a prize. The class will be printing 500 tickets that they will sell for \$1 each. First prize is a stereo worth \$150. Second prize is a \$100 shopping spree in the local Gap store. Third prize is a \$50 gift certificate to The Golden Goose restaurant. There are ten fourth prizes of a commemorative T-shirt worth \$8 each. Students also compute the expected value of each ticket.

• Students determine the area of an irregular closed figure drawn on a large sheet of paper using the Monte Carlo method: Each person in the group drops a handful of pennies over their shoulder (without looking) onto the paper containing the figure. They count the number of coins on the paper (total shots) and the number within the figure (hits). They thus produce the ratio of hits to total shots and multiply this fraction by the area of the paper to estimate the area of the figure.

• Students work through the On the Boardwalk lesson that is described in the Introduction to the Framework. In this lesson they explore the probability that a quarter thrown onto a rectangular grid will land entirely within one of the squares on the grid, and then discuss how changing the size of the squares will affect the probability.

• A point P inside a square is selected at random and is used to form a triangle with vertices A and B of the square. Students determine the probability that the triangle is acute using a simulation and a theoretical calculation.
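The simulation half of the triangle activity above can be sketched in a few lines of Python. The sketch assumes that A and B are adjacent vertices of a unit square (the activity does not say which two vertices are meant) and places them at (0, 0) and (1, 0); the function names are illustrative only. Under that assumption the angles at A and B are always acute, and the angle at P is acute exactly when P lies outside the circle with diameter AB, so the theoretical probability would be 1 − π/8, about 0.607. Counting hits out of total shots is the same idea as the penny-dropping estimate of area.

```python
import math
import random

A = (0.0, 0.0)  # assumed position of vertex A
B = (1.0, 0.0)  # assumed position of vertex B (adjacent to A)

def angle_is_acute(vertex, p, q):
    """An angle is acute when the dot product of its two sides is positive."""
    ux, uy = p[0] - vertex[0], p[1] - vertex[1]
    vx, vy = q[0] - vertex[0], q[1] - vertex[1]
    return ux * vx + uy * vy > 0

def is_acute_triangle(p):
    """True when triangle ABP has three acute angles."""
    return (angle_is_acute(A, B, p)
            and angle_is_acute(B, A, p)
            and angle_is_acute(p, A, B))

def estimate_acute_probability(trials, seed=0):
    """Monte Carlo estimate: hits (acute triangles) divided by total shots."""
    rng = random.Random(seed)
    hits = sum(is_acute_triangle((rng.random(), rng.random()))
               for _ in range(trials))
    return hits / trials

# Theoretical value under the adjacent-vertex assumption:
# the square's area (1) minus the half-disc of radius 1/2 on side AB.
THEORETICAL = 1 - math.pi / 8

if __name__ == "__main__":
    for trials in (100, 10_000, 200_000):
        print(trials, round(estimate_acute_probability(trials), 4))
    print("theoretical", round(THEORETICAL, 4))
```

Students who instead take A and B to be opposite vertices would need a different theoretical calculation; only the coordinates of `A` and `B` change in the simulation.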

18. Understand sampling and recognize its role in statistical claims.

• While studying United States history, students read about the prediction in the 1936 presidential race that Alfred Landon would defeat incumbent President Franklin Delano Roosevelt. They raise questions as to why that prediction was so far off and research how TV stations can forecast winners of some elections with a very small percentage of the voting results reported. Students contact local radio and TV stations and newspapers to discover how they determine their population sample.

19. Evaluate bias, accuracy, and reasonableness of data in real-world contexts.

• After reading the chapter on sampling in the book How to Lie With Statistics by Darrell Huff, students bring in ads, graphs, charts, and articles from newspapers which all make statements or claims allegedly based on data. Students examine the articles for information about the sample and identify those claims which may have little or no substantiation. They also discuss how the sample populations chosen could have influenced the outcomes.

• Students take statements such as "50% of the students failed the test," and "4 out of 5 dentists recommend" and discuss what data they would need to know in order to judge if the conclusions were reasonable. How many students took the test? How many dentists were queried? How were the students or dentists selected? What factors can be identified which would bias the results?

20. Understand and apply measures of dispersion and correlation.

• Students are presented with data gathered by an archaeologist at several sites. The data identifies the number of flintstones found at each site and the number of charred bones. The archaeologist claimed that the data showed that the flintstones were used to light the fires that charred the bones. Students produce a scatterplot, find the correlation between the two sets of figures, and use their work to support or criticize the claim.

• As an assessment activity using their journals, students respond to the claim that children with bigger feet spell better. They discuss whether they believe the claim is true, how statistics might have led to this claim, and whether it has any importance to a philosophy of language teaching.

21. Design a statistical experiment to study a problem, conduct the experiment, and interpret and communicate the outcomes.

• Based on a discussion among some members of the class, a question arises as to which are the most popular cars in the community. The students work in cooperative groups to design an experiment to gather the data, analyze the data, and design an appropriate report format for their results.

• Intrigued by the question "How long would it take dominoes set up one inch apart all the way across the room to fall?", the class designs an experiment to gather data on smaller sets of dominoes and then extrapolates to estimate the answer.

• Students have just finished a unit in which they discussed the capture-recapture method for estimating the population of wildlife. Part of their assessment for the unit is a project where they work in groups to design and conduct a simulation of the capture-recapture method. One group uses the method to determine the number of lollipops in a large bag.
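A group planning the lollipop project might first try the method on a computer, where the true count is known. The sketch below assumes the usual capture-recapture estimate (often called the Lincoln-Petersen estimate): population ≈ (number marked) × (size of second sample) ÷ (marked items found in the second sample). The bag size of 500 and the sample sizes of 60 are made-up values for illustration, and the function name is hypothetical.

```python
import random
from statistics import median

def capture_recapture_estimate(population_size, first_sample, second_sample, rng):
    """Simulate one capture-recapture trial and return the population estimate.

    Returns None when no marked item turns up in the second sample,
    because the estimate is undefined in that case.
    """
    items = range(population_size)
    marked = set(rng.sample(items, first_sample))   # capture, mark, return to the bag
    recaptured = rng.sample(items, second_sample)   # second random capture
    marked_again = sum(1 for item in recaptured if item in marked)
    if marked_again == 0:
        return None
    return first_sample * second_sample / marked_again

if __name__ == "__main__":
    rng = random.Random(2024)
    true_size = 500  # hypothetical number of lollipops in the bag
    estimates = [capture_recapture_estimate(true_size, 60, 60, rng)
                 for _ in range(1000)]
    usable = [e for e in estimates if e is not None]
    print("true size:", true_size)
    print("median estimate:", round(median(usable)))
    print("smallest and largest estimates:", round(min(usable)), round(max(usable)))
```

The spread between the smallest and largest estimates gives students a concrete reason to discuss how sample size affects the reliability of a single trial.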

22. Make predictions using curve fitting and numerical procedures to interpolate and extrapolate from known data.

• Students are presented with this data comparing students' test grades to the number of hours each studied.

```
Hours   1   2   3   4   4   6   8   9  10  10  12  12
Grade  60  55  65  65  77  80  83  80  75  90  72  80
```

Earlier in the year they had produced a line of best fit for the data, but they had recognized that it was not a good model for the data. Now the students use calculators to help them fit a quadratic curve to the data and discuss the advantages it has over the straight line.
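For classes that want to see what a calculator's regression command might be doing, the following Python sketch fits both a straight line and a quadratic to the hours-and-grade table by least squares and compares the sum of squared errors. It assumes an ordinary least-squares fit solved through the normal equations; a particular calculator may use a different numerical method, and the function names here are illustrative.

```python
HOURS = [1, 2, 3, 4, 4, 6, 8, 9, 10, 10, 12, 12]
GRADES = [60, 55, 65, 65, 77, 80, 83, 80, 75, 90, 72, 80]

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] for c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Normal equations: sums of powers of x on the left, sums of y*x^i on the right.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs

def predict(coeffs, x):
    return sum(c * x ** power for power, c in enumerate(coeffs))

def sum_squared_error(xs, ys, coeffs):
    return sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))

if __name__ == "__main__":
    for name, degree in (("line", 1), ("quadratic", 2)):
        coeffs = fit_polynomial(HOURS, GRADES, degree)
        error = sum_squared_error(HOURS, GRADES, coeffs)
        print(name, [round(c, 3) for c in coeffs], "SSE =", round(error, 1))
```

A quadratic can never have a larger sum of squared errors than the best line on the same data, so the class discussion should also ask whether the improvement is large enough, and the curve's shape plausible enough, to justify the extra term.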

• Students conduct an experiment where they suspend a weight on a string from a hook in the doorway. They swing their homemade pendulum and time how long it takes for it to swing 10 times. They had performed this experiment in 8th grade and used the median-median line fit method to model the data. In revisiting the problem, the teacher insists they use very short lengths and very long lengths in addition to various ones in between. When the data is graphed, it becomes apparent that the data is not linear and would fit a quadratic curve better. (The median-median line, available on many calculators which have statistics capabilities, is found by dividing the data points on the x-y plane into three equal sets, grouped by x-value, finding a single point for each set whose coordinates are the medians of the respective coordinates of the points in the set, connecting the first and third points by a straight line, and shifting this line 1/3 of the way toward the second point. See Contemporary Precalculus Through Applications.)

• Students perform an Introductory Physical Science experiment where water is cooled by adding ice cubes and stirring. A Calculator Based Lab temperature probe is attached to a graphing calculator which is programmed to gather the data. Students (or each group of students) link their calculators to the original one to transfer the data to their calculators. They use the statistics functions to perform a quadratic fit, an exponential fit, and a logarithmic fit, and use the function graph capabilities to determine which is the best model.

• Students work through the What's My Line unit described in the Keys to Success in the Classroom chapter of the Framework. They use median-median and regression lines to estimate the height of a person whose thigh bone was found in a dig.
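The median-median procedure described in the pendulum activity above can be sketched as follows. Because the thigh-bone data are not reproduced here, the sketch is applied, purely for illustration, to the hours-and-grade table from the first activity under this indicator. Two details are assumptions rather than part of the description: when the number of points is not a multiple of three, a single leftover point goes to the middle group and two leftover points go to the outer groups; and points are sorted by x-value and then by y-value, so tied x-values can fall into different groups.

```python
from statistics import median

def median_median_line(points):
    """Return (slope, intercept) of the median-median line through the points."""
    pts = sorted(points)                      # group by x-value (ties broken by y)
    n = len(pts)
    size, extra = divmod(n, 3)
    outer = size + (1 if extra == 2 else 0)   # assumed rule for leftover points
    groups = (pts[:outer], pts[outer:n - outer], pts[n - outer:])

    # One summary point per group: (median of x-values, median of y-values).
    (x1, y1), (x2, y2), (x3, y3) = [
        (median(x for x, _ in g), median(y for _, y in g)) for g in groups
    ]

    slope = (y3 - y1) / (x3 - x1)             # line through first and third points
    outer_intercept = y1 - slope * x1
    # Shift the line one third of the way toward the middle summary point.
    gap = y2 - (slope * x2 + outer_intercept)
    return slope, outer_intercept + gap / 3

if __name__ == "__main__":
    hours = [1, 2, 3, 4, 4, 6, 8, 9, 10, 10, 12, 12]
    grades = [60, 55, 65, 65, 77, 80, 83, 80, 75, 90, 72, 80]
    slope, intercept = median_median_line(zip(hours, grades))
    print(f"grade = {slope:.3f} * hours + {intercept:.3f}")
```

The function needs at least three points whose outer groups have different median x-values; otherwise the slope is undefined.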

23. Use relative frequency and probability, as appropriate, to represent and solve problems involving uncertainty.

• After a unit where dependent and independent events were detailed, students are challenged by a problem containing this excerpt from The Miami Herald of May 5, 1983.

An airline jet carrying 172 people between Miami and Nassau lost its engine oil, power, and 12,000 feet of altitude over the Atlantic Ocean before a safe recovery was made.

When all three engines' low oil pressure warning lights all lit up at nearly the same time, the crew's initial reaction was that something was wrong with the indicator system, not the oil pressure.

They considered the possibility of a malfunction in the indication system because it's such an unusual thing to see all three with low pressure indications. The odds are so great that you won't get three indications like this. The odds are way out of sight, so the first thing you would suspect is a problem with the indication system.

Aviation records show that the probability of an engine failure in any particular hour is about 0.00004. If the failures of three engines were independent, what would the probability be of them failing within one hour? Discuss why the speaker in the article would refer to such a probability as "way out of sight." Discuss situations which might make the failures of three engines not independent events.
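As a check on the first question, if the three failures really were independent and each had the stated probability of 0.00004 in a given hour, the multiplication rule would give:

$$P(\text{all three engines fail in the same hour}) = (0.00004)^3 = (4 \times 10^{-5})^3 = 6.4 \times 10^{-14}$$

That is roughly 6 chances in 100 trillion, which is the sense in which the odds are "way out of sight." The calculation holds only under the independence assumption, which is exactly what the final question asks students to challenge.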

• Students keep a record of their trips through the town and whether or not they have to stop at each of the four traffic lights. After one month, the data is grouped and studied. They use their data to determine whether the timing of the lights is independent or not.

• While discussing the issue of mandatory drug testing in social studies, students examine the probability of misdiagnosing people as having AIDS with a test that would identify 99% of those who are true positives and misdiagnose 3% of those who don't have AIDS. They examine situations where the prevalence of the disease is 50%, 10%, and 1% using 100,000 people as a base. They discuss the fact that, at the 1% level, 75% of the people identified as having AIDS would be false positives, the implications that fact has on mandatory testing, and potential ways to improve the predictive value of testing.
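The counts behind the testing activity above can be tabulated with a short Python sketch. It uses only the figures given in the activity (a base of 100,000 people, 99% of true cases detected, 3% of unaffected people misdiagnosed); the function and parameter names are illustrative.

```python
def screening_counts(prevalence, population=100_000,
                     detection_rate=0.99, misdiagnosis_rate=0.03):
    """Return (true positives, false positives, share of positives that are false)."""
    affected = population * prevalence
    unaffected = population - affected
    true_positives = affected * detection_rate
    false_positives = unaffected * misdiagnosis_rate
    share_false = false_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_false

if __name__ == "__main__":
    for prevalence in (0.50, 0.10, 0.01):
        tp, fp, share = screening_counts(prevalence)
        print(f"prevalence {prevalence:.0%}: "
              f"{tp:,.0f} true positives, {fp:,.0f} false positives, "
              f"{share:.1%} of positive results are false")
```

At a prevalence of 1% the sketch gives 990 true positives and 2,970 false positives, so 75% of positive results are false, matching the figure in the activity; at 10% and 50% the share falls to about 21% and 3%.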

24. Use simulations to estimate probabilities.

• Students derive the theoretical probability of winning the New Jersey Pick 6 lottery and then write a computer program to simulate the lottery. The students enter the winning numbers and the computer generates sets of 6 numbers until it hits the winning combination. The computer prints out the number of sets generated including the winning one. Students run the program several times, attempting to verify experimentally the theoretical probability they derived.
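A program of the kind described above might look like the following Python sketch. The size of the number pool is left as a parameter because it is not stated here, and the demonstration uses a made-up, scaled-down game (pick 3 numbers from 10) so that it finishes quickly; with a realistic pool a single run can take millions of tickets. The function names are hypothetical.

```python
import math
import random

def theoretical_win_probability(pool_size, picks=6):
    """Chance that one random ticket matches all of the winning numbers."""
    return 1 / math.comb(pool_size, picks)

def tickets_until_win(winning_numbers, pool_size, rng):
    """Generate random tickets until one matches the winning set.

    Returns the number of tickets generated, including the winning one.
    """
    target = frozenset(winning_numbers)
    numbers = range(1, pool_size + 1)
    count = 0
    while True:
        count += 1
        if frozenset(rng.sample(numbers, len(target))) == target:
            return count

if __name__ == "__main__":
    rng = random.Random(7)
    pool_size, winning = 10, [2, 5, 9]   # hypothetical scaled-down game
    runs = [tickets_until_win(winning, pool_size, rng) for _ in range(500)]
    print("theoretical probability:", theoretical_win_probability(pool_size, 3))
    print("average tickets needed :", sum(runs) / len(runs))
    print("1 / average            :", len(runs) / sum(runs))
```

Individual runs vary widely, so students should compare the reciprocal of the average over many runs, not the result of any single run, with the theoretical probability.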

25. Create and interpret discrete and continuous probability distributions, and understand their application to real-world situations.

• Students work on a project where they pick one form of insurance (life, car, home), and determine the variables which affect the premiums they would need to pay for this type of insurance and what it would cost for them to obtain it. Using their research, they write an essay summarizing how insurance companies use statistics and probabilities to determine their rates.

• An article in Consumer Reports indicates that 25% of 5-lb bags of sugar from a particular company are underweight. The class works with the local supermarket to develop and perform a consumer research project. Each group is given a commodity to study (e.g., potato chips, sugar). They design a method for randomly selecting and testing whether the product matches the claimed specifications or not. They use their data to determine the probability that a randomly selected bag would be underweight.

• Students repeatedly extract five marbles from a bag containing 10 red and 10 blue marbles, and each time record the number of marbles of each color obtained. They combine the data for the entire class, tabulating the number of times there were 0, 1, 2, 3, 4, and 5 red marbles, and the percentages for each number. They compare their percentages to the theoretical percentages for this binomial distribution, and make the connection to the fifth row of Pascal's triangle.
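The theoretical percentages for the marble activity depend on how the five marbles are drawn, which the activity leaves open. The Python sketch below treats both readings as assumptions: if each marble is returned to the bag before the next draw, the counts follow the binomial distribution given by the fifth row of Pascal's triangle (1, 5, 10, 10, 5, 1 out of 32); if all five are removed together, the counts follow a hypergeometric distribution that is similar but not identical. Function names are illustrative.

```python
import random
from math import comb

def binomial_pmf(reds, draws=5, p_red=0.5):
    """P(reds) when each marble is replaced before the next draw."""
    return comb(draws, reds) * p_red ** reds * (1 - p_red) ** (draws - reds)

def hypergeometric_pmf(reds, red=10, blue=10, draws=5):
    """P(reds) when the marbles are drawn together, without replacement."""
    return comb(red, reds) * comb(blue, draws - reds) / comb(red + blue, draws)

def simulate_red_counts(trials, with_replacement, rng):
    """Relative frequency of 0..5 red marbles over many simulated draws of five."""
    bag = ["red"] * 10 + ["blue"] * 10
    counts = [0] * 6
    for _ in range(trials):
        hand = rng.choices(bag, k=5) if with_replacement else rng.sample(bag, 5)
        counts[hand.count("red")] += 1
    return [c / trials for c in counts]

if __name__ == "__main__":
    rng = random.Random(11)
    with_repl = simulate_red_counts(20_000, True, rng)
    without_repl = simulate_red_counts(20_000, False, rng)
    print("reds  binomial  simulated  hypergeometric  simulated")
    for k in range(6):
        print(f"{k:4d}  {binomial_pmf(k):8.4f}  {with_repl[k]:9.4f}  "
              f"{hypergeometric_pmf(k):14.4f}  {without_repl[k]:9.4f}")
```

Comparing the two theoretical columns gives the class a way to decide which drawing procedure their own data resemble more closely.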

26. Describe the normal curve in general terms, and use its properties to answer questions about sets of data that are assumed to be normally distributed.

• Students describe a typical student in the school. To do this, they first select a random sample of 30 students in their school. They then survey their sample for information they believe necessary to identify what would be "typical." Finally, they use appropriate displays and descriptive statistics to support their representation of a typical student.

• Students are introduced to the "central limit theorem" through this problem:

A worker on the assembly line at Western Digital is involved in industrial sabotage by weakening a soldering joint that causes a hard drive to fail after 5 hours of use. At his station, he actually comes in contact with 30% of the drives produced. The other 70% will last 100 hours. If they are packed randomly in boxes of 36, what would be the average expected lifespan of the drives in the box?

Students prepare simulations of the problem by repeatedly extracting 36 cubes at random from a bag containing 30 yellow and 70 red cubes and calculating the average lifespan for each selection. They discover that the averages are approximately normally distributed, with a mean of roughly 70 hours (the expected value is 0.3 × 5 + 0.7 × 100 = 71.5 hours).
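The cube-drawing simulation can also be run on a computer, which makes it practical to fill thousands of boxes. The Python sketch below follows the setup in the problem (a bag of 30 cubes standing for 5-hour drives and 70 standing for 100-hour drives, 36 drawn per box without replacement); the function name and the number of boxes are illustrative choices. The theoretical mean is 0.3 × 5 + 0.7 × 100 = 71.5 hours.

```python
import random
from statistics import mean, stdev

def simulated_box_means(num_boxes, box_size=36, seed=0):
    """Average lifespan (hours) of the drives in each simulated box."""
    rng = random.Random(seed)
    bag = [5] * 30 + [100] * 70          # 30 sabotaged drives, 70 sound drives
    return [mean(rng.sample(bag, box_size)) for _ in range(num_boxes)]

if __name__ == "__main__":
    averages = simulated_box_means(5000)
    center, spread = mean(averages), stdev(averages)
    within_one = sum(abs(a - center) <= spread for a in averages) / len(averages)
    print("mean of box averages     :", round(center, 2))
    print("std. dev. of box averages:", round(spread, 2))
    print("share within one std. dev.:", round(within_one, 3))
```

For a normal curve about 68% of values lie within one standard deviation of the mean, so the last printed figure gives a quick check on how nearly normal the box averages are, even though each individual drive lasts either 5 or 100 hours.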

• Students are given the administrator's summary of the school's standardized tests. Each group is given one area on which to focus. They prepare a presentation they would give to the Board of Education discussing the comparisons between local norms, national norms, suburban norms, urban norms, and independent norms using their understanding of normal distribution, percentile ranks, and graphical displays.

27. Understand and use the law of large numbers (that experimental results tend to approach theoretical probabilities after a large number of trials).

• Students are given two dice, each a different color, and roll them repeatedly. For each roll, they record the result for each individual die as well as the total. After a large number of rolls they compare their relative frequencies to the expected outcomes. Then they combine the totals for the entire class and compare the experimental results with the theoretical predictions.

• Students are presented with a paper containing the following gambler's formula: "When playing roulette, bet red. If red does not win, double the bet on red. Continue in this manner." They evaluate whether the formula makes sense, identify potential problems and limitations, and discuss the fallacy that the odds of red appearing on the next spin improve every time red doesn't win.
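The two-dice activity above lends itself to a computer extension in which the number of rolls can be made far larger than a class could manage by hand. The Python sketch below, with illustrative names, compares the relative frequency of each total with its theoretical probability and reports the largest gap for increasing numbers of rolls.

```python
import random
from collections import Counter

# Theoretical probability of each total from two fair dice:
# 1/36 for 2 and 12, rising to 6/36 for 7.
THEORETICAL = {total: (6 - abs(total - 7)) / 36 for total in range(2, 13)}

def largest_gap(num_rolls, rng):
    """Largest difference between a total's relative frequency and its probability."""
    totals = Counter(rng.randint(1, 6) + rng.randint(1, 6)
                     for _ in range(num_rolls))
    return max(abs(totals[t] / num_rolls - p) for t, p in THEORETICAL.items())

if __name__ == "__main__":
    rng = random.Random(12)
    for num_rolls in (36, 360, 3_600, 36_000, 360_000):
        print(f"{num_rolls:7d} rolls: largest gap = {largest_gap(num_rolls, rng):.4f}")
```

The gap tends to shrink as the number of rolls grows, but not on every step and never to exactly zero, which is a useful point to raise when students discuss what the law of large numbers does and does not promise.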

### References

Burrill, Gail, et al. Data Analysis and Statistics Across the Curriculum. A component of the Curriculum and Evaluation Standards for School Mathematics Addenda Series, Grades 9-12. Reston, VA: National Council of Teachers of Mathematics, 1992.

Huff, D. How to Lie with Statistics. New York: Norton, 1954.

Paulos, J. A. Innumeracy: Mathematical Illiteracy and its Consequences. New York: Hill and Wang, 1988.

The North Carolina School of Science and Mathematics. Contemporary Precalculus Through Applications. Providence, RI: Janson Publications, 1991.

### On-Line Resources

http://dimacs.rutgers.edu/nj_math_coalition/framework.html/

The Framework will be available at this site during Spring 1997. In time, we hope to post additional resources relating to this standard, such as grade-specific activities submitted by New Jersey teachers, and to provide a forum to discuss the Mathematics Standards.