Penny Pinching: Statistical Treatment of Data

By: Julie Wilhelmsen

Objective:

The objective of experiment one is to determine whether the year in which a penny was minted is related to its mass. Each lab group had a set of 20 pennies and recorded the mass and mint year of each one. The data from all lab groups were then combined for statistical analysis. Using the mean, standard deviation, Q-test, and t-test, I will determine whether there is a relation between a penny’s mass and the year it was minted. I predict that newer pennies will weigh less than pennies minted many years ago.

Data:

Table 1: Individual Lab Group Data

Penny #	Year	Mass (g)
1	1980	3.0841
2	2005	2.4948
3	1982	3.0996
4	1996	2.5038
5	1993	2.4829
6	1988	2.4724
7	1984	2.5723
8	1995	2.5191
9	1990	2.5269
10	1979	3.1068
11	1999	2.4890
12	1995	2.4901
13	1984	2.5202
14	1988	2.5154
15	1983	2.5108
16	1974	3.0823
17	1991	2.4502
18	1993	2.4807
19	1985	2.4657
20	1963	3.0910

Table 2: Class Data

Penny #	Year	Mass (g)	Penny #	Year	Mass (g)
1	1988	2.4444	41	1994	2.4624
2	1993	2.5002	42	1996	2.4743
3	1990	2.4990	43	1991	2.5226
4	1990	2.4711	44	1979	3.0753
5	1959	3.0603	45	1968	3.0454
6	1994	2.4732	46	1990	2.4991
7	1993	2.4857	47	1995	2.4516
8	2001	2.4843	48	1985	2.5118
9	1993	2.5222	49	2005	2.5108
10	2001	2.5035	50	1969	3.0703
11	1982	3.1035	51	1972	3.0694
12	1974	3.1047	52	1984	2.483
13	1989	2.5413	53	1996	2.4724
14	1989	2.4786	54	1982	3.116
15	1986	2.4923	55	1982	3.1009
16	1982	3.0563	56	1987	2.5218
17	1988	2.4533	57	1999	2.4892
18	1973	3.1142	58	1988	2.4492
19	1964	3.1085	59	1987	2.5223
20	1972	3.1025	60	1994	2.5136
21	1981	3.0091	61	1979	3.1048
22	1991	2.5374	62	1988	2.5082
23	1986	2.5418	63	1999	2.5111
24	1978	3.1921	64	1995	2.4893
25	2006	2.4812	65	1970	3.1001
26	1982	3.1121	66	2005	2.4979
27	1980	3.1286	67	1996	2.5063
28	1975	3.1243	68	1977	3.1096
29	1996	2.5083	69	1989	2.4948
30	2005	2.5084	70	1975	3.1283
31	1995	2.5006	71	1983	2.5272
32	1978	3.1169	72	1973	3.0811
33	1982	3.096	73	1995	2.4991
34	1959	3.1042	74	1978	3.1055
35	1980	3.0782	75	1989	2.5331
36	1989	2.4955	76	1982	3.1122
37	1990	2.5159	77	1986	2.4782
39	1993	2.5036	79	2003	2.5107
40	2001	2.4992	80	1994	2.517
Penny #	Year	Mass (g)	Penny #	Year	Mass (g)
81	1987	2.4966	121	1994	2.4963
82	1984	2.5722	122	1992	2.5107
83	1990	2.5288	123	1977	3.1174
84	1984	2.5001	124	2003	2.4819
85	1973	3.0682	125	1968	3.0723
86	1991	2.5163	126	2000	2.4884
87	2000	2.4757	127	1996	2.4955
88	1983	2.5606	128	1998	2.4758
89	1990	2.5261	129	1974	3.0957
90	1988	2.4743	130	2001	2.4866
91	1990	2.4639	131	1982	3.115
92	2001	2.5159	132	1979	3.0664
93	1986	2.5069	133	1988	2.9095
94	1989	2.4943	134	1992	2.4875
95	1996	2.4903	135	1975	3.1285
96	1975	3.1156	136	1984	2.5368
97	1986	2.5199	137	1963	3.0583
98	1980	3.1549	138	1994	2.5095
99	1990	2.4611	139	1998	2.5221
100	1997	2.5042	140	1980	3.0947
101	1980	3.0841	141	1994	2.4963
102	2005	2.4948	142	1992	2.5107
103	1982	3.0996	143	1977	3.1174
104	1996	2.5038	144	2003	2.4819
105	1993	2.4829	145	1968	3.0723
106	1988	2.4724	146	2000	2.4884
107	1984	2.5723	147	1996	2.4955
108	1995	2.5191	148	1998	2.4758
109	1990	2.5269	149	1974	3.0957
110	1979	3.1068	150	2001	2.4866
111	1999	2.489	151	1982	2.115
112	1995	2.4901	152	1979	3.0664
113	1984	2.5202	153	1988	2.5095
114	1988	2.5154	154	1992	2.4875
115	1983	2.5108	155	1975	2.1285
116	1974	3.0823	156	1984	2.5368
117	1991	2.4502	157	1963	3.0583
119	1985	2.4657	159	1998	2.5221
120	1963	3.091	160	1980	3.0947

Calculations and Graphs:

Constants Used in Calculations:

Qcrit (n=20) at 90% confidence = 0.300

This value was taken from the Q-test table provided by Dr. Schug. The table was adapted from D.B. Rorabacher, Anal. Chem., 63 (1991) 139.

ttable (DOF = ∞) at 99% confidence level = 2.576

Q-Test Calculations for Individual Data

Qcalc = Gap/Range

Gap (low) = 2.4657 – 2.4502 = 0.0155 g

Gap (high) = 3.1068 – 3.0996 = 0.0072 g

Range (high – low) = 3.1068 – 2.4502 = 0.6566 g

Qcalc(low) = 0.0155/0.6566 = 0.0236

Qcalc(low) < Qcrit

Qcalc(high) = 0.0072/0.6566 = 0.011

Qcalc(high) < Qcrit

Q calculated in both cases (high and low) is less than Q critical (table), so we retain both values.
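The Q-test arithmetic above can be checked with a short script. This is a minimal sketch: the masses are copied from Table 1, the critical value is the one quoted in this report, and the function name `dixon_q` is a hypothetical helper chosen for illustration.

```python
# Masses (g) copied from Table 1 of this report.
masses = [
    3.0841, 2.4948, 3.0996, 2.5038, 2.4829, 2.4724, 2.5723, 2.5191,
    2.5269, 3.1068, 2.4890, 2.4901, 2.5202, 2.5154, 2.5108, 3.0823,
    2.4502, 2.4807, 2.4657, 3.0910,
]

# Critical value quoted in the report for n = 20 at 90% confidence.
Q_CRIT = 0.300


def dixon_q(values):
    """Return (Q for the lowest value, Q for the highest value)."""
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    q_low = (ordered[1] - ordered[0]) / data_range     # gap (low) / range
    q_high = (ordered[-1] - ordered[-2]) / data_range  # gap (high) / range
    return q_low, q_high


q_low, q_high = dixon_q(masses)
for label, q in (("low", q_low), ("high", q_high)):
    verdict = "retain" if q < Q_CRIT else "reject"
    print(f"Qcalc({label}) = {q:.4f} -> {verdict}")
```

Sorting the data first makes the two gaps easy to read off as the differences between the two smallest and the two largest values.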

Mean of individual data=2.6479 g

Mean = ∑xi/n

∑ xi = sum of measured values

n = number of measurements

Standard deviation (s)=0.2648 g

s = √[∑(xi – x̄)² / (n – 1)], where x̄ is the mean
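The mean and standard deviation reported for the individual data can be reproduced as follows. This sketch assumes the sample standard deviation (dividing by n – 1), which is the form that reproduces the 0.2648 g reported here; the variable names are illustrative.

```python
import statistics

# Masses (g) copied from Table 1 of this report.
masses = [
    3.0841, 2.4948, 3.0996, 2.5038, 2.4829, 2.4724, 2.5723, 2.5191,
    2.5269, 3.1068, 2.4890, 2.4901, 2.5202, 2.5154, 2.5108, 3.0823,
    2.4502, 2.4807, 2.4657, 3.0910,
]

mean_mass = statistics.mean(masses)   # sum of the values divided by n
std_mass = statistics.stdev(masses)   # sample standard deviation (n - 1)

print(f"mean = {mean_mass:.4f} g")
print(f"s    = {std_mass:.4f} g")
```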

Table 3: Low and High Frequency Table for Class Data

Range (g) low distribution	# of Pennies	Range (g) high distribution	# of Pennies
2.100 to 2.325	2	2.800 to 2.825	0
2.325 to 2.350	0	2.825 to 2.850	0
2.350 to 2.375	0	2.850 to 2.875	0
2.375 to 2.400	0	2.875 to 2.900	0
2.400 to 2.425	0	2.900 to 2.925	1
2.425 to 2.450	2	2.925 to 2.950	0
2.450 to 2.475	13	2.950 to 2.975	0
2.475 to 2.500	40	2.975 to 3.000	0
2.500 to 2.525	37	3.000 to 3.025	1
2.525 to 2.550	10	3.025 to 3.050	1
2.550 to 2.575	3	3.050 to 3.075	11
2.575 to 2.600	0	3.075 to 3.100	13
2.600 to 2.625	0	3.100 to 3.125	21
2.625 to 2.650	0	3.125 to 3.150	3
2.650 to 2.675	0	3.150 to 3.175	1
2.675 to 2.700	0	3.175 to 3.200	1
2.700 to 2.725	0	3.200 to 3.225	0
2.725 to 2.750	0	3.225 to 3.250	0
2.750 to 2.775	0	3.250 to 3.275	0
2.775 to 2.800	0	3.275 to 3.300	0
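A frequency table like Table 3 can be built by sorting each mass into a 0.025 g bin. The sketch below uses only the first six masses of Table 2 as sample input. It assumes that a mass falling exactly on a bin boundary belongs to the upper bin, a convention this report does not state. The helper `bin_label` is hypothetical.

```python
from collections import Counter

BIN_WIDTH_UNITS = 250  # 0.025 g expressed in units of 0.0001 g


def bin_label(mass):
    """Return the 'low to high' label of the 0.025 g bin that contains mass."""
    units = round(mass * 10000)  # integer units avoid floating-point edge cases
    low = (units // BIN_WIDTH_UNITS) * BIN_WIDTH_UNITS
    high = low + BIN_WIDTH_UNITS
    return f"{low / 10000:.3f} to {high / 10000:.3f}"


# First six masses (g) from Table 2, used here only as sample input.
sample = [2.4444, 2.5002, 2.4990, 2.4711, 3.0603, 2.4732]

counts = Counter(bin_label(m) for m in sample)
for label in sorted(counts):
    print(f"{label}\t{counts[label]}")
```

Working in integer units of 0.0001 g matches the four decimal places of the balance readings and keeps the bin edges exact.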

Figure 1: Low Frequency of Class Data (histogram image not reproduced)

Figure 2: High Frequency of Class Data (histogram image not reproduced)

In both figures, the x-axis is the mass range in grams and the y-axis is the number of pennies.

Mean and Standard Deviation Calculations

Individual Data

The masses recorded in Table 1 span a range of values, so the Q-test was used to check whether the highest or the lowest value was an outlier. The test takes the gap between the suspect value and its nearest neighbor and divides it by the total range of the data, which gives Qcalc. Qcalc is then compared with the critical value, Qcrit, from the table. For both the highest and the lowest mass, Qcalc was less than Qcrit, so no values were discarded.

The mean mass of the pennies was 2.6479 g. The mean is the average mass of all the pennies in the set. The standard deviation measures how widely the individual masses are spread around the mean, and it was 0.2648 g. A low standard deviation means that most of the masses are close to the mean.

Results

After reviewing the compiled data from the class, it was evident that the masses fell into two groups, one near 2.5 grams and one near 3.1 grams. The histograms (Figure 1 and Figure 2) show the frequency distribution of the penny masses. To analyze the data completely, more tests and calculations were done.

The mean and the standard deviation of the low frequency distribution of penny masses were calculated. The mean was 2.493 g and the standard deviation was 0.0569 g. Because the class data set contains many more pennies than the individual set, its mean and standard deviation are better estimates of the actual values.

The mean and the standard deviation of the high frequency distribution of penny masses were also calculated. The mean was 3.092 g and the standard deviation was 0.0385 g. Within one standard deviation of the mean, the masses of the high frequency distribution range from 3.0535 g to 3.1305 g.

After the two distributions were characterized, a t-test was needed to verify that they are actually different. First, the pooled standard deviation (Spooled) was calculated, because it is needed in the tcalc equation. The calculated t-value was then compared with the t-table value. Because ttable was less than tcalc at the 99% confidence level, it can be concluded that there are two distinct distributions.
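The t-test comparison can be sketched as below. The report does not show its equations, so the code assumes the standard pooled-variance form of the comparison-of-means t-test. The means and standard deviations are the values reported above. The counts of 107 and 53 pennies are an assumption, tallied from the columns of Table 3 with the repeated 2.400 to 2.425 row counted once, because the sample sizes are not stated. The function names are hypothetical.

```python
import math


def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation of two samples."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def t_calc(mean1, s1, n1, mean2, s2, n2):
    """Calculated t for comparing two means using the pooled standard deviation."""
    s_pool = pooled_std(s1, n1, s2, n2)
    return abs(mean1 - mean2) / s_pool * math.sqrt(n1 * n2 / (n1 + n2))


T_TABLE_99 = 2.576  # t-table value quoted in the report (DOF = infinity, 99%)

# Reported means and standard deviations (g); counts are ASSUMED from Table 3.
low = {"mean": 2.493, "s": 0.0569, "n": 107}
high = {"mean": 3.092, "s": 0.0385, "n": 53}

s_pool = pooled_std(low["s"], low["n"], high["s"], high["n"])
t = t_calc(low["mean"], low["s"], low["n"], high["mean"], high["s"], high["n"])

print(f"s_pooled = {s_pool:.4f} g")
print(f"t_calc   = {t:.1f}")
print("distinct distributions" if t > T_TABLE_99 else "not distinguishable")
```

With any plausible counts, the gap between the two means is many pooled standard deviations wide, so the conclusion does not depend on the exact values assumed for n.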

Conclusions:

It can be concluded that there is a relation between a penny’s mass and the year it was minted. I predicted that newer pennies would weigh less than pennies minted many years ago, and the data support this prediction.
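The relation between mass and year can be illustrated with the individual data. This sketch splits the Table 1 pennies at 2.8 g, the boundary between the low and high distributions in Table 3, and lists the mint years in each group. The variable names are illustrative.

```python
# (year, mass in g) pairs copied from Table 1 of this report.
pennies = [
    (1980, 3.0841), (2005, 2.4948), (1982, 3.0996), (1996, 2.5038),
    (1993, 2.4829), (1988, 2.4724), (1984, 2.5723), (1995, 2.5191),
    (1990, 2.5269), (1979, 3.1068), (1999, 2.4890), (1995, 2.4901),
    (1984, 2.5202), (1988, 2.5154), (1983, 2.5108), (1974, 3.0823),
    (1991, 2.4502), (1993, 2.4807), (1985, 2.4657), (1963, 3.0910),
]

THRESHOLD_G = 2.8  # boundary between the low and high columns of Table 3

heavy_years = sorted(year for year, mass in pennies if mass >= THRESHOLD_G)
light_years = sorted(year for year, mass in pennies if mass < THRESHOLD_G)

print(f"heavy pennies: {len(heavy_years)}, years {heavy_years[0]} to {heavy_years[-1]}")
print(f"light pennies: {len(light_years)}, years {light_years[0]} to {light_years[-1]}")
```

For the Table 1 data, the heavy group spans 1963 to 1982 and the light group spans 1983 to 2005, so the two mass groups do not overlap in mint year.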

Both the individual and the class data show two distinct average masses. For both data sets, the mean and the standard deviation were calculated to indicate how widely the measurements are spread. The individual data had the higher standard deviation, which means it is less precise because it contains fewer measurements. Adding more measurements decreased the standard deviation. The Q-test on the individual data indicated that no values should be discarded. The masses that lie far from the mean were most likely measured incorrectly (human error).

I was able to conclude that the class data lie closer to the mean than the individual data do. This is evident in Figures 1 and 2, each of which shows a roughly Gaussian distribution. This shows that enough measurements were taken to account for random error, if any. When I examined the t-test results more closely, I was able to conclude that there are two distinct distributions at the 99% confidence level.