Norm-referenced, or standardized, evaluation tools are used when we want to measure an individual child's trait, skill, or ability in comparison to a large population of children of similar age or characteristics. A standardized test is normed on a large sample of the population, and the group on whom the test is normed is of utmost importance. Users of norm-referenced instruments want to be sure that the sample is large enough and representative of the group being assessed. Most test developers consider age, gender, race, parental education, socioeconomic status, and geographical distribution in determining a national sample. For example, one would not want to use a test with diverse children in Florida that was normed only on children who lived in the Northeast and were largely Caucasian and female.
The Buros Mental Measurements Yearbook, published periodically, provides technical information (standardization data, reliability, and validity) on tests. Consulting Buros can be helpful before purchasing an evaluation or assessment instrument. Note: The website does not report technical data.
Figure 1 shows a normal distribution (the bell-shaped curve). The normal curve represents the distribution of a large number of human attributes, including height, weight, and, in this case, test scores. The line in the center of the curve is the mean, which is the average score in the distribution. The pattern shows that most children score near the average: about 68% fall within one standard deviation above or below the mean.
Terminology Related to Standardized Assessment
Converted Scores  When we give a standardized assessment instrument, we get a raw score. By itself, a raw score has little meaning to the person interpreting it. We therefore statistically convert raw scores so that an individual can be compared to the population on which the test was standardized. Types of converted scores are standard scores, percentiles, and age scores for infants and toddlers.
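The conversion described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular test: the norm-group mean and standard deviation are made-up numbers, the function names are hypothetical, and the standard-score scale (mean 100, standard deviation 15) is an assumption. Published instruments typically supply age-based norm tables rather than a single formula.

```python
def to_z_score(raw_score: float, norm_mean: float, norm_sd: float) -> float:
    """Distance of a raw score from the norm-group mean, in standard deviation units."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw_score - norm_mean) / norm_sd


def to_standard_score(z: float, scale_mean: float = 100.0, scale_sd: float = 15.0) -> float:
    """Re-express a z-score on a standard-score scale (assumed here: mean 100, SD 15)."""
    return scale_mean + scale_sd * z


# Hypothetical norm group: mean raw score 40, standard deviation 8.
z = to_z_score(raw_score=32, norm_mean=40, norm_sd=8)
print(z)                     # -1.0  (one standard deviation below the mean)
print(to_standard_score(z))  # 85.0
```

The same raw score of 32 would convert differently against a different norm group, which is why the composition of the norming sample matters.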
Standard Deviation  The standard deviation represents the spread of the distribution around the mean. It measures how far scores typically depart from the mean. In Figure 1, 34.13% of individuals fall within one standard deviation below the mean and 34.13% fall within one standard deviation above it, so about 68% of all scores lie within one standard deviation of the mean. If we go two standard deviations below and two standard deviations above the mean, we have taken into account about 95% of all scores.
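The areas under the normal curve can be checked with Python's standard library. The sketch below assumes a perfectly normal distribution, which real test data only approximate; the helper name share_within is illustrative.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Proportion of scores lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.2%}")
# within 1 SD of the mean: 68.27%
# within 2 SD of the mean: 95.45%
# within 3 SD of the mean: 99.73%
```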
The eligibility criteria in Early Steps for a developmental delay are that the child must score 1.5 standard deviations below the mean in at least one of the following areas: cognition, physical/motor, communication, social-emotional, or adaptive development.
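To make the 1.5 standard deviation criterion concrete, the sketch below translates it into a standard score and an approximate percentile. It assumes a standard-score scale with mean 100 and standard deviation 15, a normal distribution, and that "1.5 standard deviations below" means at or below that point; none of these details come from the program itself, so actual eligibility decisions should follow the test manual and program policy.

```python
from statistics import NormalDist

CUTOFF_SD = -1.5  # from the text: 1.5 standard deviations below the mean


def cutoff_standard_score(scale_mean: float = 100.0, scale_sd: float = 15.0) -> float:
    """Standard score that sits exactly 1.5 SD below the mean (scale is an assumption)."""
    return scale_mean + CUTOFF_SD * scale_sd


def at_or_below_cutoff(standard_score: float,
                       scale_mean: float = 100.0,
                       scale_sd: float = 15.0) -> bool:
    """True if the score is 1.5 SD or more below the mean (assumed 'at or below' rule)."""
    z = (standard_score - scale_mean) / scale_sd
    return z <= CUTOFF_SD


print(cutoff_standard_score())               # 77.5
print(f"{NormalDist().cdf(CUTOFF_SD):.1%}")  # 6.7%  (share of a normal population below the cutoff)
print(at_or_below_cutoff(75))                # True
print(at_or_below_cutoff(80))                # False
```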
Percentiles  Percentiles are converted scores which indicate the percentage of children in a norm group who scored at or below the test taker's score. As noted in Figure 2, a child who scores at the 50th percentile is scoring at the mean. A percentile should not be interpreted as percentage correct. For example, a child who scores at the 65th percentile is scoring above the mean and performed as well as or better than 65% of the children in the norm group, while about 35% of those children scored higher.
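The difference between a percentile and percentage correct is easy to see with a small example. The sketch below uses an "at or below" definition of percentile rank (some tests count only scores strictly below, so manuals may differ) and a hypothetical norm group of 20 children on a hypothetical 40-item test.

```python
def percentile_rank(score: float, norm_scores: list[float]) -> float:
    """Percentage of the norm group scoring at or below the given score."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Hypothetical norm group: 20 children with raw scores 1 through 20.
norm_scores = list(range(1, 21))
items_on_test = 40  # hypothetical test length

raw_score = 13
print(percentile_rank(raw_score, norm_scores))  # 65.0  (65th percentile)
print(100.0 * raw_score / items_on_test)        # 32.5  (percent correct)
```

The same child is at the 65th percentile while answering only 32.5% of the items correctly, because the percentile describes standing relative to the norm group, not mastery of the test content.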
Look at Figure 2: Percentiles in the Resource Bank. A toddler was given a screening test. The toddler scored at the 10th percentile in motor, the 15th percentile in social, the 3rd percentile in language, and the 12th percentile in cognitive. If you look at Figure 2, you can see that this toddler probably failed the screening and needs to be referred for evaluation. The toddler is scoring significantly below same-age peers in every area. One can also see from the screening that this child's most significant delay is in language and that the child's relative strength is in the social area.
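One way to read the toddler's percentiles against the normal curve is to convert each one to standard deviation units. The sketch below is illustrative arithmetic only: it assumes the scores are normally distributed, and a screening result is a reason for referral, not an eligibility determination.

```python
from statistics import NormalDist

# Screening percentiles from the example above.
percentiles = {"motor": 10, "social": 15, "language": 3, "cognitive": 12}


def percentile_to_z(percentile: float) -> float:
    """Standard deviations from the mean for a given percentile, assuming a normal curve."""
    return NormalDist().inv_cdf(percentile / 100.0)


for domain, p in percentiles.items():
    print(f"{domain:<10} percentile {p:>2}  z = {percentile_to_z(p):+.2f}")
# motor      percentile 10  z = -1.28
# social     percentile 15  z = -1.04
# language   percentile  3  z = -1.88
# cognitive  percentile 12  z = -1.17

weakest = min(percentiles, key=percentiles.get)
strongest = max(percentiles, key=percentiles.get)
print(weakest, strongest)  # language social
```

Under these assumptions, every area falls more than one standard deviation below the mean, and language is the only area more than 1.5 standard deviations below it.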
