Standardized tests apply the same standards to questions, administration, scoring and interpretation. For a test to be standardized, all of these areas must be consistent. Standardized test scores are either norm-referenced, which compares students with their peers, or criterion-referenced, which measures how well students have met a standard. For effective evaluation, standardized tests need to have reliability and validity.


Consistency is integral to standardized testing. Questions on standardized tests are usually objective, such as multiple choice or true/false, which allows for easy and efficient computer grading. Although less common, short-answer or essay questions may be used, in which case all students receive the same prompt. Essay and short-answer questions require more time and staff to grade, and graders are trained to score consistently using a standardized rubric. Along with consistent questions, the administration, scoring and interpretation of the test must be standardized.


One style of standardized test is norm-referenced. These tests compare a student with all the other students who took the same test. Many states have used norm-referenced tests because they provide a clear ranking. For example, if a student receives a percentile score of 80, that means the student performed better than 80 percent of students taking that particular test. Many norm-referenced tests also provide a grade-equivalent score. A score of 6.5 means the student scored as well as the average student in the fifth month of sixth grade.
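The percentile arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_rank` and the sample scores are hypothetical, and it assumes the "percent of test-takers who scored strictly lower" definition, whereas published tests may handle tied scores differently.

```python
def percentile_rank(score, all_scores):
    """Return the percent of scores in all_scores that fall strictly below score."""
    if not all_scores:
        raise ValueError("all_scores must not be empty")
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


# Hypothetical scores for ten students who took the same test.
scores = [52, 61, 67, 70, 74, 78, 81, 85, 90, 96]

# Eight of the ten scores are below 90, so this student is at the 80th percentile.
rank = percentile_rank(90, scores)
print(rank)
```

Note that the result depends entirely on the comparison group: the same raw score of 90 would earn a different percentile against a different set of test-takers.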


Another style of standardized test is criterion-referenced. Criterion-referenced tests do not rank students against their peers; instead, they determine how well students meet a particular performance level or established standard. These tests report a raw score and a score summary. If a student scored 7 out of 10 on a criterion-referenced test, this would indicate 70 percent mastery of the material. Benchmark tests are criterion-referenced standardized tests used to evaluate how well students have mastered a particular performance standard.
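As a small illustration of the raw-score arithmetic, the following Python sketch converts a raw score into a percent-mastery figure. The function name `percent_mastery` is hypothetical, and the sketch assumes every item carries equal weight, which a real test may not do.

```python
def percent_mastery(raw_score, total_items):
    """Convert a raw score into percent mastery, assuming equally weighted items."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= raw_score <= total_items:
        raise ValueError("raw_score must be between 0 and total_items")
    return 100.0 * raw_score / total_items


# The example from the text: 7 correct answers out of 10 items.
mastery = percent_mastery(7, 10)
print(mastery)
```

Unlike a percentile, this figure does not change with how other students performed; it depends only on the student's own answers.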


A standardized test needs to be reliable. The term "reliability" refers to the consistency of a test's results: groups who take the test over time should receive results comparable to those of the sample group on which the test was normed. The less variance in test results over time, the higher the reliability of a standardized test. Makers of standardized tests try to construct tests that minimize measurement error. Standardized tests with major inconsistencies in test performance are not considered reliable measurements.
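One common way to put a number on this kind of consistency is a test-retest correlation: the same students take the test twice, and the two sets of scores are compared with a Pearson correlation coefficient. The text does not name a specific statistic, so the Python sketch below is an assumed illustration rather than the method test makers necessarily use; the function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five students on two sittings of one test.
first_sitting = [70, 82, 65, 90, 77]
second_sitting = [72, 80, 66, 91, 75]

# A value close to 1 suggests students were ranked consistently across sittings.
r = pearson_r(first_sitting, second_sitting)
print(round(r, 2))
```

A coefficient near 1 indicates little variation between sittings, while a value near 0 would point to the kind of major inconsistency that makes a test an unreliable measurement.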


The term "validity" refers to whether a test measures what it is intended to measure, and in standardized testing it depends heavily on how students actually take the test. A standardized test could be reliable in construction, but its results are not valid if there are discrepancies in how it is administered. For results to be valid, the instructions must be followed exactly. For example, if the instructions state the test should be administered in a two-hour block, a proctor who arbitrarily gave test-takers three hours would invalidate the results. Also, the conditions for the test should be similar to those for the sample norm group. If an air conditioner broke down and students took the test in a 90-degree classroom, the validity of the results would be questionable.