Part 15
Finally, Dr. Cole disputed the claims of proponents of the SFSTs that the studies regarding them have been published in peer review journals. The 1977 and 1981 field studies were published in technical reports by NHTSA, but those reports excluded the "methods and results" sections because they were thought to be too lengthy. Id. at 543. Cole concluded "[i]t is
difficult to see how the NHTSA could claim that the FST is accepted in the scientific community, when results of studies on the validation of the FST have never appeared in a scientific peer reviewed journal, which is a basic requirement for acceptance by the scientific community." Id. Cole concluded:
Because of its widespread use, the FST battery has been assumed to be a reliable and valid predictor of driving impairment. NHTSA has done little to dispel that assumption. Law enforcement cannot be blamed for its use
of the FST battery. Training documents refer to NHTSA reports and provide what appears to be supporting evidence for the validity of the FST battery. In addition, there is little doubt that individuals who have high BAC levels will have difficulty in performing the FST battery. However, what the law enforcement community and the courts fail to realize is that the FST battery may mislead the officer on the road to incorrectly judge individuals who are not impaired. The FST battery to be valid must discriminate accurately between the impaired and non-impaired driver.
critical reviewer to determine whether the officer's arrest decision was based on the SFSTs alone, or on the totality of the information available to the officer, including the results of the breath test. Thus, the studies were not controlled, and there were multiple variables that affected the ultimate decision. He concluded, therefore, that these "validation" studies were scientifically unacceptable.
NHTSA's own research on that issue . . . has not been subjected to peer review by the scientific community. In addition, a careful reading of the reports themselves provides support for the inadequacy of the FST battery. The reports include low reliability estimates for the tests, false arrest rates between 32 and 46.5 percent, and a field test of the FST that was flawed because the officers in many cases had breathalyzer results at the time of the arrest. NHTSA clearly ignored the printed recommendations of its own researchers in conducting that field study.
Id. at 546. (Emphasis in original).
Horn also introduced the affidavit of Joel P. Wiesen, Ph.D. Dr. Wiesen is an industrial psychologist with special expertise in experimental psychology, psychometrics and statistics. His experience includes more than ten years working with the Commonwealth of Massachusetts developing civil service examinations and an equal number of years as an independent consultant in the area of test development and validation. In addition, he is a published author of a mechanical aptitude test used nationwide. Although he is most familiar with written tests, he does have experience in the development of human performance tests. Def's. Reply Memo, Exh.6 at 1.
Dr. Wiesen reviewed the NHTSA 1977 Report, the 1981 Final Report, the 1983 Field Evaluation, the 1995 Colorado Validation Study, the undated Florida Validation Study, and the NHTSA student manual for the SFSTs. He was highly critical of these
studies, as the following summary illustrates:22
22 The information reported in the chart is found in Def's. Reply Memo, Ex. 6 at 1-13.
| 1977 Report | 1981 Report (Lab & Field Phases) | 1983 Report | 1995 Colo. Study | Fla. Study |
| --- | --- | --- | --- | --- |
| 1. In the lab the HGN test was administered using a chin rest which facilitated making HGN observations. This was not done in the field. | 1. Serious flaws include 20% false positive evaluations of intox.; very high error rates in reliability if using SFSTs to predict BAC. | 1. Report seriously flawed, does not meet professional standards of testing community. | 1. Report describes results of impaired driving arrests from seven Colorado law enforcement organizations. Report too incomplete to draw any conclusions about the validity of the test. | 1. Report too incomplete to permit meaningful evaluation. |
| 2. A single set of data was used to determine criterion score and to evaluate accuracy of test, which artificially inflates estimate of accuracy. | 2. HGN test affected by time of day, no adjustment in scoring. | 2. Failure to monitor data collection by officers. Cannot tell if decisions based on SFSTs or prelim. breath test (PBT). | 2. Methodology results and data sections of report are missing. | 2. Methodology not described, and data regarding methodology not provided in report. |
| 3. Tests are not age & gender neutral, and age/gender differences can affect ability to perform SFSTs. | 3. Test/retest reliability rates very low. | 3. Arrest decisions made on PBT results as well as SFSTs. Not possible to tell reliability of SFSTs. | 3. Data generated by "volunteer" officers--suggesting possible bias. | 3. Data incompletely described. |
| 4. In lab tests officers were monitored to insure correct performance of tests, not done in field. | 4. Report states testing officers did not necessarily base decisions on results of SFSTs, making validity suspect. | 4. Authors fail to report the data from N.C. Test site–over 25% of data for whole test. | 4. No monitoring of data collection to verify reporting methodology. Officers merely reported results. | |
| 5. Test results differ in statistically significant respects depending on time of day that HGN test was performed, yet test scoring did not account for difference in time of day test was administered. | 5. Authors admit field test data not appropriate for statistical significance testing, and could be biased. | 5. No statistical tests conducted on data. | 5. Results unclear, particularly because two different arrest standards used (one for intoxication, another for impaired) | |
| 6. The study was not peer reviewed, and would not have been accepted if offered. | 6. High error rates. 28.6% of subjects with "legal" BAC arrested, and 50% of subjects w/ BAC > 0.10 not arrested. | 6. SFSTs not administered in standard fashion. | | |
| 7. Officers selected for study not representative of police officers across the board. | 7. Authors acknowledge "extreme caution" needed in analyzing data collected in study. Accuracy of data suspect. | | | |
| | 8. Authors reported that in field some officers forgot or ignored standardized procedure to administer SFSTs. | | | |
Dr. Wiesen concluded his evaluation of the SFST reports with the following observation:
the studies give only a general indication of the level of potential validity of the tests as described in the NHTSA manual . . . . Rather than the five studies supporting each other, they evaluate somewhat different combinations of test content and test scoring. The differences are large enough to change the validity and accuracy of the tests. The older studies are probably less germane, due to the changes in test content and scoring over time. The reports for the newer studies are grossly inadequate. Given this, and in light of the specific critiques above (which are not exhaustive), I can only conclude that the field sobriety tests do not meet reasonable professional and scientific standards.
Id. at 12-13.
Harold P. Brull testified on behalf of Horn and supplied an affidavit as well. Mr. Brull is a licensed psychologist with many years' experience consulting in connection with the design and implementation of procedures to measure human attributes, especially in employment settings. He has designed and evaluated tests and procedures measuring human characteristics for over twenty years. Def's. Reply Memo, Exh. 5 at 2.
Mr. Brull reviewed the NHTSA 1977 Report, the 1981 Final Report, the 1983 Field Evaluation, the 1995 Colorado Validation Study, the Florida Validation Study, and the NHTSA officer training manual. Among his general observations of these materials was the opinion that there was a complete absence of