Kappa values for the assessed standards between the two primary raters ranged from 0.72 to 0.92 (“substantial” to “almost perfect agreement”).

Discussion
The increasing number, cost, and sophistication of diagnostic tests and their significant impact on patient care require that new tests undergo rigorous assessment before they are adopted into clinical practice. The present investigation demonstrates that most of the evaluated studies that assess diagnostic tests pertinent to pulmonary conditions do not conform to standard recommended methods for diagnostic test research. We observed in the 41 study articles major departures from standards for study design, reporting of diagnostic accuracy, and data analysis.
The observed flaws in study design were notable considering the importance of a rigorous investigational method for limiting bias and promoting the validity of study results. Twelve major standards pertaining to the validity, reproducibility, or applicability of the study design were applicable to all of the reviewed articles. Only 2 articles fulfilled all 12 of these standards, and the median number of standards fulfilled was 6.
Of these methodologic standards, proper selection and application of a reference (“gold”) standard in diagnostic test research are fundamentally important in accurately measuring a new test’s discriminative properties. The reference standard should be the most definitive method for verifying the presence of the target disease considering the relative accuracy and feasibility of the alternative reference standards available. Uncertainty regarding the adequacy of the reference standard presents major problems for assessing new diagnostic tests because it introduces considerable bias and inflates the estimates of the evaluated test’s diagnostic accuracy in a manner that is difficult to adjust by post hoc techniques. Our review of the study articles indicated that only 76% of studies clearly described the reference standard used. Moreover, only 63% used a reference standard that could be considered definitive compared with other available tests that would have been feasible considering the articles’ study designs. Deficiencies in the reference standard employed by the evaluated articles could have been offset by sufficient patient follow-up to ensure that patients were not misclassified for their disease state. Most of the articles, however, did not provide information on patient follow-up.
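The dependence of accuracy estimates on correct disease classification can be illustrated with a minimal sketch. The counts below are hypothetical and are not drawn from any of the reviewed articles. The helper names `accuracy` and `reclassify` are introduced here for illustration only. The sketch computes sensitivity and specificity from a 2x2 table against the reference standard, then recomputes them after follow-up reveals that some patients labeled disease-free by the reference standard actually had the target disease.

```python
def accuracy(tp, fp, fn, tn):
    """Return (sensitivity, specificity) of the index test,
    judged against the reference standard's classification."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def reclassify(counts, fp_to_tp, tn_to_fn):
    """Move patients whom follow-up shows to be diseased out of the
    reference-negative cells and into the reference-positive cells."""
    updated = dict(counts)
    updated["fp"] -= fp_to_tp  # index-test positive, actually diseased
    updated["tp"] += fp_to_tp
    updated["tn"] -= tn_to_fn  # index-test negative, actually diseased
    updated["fn"] += tn_to_fn
    return updated


# Hypothetical 2x2 counts (assumption, for illustration only).
initial = {"tp": 40, "fp": 20, "fn": 10, "tn": 130}
sens_before, spec_before = accuracy(**initial)

# Assumed follow-up finding: 10 reference-negative patients had the disease,
# 8 of whom were index-test positive and 2 index-test negative.
revised = reclassify(initial, fp_to_tp=8, tn_to_fn=2)
sens_after, spec_after = accuracy(**revised)

print(f"before follow-up: sensitivity={sens_before:.2f}, specificity={spec_before:.2f}")
print(f"after follow-up:  sensitivity={sens_after:.2f}, specificity={spec_after:.2f}")
```

In this made-up example, specificity moves from about 0.87 to about 0.91 while sensitivity stays at 0.80. The direction and size of the shift depend entirely on which patients were misclassified, which is why follow-up information matters when the reference standard is not definitive.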