Statistical analyses

For comparison with another sample of prescription claims data, DINs were grouped into the categories defined by the American Hospital Formulary Service Pharmacologic-Therapeutic Classification (www.ashp.org).

The information on the prescription claim was compared with the electronic claim that was sent to the ODB. Coding errors and discrepancies between the prescription and the medications that were actually dispensed were evaluated. Because only coding errors could be identified, the sensitivity, specificity, and positive and negative predictive values could not be estimated. The true error rate in the ODB database was therefore estimated as a binomial proportion with a 95% CI.
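
As an illustration of the binomial estimate described above, the following minimal sketch computes an error proportion and a 95% CI. The text does not state which interval method was used; the Wilson score interval is assumed here purely for illustration, and the function name and the counts in the usage line are hypothetical, not study data.

```python
import math


def wilson_ci(errors, n, z=1.959963984540054):
    """Return (proportion, lower, upper) for a binomial proportion.

    Assumption: Wilson score interval; z defaults to the two-sided
    95% normal quantile. The source does not specify the method.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= errors <= n:
        raise ValueError("errors must lie between 0 and n")
    p = errors / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return p, max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts for illustration only: 10 coding errors in 1000 claims.
p, lower, upper = wilson_ci(10, 1000)
print(f"error rate = {p:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

An exact (Clopper-Pearson) interval would be an equally reasonable choice when the number of errors is very small.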

Logistic regression was used to test the association between coding errors and the location, owner affiliation, and productivity of each pharmacy. Because the number of coding errors was small and standard logistic regression can underestimate the probability of rare events, the models were also fitted using rare events logistic regression (http://gking.harvard.edu/preprints.shtml#0s). In this case, ‘rare’ refers to binary dependent variables with dozens to thousands of times fewer events than ‘nonevents’. A pharmacy’s productivity was defined as the pharmacy’s annual number of prescriptions divided by the annual cumulative total hours worked by its pharmacists and assistants.
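
The productivity definition can be written out as a short calculation. This is only a sketch of the ratio as described in the text; the function name, the parameter names, and the figures in the usage line are hypothetical.

```python
def pharmacy_productivity(annual_prescriptions, pharmacist_hours, assistant_hours):
    """Prescriptions per hour worked, following the definition in the text.

    Assumption: hours for pharmacists and assistants are recorded
    separately and summed to give the annual cumulative total.
    """
    total_hours = pharmacist_hours + assistant_hours
    if total_hours <= 0:
        raise ValueError("total hours worked must be positive")
    return annual_prescriptions / total_hours


# Hypothetical pharmacy: 50,000 prescriptions, 4,000 pharmacist hours
# and 6,000 assistant hours in one year.
print(pharmacy_productivity(50_000, 4_000, 6_000))
```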