Verbal autopsy procedures are widely used for estimating cause-specific mortality in areas
without medical death certification. Data on symptoms reported by caregivers along
with the cause of death are collected from a medical facility, and the cause-of-death distribution
is estimated in the population where only symptom data are available. Current
approaches analyze only one cause at a time, involve assumptions judged difficult or impossible
to satisfy, and require expensive, time-consuming, or unreliable physician reviews,
expert algorithms, or parametric statistical models. By generalizing current approaches
to analyze multiple causes, we show how most of the difficult assumptions underlying existing
methods can be dropped. These generalizations also make physician review, expert
algorithms, and parametric statistical assumptions unnecessary. With theoretical results
and empirical analyses of data from China and Tanzania, we illustrate the accuracy of
this approach. While no method of analyzing verbal autopsy data, including the more
computationally intensive approach offered here, can give accurate estimates in all circumstances,
the procedure offered is conceptually simpler, less expensive, more general, no less
replicable, and easier to use in practice than existing approaches. We also show
how our focus on estimating aggregate proportions, which are the quantities of primary
interest in verbal autopsy studies, may greatly reduce the assumptions necessary for, and
thus improve the performance of, many individual classifiers in this and other areas. As a
companion to this paper, we also offer easy-to-use software that implements the methods
discussed herein.