How to Update a Web Site Using Git

Here's a markdown file and a PDF describing how to set up git to automatically update a web site.


New York Times Article on Verbal Autopsy

This is a nice article on death registration and verbal autopsy - a very high-level overview for lay readers!

Although the openVA Team is not mentioned, we work closely with many of the organizations and people mentioned or quoted in this article.

A Door-to-Door Effort to Find Out Who Died Helps Low-Income Countries Aid the Living


Papers in Annual Meeting of the Population Association of America (PAA) 2022


New in Global Health Action

Chandramohan, D., E. Fottrell, J. Leitao, E. Nichols, S. J. CLARK, C. Alsokhn, D. C. Munoz, C. AbouZahr, A. Di Pasquale, R. Mswia, E. Choi, F. Baiden, J. Thomas, I. Lyatuu, Z. Li, P. Larbi-Debrah, Y. Chu, S. Cheburet, O. Sankoh, A. M. Badr, D. M. Fat, P. Setel, R. Jakob, D. de Savigny (2022). Estimating Causes Of Death Where There Is No Medical Certification: Evolution And State of The Art Of Verbal Autopsy. Global Health Action. [ DOI ]


Over the past 70 years, significant advances have been made in determining the causes of death in populations not served by official medical certification of cause at the time of death using a technique known as Verbal Autopsy (VA). VA involves an interview of the family or caregivers of the deceased after a suitable bereavement interval about the circumstances, signs and symptoms of the deceased in the period leading to death. The VA interview data are then interpreted by physicians or, more recently, computer algorithms, to assign a probable cause of death. VA was originally developed and applied in field research settings. This paper traces the evolution of VA methods with special emphasis on the World Health Organization’s (WHO) efforts to standardize VA instruments and methods for expanded use in routine health information and vital statistics systems in low- and middle-income countries (LMICs). These advances in VA methods are culminating this year with the release of the 2022 WHO Standard Verbal Autopsy (VA) Toolkit. This paper highlights the many contributions the late Professor Peter Byass made to the current VA standards and methods, most notably, the development of InterVA, the most commonly used automated computer algorithm for interpreting data collected in the WHO standard instruments, and the capacity building in LMICs that he promoted. This paper also provides an overview of the methods used to improve the current WHO VA standards, a catalogue of the changes and improvements in the instruments, and a mapping of current applications of the WHO VA standard approach in LMICs. It also provides access to tools and guidance needed for VA implementation in Civil Registration and Vital Statistics Systems at scale.


New in Global Health Action

Herbst, K., S. Juvekar, M. Jasseh, Y. Berhane, N. T. K. Chuk, J. Seeley, O. Sankoh, S. J. CLARK and M. Collinson (2022). Health And Demographic Surveillance Systems In Low- And Middle-income Countries: History, State of The Art And Future Prospects. Global Health Action. [ DOI ]


Health and Demographic Surveillance Systems (HDSS) have been developed in several low- and middle-income countries (LMICs) in Africa and Asia. This paper reviews their history, state of the art and future potential and highlights substantial areas of contribution by the late Professor Peter Byass.

Historically, HDSS appeared in the second half of the twentieth century, responding to a dearth of accurate population data in poorly resourced settings to contextualise the study of interventions to improve health and well-being. The progress of the development of this network is described starting with Pholela, and progressing through Gwembe, Balabgarh, Niakhar, Matlab, Navrongo, Agincourt, Farafenni, and Butajira, and the emergence of the INDEPTH Network in the early 1990s.

The paper describes the HDSS methodology, data, strengths, and limitations. The strengths are particularly their temporal coverage, detail, dense linkage, and the fact that they exist in chronically under-documented populations in LMICs where HDSS sites operate. The main limitations are generalisability to a national population and a potential Hawthorne effect, whereby the project itself may have changed characteristics of the population.

The future will include advances in HDSS data harmonisation, accessibility, and protection. Key applications of the data are to validate and assess bias in other datasets. A strong collaboration between a national HDSS network and the national statistics office is modelled in South Africa and Sierra Leone, and it is possible that other low- to middle-income countries will see the benefit and take this approach.


New in BMC Public Health

Houle, B., C.W. Kabudula, A.M. Tilstra, S.A. Mojola, E. Schatz, S.J. CLARK, N. Angotti, F.X. Gómez-Olivé, and J. Menken (2022). Twin epidemics: The Effects of HIV and Systolic Blood Pressure on Mortality Risk in Rural South Africa, 2010-2019. BMC Public Health. [ DOI ]


Background. Sub-Saharan African settings are experiencing dual epidemics of HIV and hypertension. We investigate effects of each condition on mortality and further examine whether HIV and hypertension interact in determining mortality.

Methods. Data come from the 2010 Ha Nakekela population-based survey of individuals ages 40 and older (1,802 women; 1,107 men) nested in the Agincourt Health and socio-Demographic Surveillance System in rural South Africa, which provides mortality follow-up from population surveillance until mid-2019. Using discrete-time event history models stratified by sex, we assessed differential mortality risks according to baseline measures of HIV infection, HIV-1 RNA viral load, and systolic blood pressure.

Results. During the 8-year follow-up period, mortality was high (477 deaths). 37% of men (mortality rate 987.53/100,000, 95% CI: 986.26 to 988.79) and 25% of women (mortality rate 937.28/100,000, 95% CI: 899.7 to 974.88) died. Over a quarter of participants were living with HIV (PLWH) at baseline, over 50% of whom had unsuppressed viral loads. The share of the population with a systolic blood pressure of 140 mm Hg or higher increased from 24% at ages 40-59 to 50% at ages 75-plus and was generally higher for those not living with HIV compared to PLWH. Men and women with unsuppressed viral load had elevated mortality risks (men: adjusted odds ratio (aOR) 3.23, 95% CI: 2.21 to 4.71; women: aOR 2.05, 95% CI: 1.27 to 3.30). There was a weak, non-linear relationship between systolic blood pressure and higher mortality risk. We found no significant interaction between systolic blood pressure and HIV status for either men or women (p>0.05).

Conclusions. Our results indicate that HIV and elevated blood pressure are acting as separate, non-interacting epidemics affecting high proportions of the older adult population. PLWH with unsuppressed viral load were at higher mortality risk compared to those uninfected. Systolic blood pressure was a mortality risk factor independent of HIV status. As antiretroviral therapy becomes more widespread, further longitudinal follow-up is needed to understand how the dynamics of increased longevity and multimorbidity among people living with both HIV and high blood pressure, as well as the emergence of COVID-19, may alter these patterns.


I discovered something new (to me) that will be useful. Each element of the vector defined by the perpendicular projection of a vector \( \vec{p} = \{p_1,p_2\} \) onto the line \( y=x \) is the arithmetic mean of \( p_1 \) and \( p_2 \): \begin{align} \mbox{proj}_{\hat{e}}\vec{p} &= \frac{\vec{p} \cdot \hat{e}}{\hat{e} \cdot \hat{e}} \hat{e} \\ &= \frac{p_1e + p_2e}{e^2 + e^2} \hat{e} \\ &= \frac{p_1 + p_2}{2e} \hat{e} \\ &= \left\{\frac{p_1 + p_2}{2},\frac{p_1 + p_2}{2}\right\} \end{align} where '\( \cdot \)' is the dot product, \(\hat{e} = \{e_1,e_2\} \) is the unit vector in the direction of \( y=x \), and \(e = e_1 = e_2 \).
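A quick numerical check of this result, as a minimal sketch: the helper name `project_onto_line` is my own (hypothetical), NumPy is assumed to be available, and the test point (2, 4.5) is borrowed from the example point used in the next note.

```python
import numpy as np


def project_onto_line(p, direction):
    """Perpendicular projection of p onto the line through the origin along `direction`."""
    p = np.asarray(p, dtype=float)
    d = np.asarray(direction, dtype=float)
    # (p . d) / (d . d) * d -- the same formula as the first line of the derivation
    return (p @ d) / (d @ d) * d


p = np.array([2.0, 4.5])
e_hat = np.array([1.0, 1.0]) / np.sqrt(2.0)  # unit vector in the direction of y = x

proj = project_onto_line(p, e_hat)
mean = p.mean()  # arithmetic mean of p_1 and p_2

# Both elements of the projection should match the mean, up to rounding error.
print(proj, mean)
```

Because the formula divides by the squared length of the direction vector, the same answer comes out for any vector along \( y=x \), not only the unit vector.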

I noticed something else interesting (also new to me). The projections of a point onto all lines through the origin form a circle. In the image below, the red point (2,4.5) is projected onto a set of lines through the origin; each projection is marked with a black dot and connected to the point with a red line.
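The circle can be checked numerically. The sketch below assumes (my addition, not stated above) that the circle is centered at the midpoint between the origin and the point, with radius equal to half the point's distance from the origin; this follows because the origin, the point, and each projection form a right angle at the projection. Variable names are illustrative and NumPy is assumed.

```python
import numpy as np

p = np.array([2.0, 4.5])  # the red point from the example

# Unit vectors for lines through the origin at angles 0..180 degrees.
angles = np.linspace(0.0, np.pi, 181)
units = np.column_stack([np.cos(angles), np.sin(angles)])

# Foot of the perpendicular from p onto each line: (p . u) u
feet = (units @ p)[:, None] * units

# Assumed circle: center at p / 2, radius |p| / 2.
center = p / 2.0
radius = np.linalg.norm(p) / 2.0

# Distance of every projected point from the assumed center.
distances = np.linalg.norm(feet - center, axis=1)
print(np.allclose(distances, radius))
```

If the assumption is right, every distance equals the radius, so the projected points all lie on one circle passing through both the origin and the point itself.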


Published in Annals of Epidemiology

Norris Turner, A., D. Kline, A. Norris, W.G. Phillips, E. Root, J. Wakefield, Z. Li, S. Lemeshow, M. Spahnie, A. Luff, Y. Chu, M.K. Francis, M. Gallo, P. Chakraborty, M. Lindstrom, G. Lozanski, W. Miller, S.J. CLARK (2021). Prevalence of Current and Past COVID-19 in Ohio Adults. Annals of Epidemiology. [ DOI ]


Purpose. To estimate the prevalence of current and past COVID-19 in Ohio adults.

Methods. We used stratified, probability-proportionate-to-size cluster sampling. During July 2020, we enrolled 727 randomly-sampled adult English- and Spanish-speaking participants through a household survey. Participants provided nasopharyngeal swabs and blood samples to detect current and past COVID-19. We used Bayesian latent class models with multilevel regression and poststratification to calculate the adjusted prevalence of current and past COVID-19. We accounted for the potential effects of non–ignorable non–response bias.

Results. The estimated statewide prevalence of current COVID-19 was 0.9% (95% credible interval: 0.1%–2.0%), corresponding to ∼85,000 prevalent infections (95% credible interval: 6,300–177,000) in Ohio adults during the study period. The estimated statewide prevalence of past COVID-19 was 1.3% (95% credible interval: 0.2%–2.7%), corresponding to ∼118,000 Ohio adults (95% credible interval: 22,000–240,000). Estimates did not change meaningfully due to non–response bias.

Conclusions. Total COVID-19 cases in Ohio in July 2020 were approximately 3.5 times as high as diagnosed cases. The lack of broad COVID-19 screening in the United States early in the pandemic resulted in a paucity of population-representative prevalence data, limiting the ability to measure the effects of statewide control efforts.


In July 2020 a large group of colleagues at The Ohio State University collaborated with the Ohio State Department of Health to conduct a probability-based sample survey representative of adults living in Ohio in order to estimate state-level CV19 prevalence of current and past infections.

This article describes the results of the survey for a public health audience. We developed a new Bayesian poststratification method to produce estimates from the data - described in PNAS.

Abigail Norris Turner led the overall study conducted by a large group of collaborators at The Ohio State University and Ohio State Department of Health.


New working paper on arXiv

Li, Z. R., Z. Wu, I. Chen, and S. J. CLARK (2021). Bayesian Nested Latent Class Models for Cause-of-Death Assignment using Verbal Autopsies Across Multiple Domains. arXiv Preprint arXiv:2112.12186. [ PDF ]


Understanding cause-specific mortality rates is crucial for monitoring population health and designing public health interventions. Worldwide, two-thirds of deaths do not have a cause assigned. Verbal autopsy (VA) is a well-established tool to collect information describing deaths outside of hospitals by conducting surveys with caregivers of a deceased person. It is routinely implemented in many low- and middle-income countries. Statistical algorithms to assign cause of death using VAs are typically vulnerable to the distribution shift between the data used to train the model and the target population. This presents a major challenge for analyzing VAs as labeled data are usually unavailable in the target population. This article proposes a Latent Class model framework for VA data (LCVA) that jointly models VAs collected over multiple heterogeneous domains, assigns cause of death for out-of-domain observations, and estimates cause-specific mortality fractions for a new domain. We introduce a parsimonious representation of the joint distribution of the collected symptoms using nested latent class models and develop an efficient algorithm for posterior inference. We demonstrate that LCVA outperforms existing methods in predictive performance and scalability. Supplementary materials for this article and the R package to implement the model are available online.


This appeared on my Twitter feed recently: Demography Abandons Its Core by Ron Lee in 2001. Unfortunately this is still highly relevant, and those of us interested in formal demography as a recognizable field need to do something! The Formal Demography Working Group is being formed as a vehicle for doing that - please join if you are interested!


Today Mary Shenk and I are discussants at the 16th De Jong Lecture in Social Demography at Penn State. Hans-Peter Kohler is the featured speaker. Link to slides.


For the past 6-7 years my colleagues and I who work on verbal autopsy methods have been developing new statistical methods to automate identification of a cause of death using verbal autopsy records. Along the way we have developed a range of open source software tools to ensure that the methods are transparent and available to anyone who wants to use them. Recently we submitted a paper and posted a preprint on the openVA Toolkit. This is a suite of open source software that can be used to apply a variety of algorithms to verbal autopsy data. Additional software that works with openVA - e.g. the openVA Pipeline that fully automates cause-coding in CRVS settings - and links to the GitHub repositories where software is maintained are available at


Published open-access in PNAS:

Kline, D., Z. Li, Y. Chu, J. Wakefield, W.C. Miller, A. Norris Turner, and S. J. CLARK (2021). Estimating Seroprevalence of SARS-CoV-2 in Ohio: A Bayesian Multilevel Poststratification Approach with Multiple Diagnostic Tests. Proceedings of the National Academy of Sciences 118(26), e2023947118. [ DOI ]


Globally, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected more than 59 million people and killed more than 1.39 million. Designing and monitoring interventions to slow and stop the spread of the virus require knowledge of how many people have been and are currently infected, where they live, and how they interact. The first step is an accurate assessment of the population prevalence of past infections. There are very few population-representative prevalence studies of SARS-CoV-2 infections, and only two states in the United States—Indiana and Connecticut—have reported probability-based sample surveys that characterize statewide prevalence of SARS-CoV-2. One of the difficulties is the fact that tests to detect and characterize SARS-CoV-2 coronavirus antibodies are new, are not well characterized, and generally function poorly. During July 2020, a survey representing all adults in the state of Ohio in the United States collected serum samples and information on protective behavior related to SARS-CoV-2 and coronavirus disease 2019 (COVID-19). Several features of the survey make it difficult to estimate past prevalence: 1) a low response rate; 2) a very low number of positive cases; and 3) the fact that multiple poor-quality serological tests were used to detect SARS-CoV-2 antibodies. We describe a Bayesian approach for analyzing the biomarker data that simultaneously addresses these challenges and characterizes the potential effect of selective response. The model does not require survey sample weights; accounts for multiple imperfect antibody test results; and characterizes uncertainty related to the sample survey and the multiple imperfect, potentially correlated tests.


In July 2020 a large group of colleagues at The Ohio State University collaborated with the Ohio State Department of Health to conduct a probability-based sample survey representative of adults living in Ohio in order to estimate state-level CV19 prevalence of current and past infections. Conducting the survey at that time presented many challenges, including a large non-response rate, an array of tests whose performance characteristics were poorly understood (we used all that we could), and very few positive results from any test. These issues, and others including the sampling design, presented a particular challenge for analysis and led us to develop a new Bayesian/poststratification approach to estimate state-wide prevalence.

There are still few representative sample surveys of CV19 biomarkers. Most other approaches suffer from the possibility of very large, consequential bias. This paper should be useful for anyone analyzing the results from a similar survey. Although we cannot share the data easily, we have made R code available to replicate the analysis: Bayes Prevalence.

Abigail Norris Turner led the overall study conducted by a large group of collaborators at The Ohio State University and Ohio State Department of Health.

In a true team effort, Dave Kline, Richard Li, Yue Chu, Jon Wakefield, Bill Miller, Abigail Norris Turner, and I conducted the analysis and developed the overall approach. It was an enjoyable and productive experience working with this team.


Today I discovered that the OSU College of Arts and Sciences web page has a write-up of the Joan Huber award recipients for 2021, including me.


APHRC - the African Population Health Research Center - in Nairobi is seeking a consultant demographer for six months to lead the redesign of the Nairobi Urban Health and Demographic Surveillance System Site (NUHDSS). Full details available here.


Today and tomorrow the openVA Team is presenting a training workshop for the CHAMPS project and our Data for Health Initiative colleagues in Thailand.


A long-in-the-making collaboration with UNICEF produced its first non-academic product recently: Subnational Under-five Mortality Estimates, 1990–2019. This work grew out of a small collaboration with Jon Wakefield at the University of Washington, see small-area estimates. Jon grew the group working on it and together with Jessica Godwin saw it through to this. Congratulations to everyone involved!


I gave a talk today at the 2021 'Berlin Demography Days': Global Population Studies in the 21st Century: Priorities Challenges - Mortality.



I learned today that the Social and Behavioral Sciences division of the College of Arts and Sciences at The Ohio State University has chosen me as a recipient of the Joan N. Huber Faculty Fellow Award. A brief description here.

New working paper on arXiv

I posted a slightly edited version of a paper titled "Health and Demographic Surveillance Systems and the 2030 Agenda: Sustainable Development Goals" on arXiv. This paper was invited at a UN Population Division experts' group meeting titled 'Strengthening the Demographic Evidence Base for the Post-2015 Development Agenda' that happened in New York, USA October 5-6, 2015.


The health and demographic surveillance system (HDSS) is an old method for intensively monitoring a population to assess the effects of healthcare or other population-level interventions - often clinical trials. The strengths of HDSS include very detailed descriptions of whole populations with frequent updates. This often provides long time series of accurate population and health indicators for the HDSS study population. The primary weakness of HDSS is that the data describe only the HDSS study population and cannot be generalized beyond that.

The 2030 agenda is the ecosystem of activities - many including population-level monitoring - that relate to the United Nations (UN) Sustainable Development Goals (SDG). With respect to the 2030 agenda, HDSS can contribute by: (1) continuing to conduct cause-and-effect studies; (2) contributing to data triangulation or amalgamation initiatives; (3) characterizing the bias in and calibrating big data; and (4) contributing more to the rapid training of data-oriented professionals, especially in the population and health fields.


Commentary in PNAS: Monitoring epidemics: Lessons from measuring population prevalence of the coronavirus with Abigail Norris Turner. We highlight the need to improve response rates and to prepare a robust measurement capability to be ready for the next pandemic. DOI.


Article out today in PLoS One

Linking the timing of a mother's and child's death: Comparative evidence from two rural South African population-based surveillance studies, 2000–2015 by Brian Houle, Chodziwadziwa W. Kabudula, Alan Stein, Dickman Gareta, Kobus Herbst, and Samuel J. Clark. DOI.


Background. The effect of the period before a mother's death on child survival has been assessed in only a few studies. We conducted a comparative investigation of the effect of the timing of a mother's death on child survival up to age five years in rural South Africa.

Methods. We used discrete time survival analysis on data from two HIV-endemic population surveillance sites (2000–2015) to estimate a child's risk of dying before and after their mother's death. We tested if this relationship varied between sites and by availability of antiretroviral therapy (ART). We assessed if related adults in the household altered the effect of a mother's death on child survival.

Findings. 3,618 children died from 2000–2015. The probability of a child dying began to increase in the 7–11 months prior to the mother's death and increased markedly in the 3 months before (2000–2003 relative risk = 22.2, 95% CI = 14.2–34.6) and 3 months following her death (2000–2003 RR = 20.1; CI = 10.3–39.4). This increased risk pattern was evident at both sites. The pattern attenuated with ART availability but remained even with availability at both sites. The father and maternal grandmother in the household lowered children's mortality risk independent of the association between timing of mother and child mortality.

Conclusions. The persistence of elevated mortality risk both before and after the mother's death for children of different ages suggests that absence of maternal care and abrupt breastfeeding cessation might be crucial risk factors. Formative research is needed to understand the circumstances for children when a mother is very ill or dies, and behavioral and other risk factors that increase both the mother and child's risk of dying. Identifying families when a mother is very ill and implementing training and support strategies for other members of the household are urgently needed to reduce preventable child mortality.


There will be occasional updates here. For now, the news is that I finally finished this web site after a year-long interruption by COVID-19. It's up to date and has all the basic content I had planned. The site is written in plain HTML with a simple cascading style sheet. It's straight from 1995, but it's super easy to modify, maintain, and augment with a text editor. With a shell script to upload everything, keeping the site up to date will be easy; it does not require any expensive or time-consuming-to-learn software, and it does not rely on third parties to fix things. :-)