Abortion Under-Reporting on the National Survey of Family Growth in Two Plots
After more than a year, I am still trying to get my analysis work started.
The subtitle of this blog is “a blog of independent thinking and evidence-based inquiry,” which raises the question, where is the evidence-based inquiry?
This blog was supposed to feature my research and analysis in reproductive responsibility. However, 2023 came and went and I have yet to show anything for my work.
The issue is that a primary research variable for inquiry into reproductive responsibility is the prevalence of unintended pregnancies, and my primary instrument for research and analysis, the National Survey of Family Growth (NSFG), suffers from a large amount of abortion under-reporting (Lindberg et al. 2020; Desai et al. 2021; “Appendix 2: Topic-Specific Notes for 2017-2019, NSFG User’s Guide” 2021).
Considering that a large proportion of unintended pregnancies end in induced abortion, if I just launched into my analysis without addressing the abortion under-reporting, then my statistics would greatly under-count the number of unintended pregnancies.
I am working on a fix for the abortion under-reporting using a standard approach in survey statistics: the analysis weights of the survey response data set are adjusted so that estimates based on the data set match values known from external sources. This is often called “post-stratification,” “calibration,” or “raking.” [1]
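To make the idea concrete, here is a minimal sketch of the simplest version, post-stratification on a single variable. It is not the procedure I am actually using for the NSFG. The function name, the age strata, and all of the numbers are made up for illustration.

```python
from collections import defaultdict


def post_stratify(weights, strata, known_totals):
    """Scale weights within each stratum so that the weighted total of
    each stratum equals the externally known total for that stratum."""
    current = defaultdict(float)
    for w, s in zip(weights, strata):
        current[s] += w
    return [w * known_totals[s] / current[s] for w, s in zip(weights, strata)]


# Toy data (hypothetical): four respondents in two age strata.
weights = [100.0, 300.0, 200.0, 200.0]
strata = ["15-29", "15-29", "30-44", "30-44"]

# Hypothetical external counts, standing in for a census-style source.
known_totals = {"15-29": 500.0, "30-44": 300.0}

adjusted = post_stratify(weights, strata, known_totals)
```

Each weight is multiplied by the ratio of the known total to the current weighted total of its stratum, so the relative weights of respondents within a stratum are preserved.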
My main frustration is that I underestimated how much time this weights adjustment process would take and overestimated how much time I would have to work on it while simultaneously working a full-time job. Here I am more than a year later, and I have yet to actually start my analysis work and thus to learn anything from my work – and learning new information is the actually interesting part of doing research.
What I have done is spend more than a hundred hours on basically tedious tasks. This has to be done before I can move on to the interesting work, but as a result, not only has my analysis been delayed, but my writing has slowed to a halt, and I am not doing nearly as much reading as I would like.
There is not much I can do except keep chipping away at it, but in an effort to get my writing restarted, I thought I might summarize the problem on which I am working in a short article.
Two Plots: Live Births versus Induced Abortions
The National Survey of Family Growth (NSFG) is a survey administered by the National Center for Health Statistics (NCHS) and is the United States’ premier fertility survey, providing insight into pregnancy intent, frequency, and outcome, among other topics. The National Vital Statistics System (NVSS) is also administered by the NCHS and collects birth and death certificate data from the several states of the United States.
The NSFG is a survey given to a randomized sample of between 8,000 and 23,000 respondents, depending on the particular iteration of the survey. The NVSS is effectively a census of all live births in the United States.
If the NSFG is accurate, estimates from the NSFG should match the census counts from the NVSS. In Figure 1, we can see that this is the case.
The X’s in Figure 1 represent the number of live births in the United States in a given year taken from the NVSS, and the brackets with a circle in the middle represent 95% confidence intervals for estimates of live births in a given year calculated from the NSFG. As we can see, most of the 95% confidence intervals from the NSFG cover the corresponding year’s NVSS count of live births.
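The comparison in Figure 1 amounts to asking, for each year, whether the survey's confidence interval contains the census count. A minimal sketch of that check follows. It uses a simple normal-approximation interval, whereas real NSFG intervals require design-based variance estimation for the complex sample. The year labels, estimates, standard errors, and benchmark counts are all made up for illustration and are not actual NSFG or NVSS figures.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Normal-approximation 95% confidence interval for an estimate."""
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


def covers(estimate, standard_error, benchmark):
    """True if the 95% interval around the estimate contains the benchmark."""
    low, high = confidence_interval(estimate, standard_error)
    return low <= benchmark <= high


# Hypothetical rows: (label, survey estimate, standard error, external count).
rows = [
    ("year A", 4_050_000, 120_000, 4_000_000),
    ("year B", 3_600_000, 100_000, 3_950_000),
]

coverage = {label: covers(est, se, bench) for label, est, se, bench in rows}
```

In this toy example the interval for "year A" contains its benchmark and the interval for "year B" does not, which is the kind of aberration visible for a few years in Figure 1.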
The various colors of the brackets in Figure 1 represent which iteration of the NSFG the estimates are taken from. Based on my review, the NSFG produces accurate estimates of live births for the five to ten years before each survey iteration. Birth estimates become less reliable for years farther back before the survey because, the longer the time period between birth and survey, the more birth mothers have either died or aged out of the target population of the survey.
Cycles 1 and 2 of the NSFG are not included in Figure 1 because they only surveyed women who had ever been married. These iterations missed all of the unintended pregnancies that occur among never-married women and so are not suitable for the study of reproductive responsibility.
With a few aberrations, we can see that the NSFG does a decent job with live birth estimation. The circles in Figure 1 tend to follow the X’s, and the brackets usually contain the X’s.
On the other hand, the NSFG does not do a good job of estimating the number of induced abortions per year in the United States. While data on abortions in the United States are quite poor generally, the X’s in Figure 2 are taken from the Guttmacher Institute’s Abortion Provider Census (APC), which typically reports the greatest number of abortions among abortion surveys and probably has the closest estimates to the true number.
As we can see, the estimates of the number of induced abortions based on the NSFG are routinely less than half the number reported by the APC.
The goal of my weight adjustment project is to make the estimates of induced abortion from the NSFG match external counts from the APC by increasing the weights of those respondents who report one or more induced abortions. This is not, of course, done as a brute force arithmetic adjustment. Care has to be taken so that other estimates, such as the total number of women in the United States and the total numbers of births by various demographic categories, remain accurate.
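As a rough illustration of how one total can be raised while others are held in place, here is a toy sketch of raking (iterative proportional fitting) over two margins. This is a textbook technique shown under made-up assumptions, not my actual procedure. The function name, the two yes/no variables, the six respondents, and all target totals are hypothetical.

```python
def rake(weights, margins, targets, iterations=50):
    """Iterative proportional fitting: cycle through the margins, each time
    scaling weights so every category's weighted total matches its target."""
    w = list(weights)
    for _ in range(iterations):
        for name, categories in margins.items():
            totals = {}
            for wi, c in zip(w, categories):
                totals[c] = totals.get(c, 0.0) + wi
            w = [wi * targets[name][c] / totals[c] for wi, c in zip(w, categories)]
    return w


# Toy data (hypothetical): six respondents with equal starting weights.
weights = [100.0] * 6
margins = {
    "reported_abortion": [True, False, False, True, False, False],
    "had_live_birth": [True, True, False, False, True, False],
}

# Hypothetical external totals. The abortion margin is raised from a
# weighted 200 to 300, while the birth margin keeps its original 300/300.
targets = {
    "reported_abortion": {True: 300.0, False: 300.0},
    "had_live_birth": {True: 300.0, False: 300.0},
}

raked = rake(weights, margins, targets)
```

In this toy case the respondents who report an abortion end up with weights of 150 and the others with weights of 75, so the abortion total rises while the birth totals and the overall total of 600 stay where they were. Real data are far less tidy, which is where the subtlety comes in.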
There is more subtlety to this procedure than I originally anticipated: it is not just a matter of plugging the NSFG data into preexisting software and running standard post-stratification and raking functions. I am therefore writing some custom programs. Other tasks – such as going through the U.S. Census Bureau website to find the appropriate counts – turned out to be more arduous than I expected. Ultimately this process involves a lot of trial and error – trying something, seeing what works and what does not, making an adjustment, and trying again.
Once the procedure is completed, there is still a risk of bias in the adjusted estimates if those respondents who report their abortions differ substantially in properties of interest from those respondents who do not report their abortions. This is an inevitable consequence of non-reporting and cannot be helped. After weight adjustment, at least the estimates of the total number of induced abortions and of live births match the values from external sources.
I will not go too far here in my blog into the actual approach I am using for weight adjustment, because I want to leave open the possibility of trying to get an academic paper out of this work and because such discussion would probably be too technical to be of interest to a general audience anyway.
Once I am finished, however, I will at least make the weights themselves available here via my blog on the small chance that someone out there will find them helpful for their own work.
Citations
Footnote
[1] This latter term “raking” actually refers to a specific technique for handling multiple dimensions in this process.