Volume 6, Issue 10 e2022GH000696
Research Article
Open Access

Associations Between Surface Mining Airsheds and Birth Outcomes in Central Appalachia at Multiple Spatial Scales

Molly X. McKnight

Department of Geography, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Contribution: Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Writing - original draft, Writing - review & editing, Visualization

Korine N. Kolivras

Corresponding Author

Department of Geography, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Correspondence to:

K. N. Kolivras,

[email protected]

Contribution: Conceptualization, Writing - original draft, Writing - review & editing, Supervision, Funding acquisition

Lauren G. Buttling

Department of Population Health Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Contribution: Methodology, Formal analysis, Data curation, Writing - review & editing

Julia M. Gohlke

Department of Population Health Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Contribution: Conceptualization, Methodology, Formal analysis, Writing - review & editing, Supervision, Project administration, Funding acquisition

Linsey C. Marr

Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Contribution: Conceptualization, Methodology, Validation, Formal analysis, Writing - review & editing, Supervision, Funding acquisition

Thomas J. Pingel

Department of Geography, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Contribution: Methodology, Software, Formal analysis, Writing - review & editing

Shyam Ranganathan

Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

Contribution: Methodology, Validation, Formal analysis, Writing - review & editing, Funding acquisition

First published: 08 October 2022


A considerable body of research exists outlining ecological impacts of surface coal mining, but less work has explicitly focused on human health, and few studies have examined potential links between health and surface coal mining at fine spatial scales. In particular, relationships between individual birth outcomes and exposure to air contaminants from coal mining activities have received little attention. Central Appalachia (portions of Virginia, West Virginia, Kentucky, and Tennessee, USA), our study area, has a history of resource extraction, and epidemiologic research notes that the region experiences a greater level of adverse health outcomes compared to the rest of the country that are not fully explained by socioeconomic and behavioral factors. The purpose of this study is to examine associations between surface mining and birth outcomes at four spatial scales: individual, Census tract, county, and across county-sized grid cells. Notably, this study is among the first to examine these associations at the individual scale, providing a more direct measure of exposure and outcome. Airsheds were constructed for surface mines using an atmospheric trajectory model. We then implemented linear (birthweight) and logistic (preterm birth [PTB]) regression models to examine associations between airsheds and birth outcomes, which were geocoded to home address for individual analyses and then aggregated for areal unit analyses, while controlling for a number of demographic variables. This study found that surface mining airsheds are significantly associated with PTB and decreased birthweight at all four spatial scales, suggesting that surface coal mining activities impact birth outcomes via airborne contaminants.

Key Points

  • Surface mining is associated with preterm birth and low birthweight in Central Appalachia at individual and aggregated spatial scales

  • Airsheds associated with mines were created using an atmospheric trajectory model and compared to birth data at the mother's address

  • This study is among the first to identify a relationship between surface mining and birth outcomes at the individual scale

Plain Language Summary

The human health impacts of surface coal mining are not well understood, particularly when considering the potential impacts on birth outcomes at the individual level. Central Appalachia, USA, which includes parts of Kentucky, Tennessee, Virginia, and West Virginia, has unexplained health concerns when compared to the rest of the country, and the region has a history of surface coal mining, making it an ideal study area to research these relationships. This research was conducted at the individual scale, as well as using Census tracts, county boundaries, and a large grid placed over the study area for comparison. Individual birth data including birthweight and length of gestation were received from each state's health department, along with characteristics of the mother including race/ethnicity and smoking status, so we could adjust for those factors. We estimated the movement of air contaminants from active surface mines, and then compared that exposure to birth outcomes using statistics. After adjusting for mother's characteristics, we found that preterm birth and decreased birthweight are associated with air moving from surface mines at all four spatial scales, suggesting that surface coal mining affects birth outcomes. Policy changes may help reduce the human health impacts of mining.

1 Introduction

Activities associated with surface mining of coal have noted ecological and human health impacts (Palmer et al., 2010). In particular, blasting to remove rock, hauling material by trucks, the creation of tailing ponds, and other activities associated with the removal of overburden to access coal seams and then the removal and transport of coal can result in human health risks through contaminated air, water, and soil (Krometis et al., 2017). A considerable body of research exists examining ecological impacts of surface mining (Palmer et al., 2010), but less work has examined how surface mining can potentially affect human health through airborne pathways. Specifically, minimal research has quantified the fine-scale impact of surface mining on birth outcomes, and this study responds to the call identifying the need for individual-scale studies.

The economy of Central Appalachia, USA (Figure 1), composed of 82 counties in Kentucky, Tennessee, Virginia, and West Virginia as defined by the Appalachian Regional Commission (2009), has long been driven by resource extraction, initially of timber with coal and natural gas extraction added later, making the region an ideal location to examine the potential human health impacts of surface mining. When compared to the rest of the United States, the region experiences excess morbidity, mortality, and adverse birth outcomes (e.g., Barnett et al., 2000; Borak et al., 2012; Esch & Hendryx, 2011; Halverson et al., 2002, 2004; Hendryx, 2009; Hendryx & Zullig, 2009; Hendryx et al., 2008; Yao et al., 2012). These disparate health outcomes have been attributed to socioeconomic and behavioral factors, such as limited access to healthcare, poverty, low educational attainment, substance use disorder, and smoking (Behringer & Friedell, 2006; Blackley et al., 2012; Borak et al., 2012; Halverson & Bischak, 2008; Phillippi et al., 2014; Yao et al., 2012), but these measures do not fully account for the region's poor health outcomes, and spatial variations are not fully explained (Borak et al., 2012).

Figure 1. The 82 counties of Central Appalachia as defined by the Appalachian Regional Commission (2009).

Underground mining activities with their related ecological and human health impacts dominated the region previously, but Central Appalachia's transition from underground to surface mining over the past 25 years has resulted in measurable increases in fine particulate matter downwind from surface mining sites (Ferrari et al., 2009; Palmer et al., 2010; Townsend et al., 2009; Zipper et al., 2011). Surface mining includes mountaintop removal, contour, area, highwall, and auger mining (United States Environmental Protection Agency, 2021). In addition to particulate matter, this type of mining involves the potential exposure to gaseous emissions such as nitrogen oxides (NOx) from mining equipment (e.g., diesel-fueled trucks, shovels, or loaders). However, this study focuses on particulate matter because of its high potential impact on human health outcomes (Patra et al., 2016), such as birth outcomes (e.g., Ebisu & Bell, 2012; Stieb et al., 2012), which occur in the region at higher rates than in the rest of the United States (Driscoll & Ely, 2019).

We chose to examine surface mining exposures and birth outcomes because the gestational period is a relatively short exposure time frame that can be clearly defined and matched to land use changes occurring at similar temporal scales. Thus, these variables more directly capture the exposure-outcome relationship than other health outcomes that may have a significant lag between exposure and responses. Adverse birth outcomes include preterm birth (PTB; born <37 weeks of gestation; Centers for Disease Control and Prevention, 2019), low birth weight (LBW; born weighing <2,500 g; Bobak, 2000), and term low birth weight (tLBW; born >37 weeks and <46 weeks of gestation and weighing <2,500 g; American College of Obstetricians and Gynecologists, 2013). PTB and LBW are the leading worldwide causes of mortality for children under the age of 5 years (Liu et al., 2015); these poor birth outcomes also contribute to morbidity later in life (Behrman & Butler, 2007).

Past studies of environmental contaminants' impact on human health in Appalachia have provided initial evidence for the potential association between surface mining exposures and adverse birth outcomes (e.g., Ahern, Mullett, et al., 2011; Ahern, Hendryx, et al., 2011; Hendryx, 2015). Many of these studies analyzed environmental exposures at the county level, which does not provide direct evidence for associations between surface mining and adverse birth outcomes. We aim to follow the work of Ahern, Mullett, et al. (2011) and Ahern, Hendryx, et al. (2011), who incorporated individual-level birth data into their research, by focusing specifically on individual-level analyses in this study. Few studies on this topic have been conducted at the individual level, and additional studies could provide a more precise spatial estimate of exposure for these associations as the data are disaggregated spatially. Fine-scale exposure data may also decrease the risk of exposure misclassification because researchers can distinguish between exposed and unexposed populations more accurately. Additionally, since epidemiological studies increasingly leverage spatially referenced data, “issues associated with selecting the appropriate geographic unit of analysis are… emerging” (Grubesic & Matisziw, 2006, p. 1) as researchers grapple with the modifiable areal unit problem (MAUP) and the uncertain geographic context problem (Kwan, 2012). Both of these problems can hide or even falsely enhance relationships between variables when inappropriate areal units are used for analysis. Even when studies are designed to be sensitive to these issues, researchers conducting analysis of any group trends, including using areal aggregation, must be careful to avoid ecological fallacies. Therefore, in this study, we implemented an individual-scale spatiotemporal analysis of potential associations between surface mining and birth outcomes in addition to conducting analyses at three broader spatial scales for comparison.

This study implements four regressions, each at a different spatial scale, to evaluate potential associations between surface mining airsheds and birth outcomes in Central Appalachia. We assert that births more strongly influenced by airsheds associated with surface mining will have poorer birth outcomes than births less influenced by them, and furthermore that the relationship will be present when conducting the analysis at multiple spatial scales and levels of aggregation, ranging from the individual level to Census tracts, counties, and county-sized grid cells. We specifically address the following research question:

Do associations exist between surface mining airsheds and birth outcomes from 1989 to 2015 in Central Appalachia at the individual, tract, or county levels, and across county-sized grid cells?

This research aims to address gaps in the literature regarding the potential impact of airborne contaminants from surface mining activities on birth outcomes by analyzing associations at the individual scale while providing a comparison at aggregated geographic units, as the associations between surface mining and birth outcomes may differ at various scales. Few studies have examined these relationships using a spatial analysis at the individual scale, and this work contributes to our understanding of potential geographic linkages between birth outcomes and surface mining established by Ahern, Hendryx, et al. (2011). In an applied sense, we aim to provide health and public policy analysts with a more thorough understanding of the associations between surface mining and birth outcomes at multiple spatial scales. Furthermore, acquiring individual birth record data can be expensive and working with it can be time-consuming, and therefore we provide analyses at coarser spatial scales (e.g., county) at which data are more readily available.

2 Data and Methods

2.1 Processing and Geocoding Individual-Level Birth Records

Central Appalachian birth records were acquired from the health departments of Kentucky, Virginia, West Virginia, and Tennessee. Birth records located in Kentucky's Central Appalachian counties were provided by the Kentucky Department for Public Health for 1990–2015. The Virginia Department of Health and the West Virginia Department of Health and Human Resources provided all birth records in their states from 1990 to 2015. The Tennessee Department of Health provided Central Appalachian birth records located in the Appalachian coalfield area within the region for 2002–2013. This study was reviewed and approved by the Virginia Tech Institutional Review Board (#16-898), Virginia Department of Health IRB (#40221), West Virginia Department of Health and Human Resources, Kentucky Cabinet for Health and Family Services IRB (#FY17-23), and Tennessee Department of Health IRB (#972154).

From the birth records, we derived two individual-level outcome variables: PTB as a dichotomous variable and birthweight in grams as a continuous variable. Since other adverse birth outcomes, such as LBW and tLBW, are determined based on the child's weight at birth, we modeled birthweight in grams to examine potential associations with surface mining airsheds. While analyzing low birthweight as a dichotomous variable is also a valid choice, we chose to treat birthweight as a continuous variable because doing so results in less information loss than dichotomizing it. Last, we only included singleton births in our analyses because plural births have higher risks of perinatal mortality and morbidity (Warner et al., 2000), and births with a birthweight below 200 g were excluded from analysis as these are considered to be nonviable births (Williams & Magsumbol, 2010). Gestation lengths in our analysis ranged from 21 to 45 weeks.

To account for covariates associated with birth outcomes and control for their effects, we derived several individual-level measures from the birth records (Table 1). Based on previous findings, we included the season in which the child was born (Strand et al., 2011), mother's age (Reichman & Pagnini, 1997), parity (Shah, 2010), mother's educational attainment (Luo et al., 2006), mother's race (Dominguez, 2008), child's sex assigned at birth (Mikulandra et al., 2001), and whether the mother used tobacco during pregnancy (Pollack et al., 2000) as covariates. Birth season, derived from the birth date, was split into winter (December, January, and February); spring (March, April, and May); summer (June, July, and August); and fall (September, October, and November). We categorized parity, which refers to the number of previous births to the mother, into four levels (parity 1, parity 2, parity 3, and parity ≥4) where parity 1 captured births to first-time mothers and parity ≥4 captured births to mothers with three or more prior births whose present birth represented their fourth or greater birth. We classified mother's educational attainment into 8th grade or less, 9th–12th grade, and any education beyond high school. We categorized mother's race as either White, Black or African American, or other (American Indian or Alaskan, Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, Other Asian, Native Hawaiian, Guamanian or Chamorro, Samoan, Other Pacific Islander, or Other). Children were either assigned male or female at birth. We adjusted for yearly time trends using a spline (4 degrees of freedom) because the mean gestational birthweight varies over the study time period owing to changes in obstetric practices, such as induced labor and Cesarean delivery (Tilstra & Masters, 2020). The gestational majority year, derived from gestational length and birth date, was calculated as the year in which the majority of gestation occurred. 
For example, a baby born in February would be assigned to the previous year because the majority of gestation occurred during the previous year.
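The birth season and gestational majority year derivations described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, and we assume that the gestational interval runs from the birth date minus the recorded gestation length (in weeks) to the birth date. Because gestation is shorter than one year, the interval spans at most two calendar years, so the year containing its midpoint is the year holding the majority of gestation.

```python
from datetime import date, timedelta

# Month-to-season mapping as stated in the text.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def birth_season(birth_date):
    """Return the season of birth from the birth date."""
    return SEASON_BY_MONTH[birth_date.month]


def gestational_majority_year(birth_date, gestation_weeks):
    """Return the calendar year containing the midpoint of gestation.

    Assumption: gestation starts `gestation_weeks` before the birth date.
    An exact tie between two years is resolved by the midpoint date and
    is not addressed in the source.
    """
    start = birth_date - timedelta(weeks=gestation_weeks)
    midpoint = start + (birth_date - start) / 2
    return midpoint.year
```

For instance, a birth on 10 February 2005 after 39 weeks of gestation is assigned to 2004, matching the February example in the text.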

Table 1. Characteristics of Singleton Births (n = 193,363)
Mother's state of residence
Kentucky 168,687 (87.2%)
Tennessee 11,483 (5.9%)
Virginia 4,720 (2.4%)
West Virginia 8,473 (4.4%)
Child's sex
Male 99,616 (51.5%)
Female 93,747 (48.5%)
Mother's race
White 187,708 (97.0%)
Black 3,017 (1.6%)
Other 2,638 (1.4%)
Mother's age (years)
18–35 174,701 (90.3%)
<18 8,804 (4.6%)
>35 9,858 (5.1%)
Previous births
0 84,445 (43.7%)
1 65,679 (34.0%)
2 28,999 (15.0%)
3 or more 14,240 (7.4%)
Mother's education (years)
<9 7,037 (3.6%)
9–12 108,992 (56.4%)
>12 77,334 (40.0%)
Reported tobacco use during pregnancy
No 135,239 (70.0%)
Yes 58,124 (30.1%)

Birth records from Kentucky, Virginia, and West Virginia included the mother's residential address or mailing address when residential addresses were not available, postal code, city, state, and county. These birth records were then geocoded using the available geographical information with a street-level address locator derived from ESRI's (2013) StreetMap data set. Birth records from the Tennessee Department of Health were received in a geocoded format; officials at the state agency geocoded birth records using the mother's residential address and a combination of address locators supplied by Tele Atlas, Esri, and their Tennessee Strategic Technology Service GIS Service. Birth records outside of Central Appalachia's county boundaries (Figure 1) were removed.

We removed birth records with any missing information after visually verifying that doing so preserved the representativeness of the data by plotting and comparing the frequencies of each covariate's distribution and levels before and after the removal of records with any missing values. Thus, the records used in subsequent analyses were all singleton births at the individual-level, geocoded, located in Central Appalachia, and contained no missing data. Figure 2 outlines the individual-level birth record processing and highlights the number of individual-level birth records left after each data processing step along with the final sample size for each state. We present the PTB rate as the percentage of PTBs over the total number of individual-level records.

Figure 2. Data processing for individual-level birth records within Central Appalachia and Central Appalachian coalfields. The final sample size includes records with addresses that can be geocoded and have complete covariate data.

2.2 Surface Mine Delineation

The surface mining data sets for Central Appalachia used in this study were developed and validated using Landsat imagery, the boundaries of Appalachian coalfields as defined by the United States Geological Survey (2020), mine permit boundaries from each relevant state government, the National Land Cover Database (NLCD), and high-resolution aerial imagery (Kolivras et al., 2022). Even though mine permit boundaries highlight where mining is permitted, they do not necessarily show where active mining occurs, which we attempted to delineate in this study. Active mining typically occurs in smaller areas of permitted mining zones; once mining companies extract the coal or other resources in a given area, they revegetate previously mined areas and move to another part of the permitted zone. Since we assume most air pollutants from surface mining are released during the active mining phase, delineating active surface mine sites is necessary for quantifying exposure.

Marston and Kolivras (2021) implemented a three-step approach, with methods based on Li et al. (2015). The first step involved selecting a vegetation index and mosaicking the Landsat scenes that covered the study area together. Marston and Kolivras (2021) selected the normalized difference vegetation index (NDVI) as the vegetation index for this study because past research has shown NDVI can be effectively used to delineate mined pixels from nonmined pixels (Pericak et al., 2018; Townsend et al., 2009). The study area included seven Landsat scenes (30-m resolution), and an annual NDVI composite image was generated from mosaicking the best available leaf-on image for each scene and each year from 1984 to 2015. The second step involved applying a classification and regression tree (CART) regression on training points (bare ground or vegetated land cover). Since the landcover of Central Appalachia is primarily comprised of vegetation, barren areas cleared for mining are quite apparent on images. Training pixels representing nonmining pixels were identified as “barren,” such as agricultural fields or roads, and were excluded from the mining classification category using mine permit boundaries, NLCD data, and aerial imagery. The resulting reclassified images of the CART regression were summed and used to compute a bare ground threshold for each year; pixels that remained vegetated through the study area were classified as vegetated, and pixels that changed classifications were classified as disturbed. Marston and Kolivras (2021) then delineated Central Appalachian surface mines in the third step by implementing a time series analysis that separated mined land from other disturbances, such as industrial development or clear cutting. After validation with high-resolution (1 m) aerial photography, the final classification resulted in an overall accuracy of 88%. 
The end result of this portion of the study was a single raster layer for each year, indicating mined areas within the study area. For a more detailed explanation of this portion of the project, please reference Marston and Kolivras (2021).
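As a brief illustration of the vegetation index underlying the first step, NDVI is the normalized difference between near-infrared and red reflectance, NDVI = (NIR − Red) / (NIR + Red). The sketch below computes it per pixel and flags low-NDVI pixels as bare ground. The fixed threshold of 0.3 is purely hypothetical; the study derived a bare ground threshold for each year from the CART results, which this sketch does not reproduce.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for no-data pixels
    return (nir - red) / total


def classify_bare(nir_band, red_band, threshold=0.3):
    """Flag pixels whose NDVI falls below a (hypothetical) bare ground threshold.

    Bands are nested lists of reflectance values with identical shapes.
    """
    return [
        [ndvi(n, r) < threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```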

2.3 Modeling Cumulative Frequency Airshed Values

We further processed the surface mine data set by converting delineated mines from raster to vector data format and selecting individual surface mine polygons for each year between 1989 and 2015. Since the United States Office of Surface Mining Reclamation and Enforcement (1999) reported that private surface mines ≥40 acres in size “represent an approximate minimum economically viable size for the respective types of mining operations,” we only selected individual mines with areas ≥40 acres. We then calculated the centroid of each mine using ArcGIS Pro's Feature to Point geoprocessing tool (ESRI, 2020). Figure 3 provides an example of the surface mine filtering process.
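The size filter and centroid step can be sketched without GIS software as shown below. This is an illustrative stand-in for the ArcGIS Pro workflow, not a reproduction of it: we assume each mine is a simple polygon whose vertices are given in a projected coordinate system in meters, and the function names are hypothetical.

```python
SQ_M_PER_ACRE = 4046.8564224  # one international acre in square meters


def area_and_centroid(vertices):
    """Return (area in m^2, centroid) of a simple polygon via the shoelace formula."""
    signed_twice_area = 0.0
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        signed_twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if signed_twice_area == 0:
        raise ValueError("degenerate polygon")
    # Centroid = sum / (6 * signed area) = sum / (3 * signed twice-area).
    centroid = (cx / (3.0 * signed_twice_area), cy / (3.0 * signed_twice_area))
    return abs(signed_twice_area) / 2.0, centroid


def select_mine_centroids(mines, min_acres=40.0):
    """Keep mines of at least `min_acres` and return their centroids by mine id."""
    selected = {}
    for mine_id, vertices in mines.items():
        area_m2, centroid = area_and_centroid(vertices)
        if area_m2 / SQ_M_PER_ACRE >= min_acres:
            selected[mine_id] = centroid
    return selected
```

A 500 m by 500 m polygon (about 61.8 acres) passes the filter, whereas a 100 m by 100 m polygon (about 2.5 acres) does not.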

Figure 3. Surface mine filtering process: (a) imported surface mine data, (b) selected surface mines ≥40 acres in area, and (c) calculated the centroid of the selected surface mines.

To quantify potential exposure to air pollutants generated by mining activities, including fine particulate matter, we created a script to calculate individual airsheds for each surface mine (Kolivras et al., 2022). These individual airsheds were defined by the frequency of air parcels passing through surrounding areas after crossing over their respective surface mines. We downloaded publicly available 12-km resolution North American Mesoscale Forecast System (NAM) data sets for 2010 from the National Oceanic and Atmospheric Administration (NOAA). Among the years for which such data sets are available, 2010 fell nearest the climatological median of the past 124 years and was the most representative of long-term conditions in terms of temperature and precipitation. These data sets consist of regional weather forecast models and are generated by the National Centers for Environmental Prediction (United States General Services Administration, 2019).

We used HYSPLIT4 to model particles dispersed from individual surface mines (National Oceanic and Atmospheric Administration, 2020). HYSPLIT4 is software maintained by NOAA's Air Resources Laboratory and is used to generate atmospheric transport and dispersion models (Stein et al., 2015). We calculated individual airsheds by generating 48-hr forward trajectories from each surface mine's mean center. We represented individual airsheds as a grid of coordinates at the 0.1° resolution containing their respective frequency values. These grids were interpolated to a 12-km resolution in ArcGIS Pro using the Natural Neighbor tool to create a continuous surface of frequency values for each airshed (ESRI, 2020). Once the interpolated frequency values were built for each surface mine for every year in the study time period, individual airsheds were matched to each birth record's gestational majority year and the maternal residence geocoded location, allowing us to extract the frequency values of each airshed for every birth record.
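To make the notion of a frequency grid concrete, the sketch below bins trajectory points into 0.1° cells and reports, for each cell, the fraction of trajectories that pass through it. This is only an assumed, simplified reading of how frequency values relate to trajectories; it does not call HYSPLIT4, and the function names and input layout (each trajectory as a list of longitude and latitude pairs) are hypothetical.

```python
import math
from collections import Counter


def cell_index(lon, lat, cell_deg=0.1):
    """Integer (column, row) index of the grid cell containing a point."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def trajectory_frequency(trajectories, cell_deg=0.1):
    """Fraction of trajectories passing through each grid cell.

    A trajectory counts at most once per cell, however many of its
    points fall inside that cell.
    """
    counts = Counter()
    for trajectory in trajectories:
        visited = {cell_index(lon, lat, cell_deg) for lon, lat in trajectory}
        counts.update(visited)
    total = len(trajectories)
    if total == 0:
        return {}
    return {cell: n / total for cell, n in counts.items()}
```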

Finally, we calculated the cumulative frequency airshed value for each birth record and gestational majority year by summing the frequency values of all airsheds extracted at each birth record's location. The cumulative frequency airshed values for each birth record were used to represent the potential for exposure to airborne contaminants emitted from multiple surface mines. Figure 4 illustrates an example of the cumulative frequency value calculation for three fictional birth records. These three records are fictional to protect individual privacy, and we present only two airsheds in this figure. The actual cumulative frequency airshed values we used include a value for every surface mine that affected each birth record. Additionally, Figure 4 shows that summing frequency values captures surface mining pollutant exposure as a distance decay function. For example, Birth Record 3 is more influenced by Airshed 1 for a mine in the northern part of the study area than the other birth records and is only somewhat influenced by Airshed 2 for a mine in the southwestern part of the study area due to distance from those mines. However, owing to the summation of the frequency values of Airshed 1 and Airshed 2, Birth Record 3 ends up having the highest cumulative frequency airshed value, and thus, has the highest potential exposure to surface mining air pollution out of the three fictional records.
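The summation itself is simple and can be sketched as below. As a simplifying assumption, each airshed is represented here as a dictionary from grid cell index to frequency value, and a birth record takes the value of the cell containing its location; the study instead extracted values from interpolated raster surfaces. All names and numbers are illustrative.

```python
import math


def cumulative_airshed_value(lon, lat, airsheds, cell_deg=0.1):
    """Sum the frequency values of all airsheds at one location.

    `airsheds` is a list of dicts mapping (column, row) cell indices to
    frequency values; cells absent from an airshed contribute zero.
    """
    cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
    return sum(airshed.get(cell, 0.0) for airshed in airsheds)
```

With two hypothetical airsheds, a location that is moderately exposed to both can end up with a higher cumulative value than a location strongly exposed to only one, which is the behavior the fictional records illustrate.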

Figure 4. Cumulative frequency airshed value calculation for three fictional birth records: the highest frequency values of airsheds are white and the lowest are green. (a) Interpolated frequency values for Airshed 1 indicate greater exposure for fictional Birth Record 3, (b) interpolated frequency values for Airshed 2 indicate greater exposure for fictional Birth Record 1, and (c) airshed raster values were extracted at the locations of the three fictional birth records and summed to generate cumulative frequency airshed values for each fictional birth record.

2.4 Statistical Analyses

We analyzed individual-level birth outcomes and aggregated birth outcome data at three spatial scales. Individual-level analyses were conducted based on the mother's home address. For aggregation levels, we selected tracts and counties from the 2010 Census Cartographic Boundary Files because of their geographically nested structure; we also chose an apolitical unit by generating county-sized rectangular grid cells, which were 953.9 km² in size, the average area of counties included in our study (Figure 5). For our aggregated analyses, we only selected birth records from areal units that contained 100 or more records per year to avoid unstable rates. The final sample size for the individual level included 193,363 birth records; the tract level included 346 tracts; the county level included 67 counties; and the county-sized grid included 52 units.

Figure 5. Spatial scales for aggregated analyses: (a) census tracts; (b) counties; and (c) county-sized grid (areal units symbolized as crosshatches contained fewer than 100 birth records per year and were removed from the aggregated-level regressions).

For aggregated analyses, we grouped each covariate and outcome variable by gestational majority year and areal unit. The levels of each categorical covariate, such as mother's education, mother's race, child's sex assigned at birth, mother's tobacco use, birth season, and parity, were calculated as percentages. Continuous variables, such as mother's age and the cumulative frequency airshed values, were standardized as Z-scores and then aggregated by their mean values. We calculated the aggregated PTB outcome as a rate of PTBs per unit, with the denominator being the total number of birth records per unit.
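The aggregation can be sketched as follows for one continuous variable and the PTB outcome. This is a minimal illustration under stated assumptions: records are dictionaries with hypothetical keys (`unit`, `year`, `preterm`, `airshed`), Z-scores use the mean and population standard deviation of all records, and unit-years with fewer than `min_records` births are dropped, mirroring the 100-record rule described earlier.

```python
from collections import defaultdict
from statistics import mean, pstdev


def aggregate(records, min_records=100):
    """Aggregate individual records to (areal unit, gestational majority year)."""
    values = [r["airshed"] for r in records]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("airshed values have zero variance")

    groups = defaultdict(list)
    for r in records:
        groups[(r["unit"], r["year"])].append(r)

    summary = {}
    for key, rows in groups.items():
        if len(rows) < min_records:
            continue  # too few births for a stable rate
        summary[key] = {
            "n": len(rows),
            # PTB rate: preterm births over all births in the unit-year.
            "ptb_rate": sum(1 for r in rows if r["preterm"]) / len(rows),
            # Mean of individually standardized airshed values.
            "mean_airshed_z": mean((r["airshed"] - mu) / sigma for r in rows),
        }
    return summary
```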

We then implemented regression models per birth outcome: PTB and birthweight in grams (Kolivras et al., 2022). Since birthweight changed throughout the time period due to changes in obstetrics practices, we included a yearly spline variable in our regressions as a nonparametric method to capture the relationship between time and birthweight. For the individual-level analyses, we ran a logistic regression for PTB and a linear regression for birthweight in grams. For the aggregated analyses, we ran Poisson regressions for PTB and linear regressions for birthweight in grams, at each aggregated scale.

The regression equations for the generalized linear model (GLM), which encompasses the linear, binomial, and Poisson regressions, are shown in Equations 2–4. Note that for the linear regression model, the link function denoted by Equation 1 is the identity function. For the binomial regression, we use the logit link function given by g(·), and for the Poisson regression, we use the standard natural logarithm link function (Agresti, 2015). Our final analysis involved performing stepwise regression by running the step() function on each regression in R to remove statistically insignificant variables (R Core Team, 2020). We retained only the significant variables identified by the step() function in the final models.
In these equations, $y_i$ represents the outcome variable for subject i, $\mu_i$ represents the expected value for the outcome variable, g(·) is the link function for GLM regression, $x_{ik}$ is the kth independent variable measured for subject i, and $\beta_k$ represents the corresponding regression coefficient.
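For reference, a standard textbook form of a GLM consistent with the definitions above is sketched here. The notation is an assumption: $y_i$ is the outcome for subject i, $\mu_i$ its expected value, $x_{ik}$ the kth explanatory variable, $\beta_k$ its coefficient, and $\beta_0$ an intercept. The block is left unnumbered because it may not match the numbering or exact form of the article's equations.

```latex
\begin{align*}
  \mu_i &= \mathrm{E}[y_i], \qquad
  g(\mu_i) = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik} \\
  \text{Linear (identity link):}\quad g(\mu_i) &= \mu_i \\
  \text{Binomial (logit link):}\quad g(\mu_i) &= \log\!\left(\frac{\mu_i}{1 - \mu_i}\right) \\
  \text{Poisson (log link):}\quad g(\mu_i) &= \log(\mu_i)
\end{align*}
```

Under the logit and log links, $\exp(\beta_k)$ is the multiplicative change in the odds or the rate, respectively, for a one-unit increase in $x_{ik}$.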

3 Results

Our results indicate that surface mining activities, as represented by the standardized cumulative frequency airshed value, have a statistically significant relationship with both birth outcomes at all four spatial scales examined in this study. The effect estimates of the standardized cumulative frequency airshed values have similar magnitudes, the same signs, and the same significance levels at every spatial scale. In contrast, the significance of the demographic covariates varies across spatial scales.

3.1 PTB and Surface Mining Airsheds

Table 2 presents the results of the regression analyses run for PTB at the four spatial scales. Although covariate significance varies across scales, the exposure metric, the standardized cumulative frequency airshed value, was significantly (p < 0.001) positively associated with PTB at all spatial scales. The effect estimates range from 1.05 to 1.09, corresponding to a 5–9% increase in the odds of PTB (individual level) or in the PTB rate (aggregated Poisson models) for every one-standard-deviation increase in the cumulative frequency airshed value. Table 2 also shows the significance levels of the other variables, including demographic variables, in the PTB models at the four spatial scales.
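The step from a ratio estimate to a percent change can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the 10% baseline PTB probability is an assumed value chosen only to show the arithmetic.

```python
def percent_change(ratio):
    """Percent change implied by a ratio estimate (odds ratio or rate ratio)."""
    return (ratio - 1.0) * 100.0


def shifted_probability(baseline_prob, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Airshed effect estimates from Table 2 (per one-SD increase in exposure).
estimates = {"individual": 1.07, "tract": 1.09, "county": 1.05, "grid": 1.05}
changes = {scale: round(percent_change(r), 1) for scale, r in estimates.items()}

# Hypothetical 10% baseline PTB probability, used for illustration only.
p_new = shifted_probability(0.10, estimates["individual"])
```

Under these assumptions, an odds ratio of 1.07 moves a 10% baseline probability to roughly 10.6%, which shows why a 7% increase in odds is not the same as a 7-percentage-point increase in PTBs.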

Table 2. Preterm Birth: Results of Individual-Level Generalized Linear Regression and Aggregated-Level Poisson Regressions
| Variable | Individual | Tract | County | Grid |
| --- | --- | --- | --- | --- |
| Intercept | 0.07 (0.06, 0.08)*** | 6.50E−4 (4.58E−4, 9.24E−4)*** | 3.73E−4 (2.08E−4, 6.68E−4)*** | 1.23E−3 (5.43E−4, 2.78E−3)*** |
| Cum. freq. airshed value: Z-score | 1.07 (1.05, 1.08)*** | 1.09 (1.08, 1.11)*** | 1.05 (1.03, 1.06)*** | 1.05 (1.03, 1.07)*** |
| Birth season: summer | 1.03 (0.98, 1.07) | 1.06 (0.88, 1.28) | 1.35 (0.98, 1.86) | 1.49 (1.00, 2.23) |
| Birth season: fall | 0.95 (0.90, 0.99)* | 0.90 (0.74, 1.09) | 0.63 (0.45, 0.88)** | 0.83 (0.56, 1.23) |
| Birth season: winter | 1.00 (0.95, 1.04) | 0.97 (0.79, 1.21) | 0.58 (0.37, 0.89)* | 0.75 (0.44, 1.27) |
| Child sex assignment: female | 0.91 (0.88, 0.94)*** | 0.95 (0.81, 1.10) | 0.78 (0.56, 1.09) | 0.86 (0.59, 1.25) |
| Mother age: Z-score | 1.07 (1.05, 1.09)*** | 1.10 (1.01, 1.19)* | 1.05 (0.89, 1.23) | 1.14 (0.94, 1.38) |
| Mother edu: 9th–12th grade | 1.06 (0.97, 1.15) | 1.66 (1.23, 2.24)*** | 2.39 (1.46, 3.92)*** | 0.65 (0.32, 1.34) |
| Mother edu: >12th grade | 0.93 (0.85, 1.02) | 1.52 (1.12, 2.05)** | 2.82 (1.75, 4.53)*** | 0.61 (0.29, 1.25) |
| Mother race: Black or Afr. Am. | 1.19 (1.06, 1.35)** | 1.13 (0.75, 1.70) | 1.00 (0.45, 2.22) | 0.48 (0.11, 2.15) |
| Mother race: other | 0.78 (0.68, 0.91)*** | 0.63 (0.36, 1.12) | 0.19 (0.06, 0.56)** | 0.15 (0.04, 0.54)** |
| Parity 2 | 0.82 (0.79, 0.85)*** | 0.77 (0.65, 0.92)** | 0.81 (0.55, 1.18) | 0.98 (0.63, 1.51) |
| Parity 3 | 0.87 (0.83, 0.92)*** | 0.86 (0.68, 1.08) | 0.83 (0.51, 1.35) | 0.73 (0.42, 1.29) |
| Parity 4 | 1.03 (0.97, 1.10) | 0.93 (0.68, 1.27) | 1.75 (0.93, 3.29) | 0.78 (0.36, 1.68) |
| Tobacco: smoked | 1.26 (1.22, 1.30)*** | 1.37 (1.21, 1.55)*** | 1.25 (1.07, 1.46)** | 1.06 (0.87, 1.29) |
| Year spline 1 | 0.98 (0.80, 1.21) | 0.66 (0.54, 0.81)*** | 0.85 (0.69, 1.04) | 0.83 (0.63, 1.11) |
| Year spline 2 | 2.83 (2.49, 3.21)*** | 1.85 (1.63, 2.11)*** | 2.22 (1.94, 2.54)*** | 2.27 (1.89, 2.71)*** |
| Year spline 3 | 1.60 (1.37, 1.86)*** | 0.97 (0.83, 1.14) | 1.37 (1.17, 1.61)*** | 1.49 (1.20, 1.86)*** |
| Year spline 4 | 1.81 (1.61, 2.04)*** | 1.21 (1.07, 1.37)** | 1.40 (1.22, 1.61)*** | 1.49 (1.24, 1.80)*** |
  • Note. Model coefficients are presented as odds ratio (95% confidence interval) significance code.
  • Significance codes:
  • ***p < 0.001. **p < 0.01. *p < 0.05. p < 0.1.

Other variables that were significantly positively associated with PTB at all spatial scales included year spline 2 and year spline 4, suggesting that year is an important predictor of PTB. Child's sex assignment (female), mother's race (Black or African American), and parity 3 were significant only at the individual level, with negative, positive, and negative relationships, respectively; year spline 1 was significant only at the tract level, with a negative association; and birth season (winter) and parity 4 were significant only at the county level, with negative and positive relationships, respectively. Birth season (summer) was positively associated with PTB (p < 0.1) at the county and gridded levels, whereas birth season (fall) was negatively associated at the individual and county levels. Mother's age was significantly positively associated at the individual and tract levels. Mother's education levels of 9th–12th grade or above 12th grade were significantly positively associated at the tract and county levels but not at the individual or gridded levels. Year spline 3 and mother's race (other) were significant, positively and negatively, respectively, at all scales except the tract level. Parity 2 was negatively associated at the individual and tract levels but was not significant at the county and gridded levels. Finally, tobacco (smoked) was positively associated at all levels except the gridded level.

3.2 Birthweight and Surface Mining Airsheds

Table 3 presents the results of the four regression analyses run for birthweight in grams. As with the PTB results, the significance of each covariate varies across spatial scales, but the standardized cumulative frequency airshed value was significantly negatively associated with birthweight (p < 0.001) at all spatial scales. Thus, at all scales, we find a 10.95–13.19-g decrease in birthweight for every one-standard-deviation increase in the cumulative frequency airshed value. Table 3 also shows the significance levels of the other variables, including demographic variables, in the birthweight models at the four spatial scales.
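Because Table 3 reports standard errors rather than confidence intervals, approximate 95% intervals for the airshed coefficient can be derived as sketched below. This is an illustrative calculation only: it assumes a normal (Wald) approximation with a critical value of 1.96, and the function and variable names are hypothetical.

```python
Z_95 = 1.96  # normal-approximation critical value for a 95% interval

# Airshed coefficients from Table 3: (estimate in grams, standard error).
airshed_effects = {
    "individual": (-12.47, 1.28),
    "tract": (-13.19, 1.83),
    "county": (-10.95, 2.90),
    "grid": (-12.48, 3.58),
}


def wald_interval(estimate, se, z=Z_95):
    """Return the (lower, upper) Wald confidence interval for an estimate."""
    return (estimate - z * se, estimate + z * se)


# Rounded intervals per spatial scale, in grams per one-SD increase in exposure.
intervals = {
    scale: tuple(round(bound, 2) for bound in wald_interval(est, se))
    for scale, (est, se) in airshed_effects.items()
}
```

With these inputs, every interval lies below zero and the four intervals overlap, which is consistent with the similar magnitudes reported across scales.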

Table 3. Birthweight (grams): Results of Individual-Level and Aggregated-Level Linear Regressions
| Variable | Individual | Tract | County | Grid |
| --- | --- | --- | --- | --- |
| Intercept | 3440.92 [10.41]*** | 3523.32 [30.27]*** | 3699.89 [66.09]*** | 3648.41 [75.07]*** |
| Cumulative freq. airshed value: Z-score | −12.47 [1.28]*** | −13.19 [1.83]*** | −10.95 [2.90]*** | −12.48 [3.58]*** |
| Birth season: summer | −7.65 [3.56]* | −1.53 [15.99] | −191.15 [31.11]*** | −4.69 [39.30] |
| Birth season: fall | −4.04 [3.59] | 10.06 [15.84] | −56.31 [31.35] | −39.60 [37.91] |
| Birth season: winter | −6.81 [3.62] | −18.33 [16.56] | −89.81 [37.85]* | −2.57 [39.94] |
| Child sex assignment: female | −119.24 [2.52]*** | −170.15 [12.41]*** | −74.00 [27.41]** | −64.60 [32.01]* |
| Mother age: Z-score | 4.36 [1.55]** | −18.24 [6.85]** | 15.63 [15.48] | 20.16 [19.46] |
| Mother edu: 9th–12th | 30.91 [6.91]*** | −31.93 [27.08] | −125.64 [58.99]* | −136.66 [69.48]* |
| Mother edu: >12th | 95.75 [7.16]*** | 76.35 [27.55]** | −72.37 [59.93] | −173.52 [70.77]* |
| Mother race: Black or African American | −203.44 [10.18]*** | −347.81 [47.98]*** | −131.98 [66.95]* | 528.80 [160.30]*** |
| Mother race: other | −70.17 [10.95]*** | −90.40 [55.93] | 11.29 [139.09] | 350.08 [94.07]*** |
| Parity 2 | 84.93 [3.00]*** | 90.76 [14.43]*** | 18.65 [29.92] | −16.37 [37.65] |
| Parity 3 | 82.19 [4.03]*** | 94.19 [19.92]*** | −50.41 [43.87] | −96.28 [48.58]* |
| Parity 4 | 79.40 [5.52]*** | 93.92 [27.64]*** | 121.21 [65.70] | −4.92 [76.78] |
| Tobacco: smoked | −256.99 [2.88]*** | −270.53 [11.10]*** | −262.94 [19.61]*** | −225.92 [24.33]*** |
| Year spline 1 | 27.83 [14.84] | −4.80 [19.76] | 0.94 [28.22] | 57.01 [34.37] |
| Year spline 2 | −145.28 [9.80]*** | −130.71 [15.64]*** | −124.77 [22.87]*** | −132.20 [28.24]*** |
| Year spline 3 | −153.87 [11.32]*** | −181.60 [16.26]*** | −177.13 [23.80]*** | −135.73 [28.73]*** |
| Year spline 4 | −138.78 [8.63]*** | −164.82 [12.93]*** | −145.91 [19.89]*** | −116.55 [24.42]*** |
  • Note. Model estimate notation is presented as “Estimate [SE] significance code”.
  • Significance codes:
  • ***p < 0.001. **p < 0.01. *p < 0.05. p < 0.1.

Other variables that were significant at all spatial scales included mother's race (Black or African American), which was negatively associated at the individual, tract, and county levels but positively associated in the gridded analysis; child's sex assignment (female); tobacco (smoked); and year splines 2, 3, and 4. The latter five variables were negatively associated with birthweight in grams at all scales. Birth season (fall) was significant only at the county level, with a negative association. Birth season (summer) and birth season (winter) were significantly negatively associated with birthweight in grams at the individual and county levels. Mother's age was significantly positively associated at the individual level but negatively associated at the tract level; parity 2 was significant at the individual and tract levels, with a positive association. Mother's education (9th–12th grade) was significantly positively associated at the individual level, but the significant associations at the county and gridded levels were negative; mother's education (greater than 12th grade) was significantly positively associated at the individual and tract levels and negatively associated at the gridded level. Mother's race (other) was negatively associated at the individual level but positively associated at the gridded level, and year spline 1 was significantly positively associated at the individual and gridded levels. Lastly, parity 3 was significantly positively associated at the individual and tract levels but negatively associated at the gridded level (Table 3); parity 4 was significant at the individual, tract, and county levels, with a positive association.

4 Discussion and Conclusions

Our findings suggest that airsheds derived from surface mining from 1989 to 2015 are significantly associated (p < 0.001) with birth outcomes, regardless of the spatial scale at which the relationship is analyzed. This result suggests that mothers who live within surface mining airsheds, even when controlling for the included socioeconomic and demographic variables, have a higher rate of PTB and that their babies are born at lower birthweights. PTB becomes more likely and birthweight in grams decreases as the exposure variable, the cumulative frequency airshed value, increases. For example, assuming linearity, the individual-level odds ratio of 1.07 for the cumulative frequency airshed value indicates that we would expect a 7% increase in the odds of PTB for every one-standard-deviation increase in the cumulative frequency airshed value (p < 0.001). Similarly, the individual-level birthweight coefficient of −12.47 indicates that we would expect birthweight to decrease by 12.47 g for every one-standard-deviation increase in the cumulative frequency airshed value (p < 0.001).

Furthermore, the presence of significant associations between cumulative frequency airshed values and birth outcomes at all four spatial scales (individual, tract, county, and county-sized grid cells) indicates that differing levels of aggregation do not notably change the associations between surface mining airsheds and birth outcomes, suggesting that statistical bias arising from data aggregation may not be an issue in this particular analysis. This consistency across spatial scales also indicates that the association between mining and birth outcomes does not appear to be sensitive to the scale of analysis or to the exclusion of units with fewer than 100 births from the aggregated analyses. The effect estimates for standardized cumulative frequency airshed values vary slightly across spatial scales for both birth outcomes, which we attribute to expected variation in the data. By examining associations between surface mining airsheds and birth outcomes at multiple spatial scales, our research could help justify the view that county-level studies capture results similar to those of individual-level analyses. We must nonetheless remain cautious of the ecological fallacy and avoid applying county-level results to individuals. Our study is consistent with other studies that used county-level exposures (e.g., Ahern, Mullett, et al., 2011; Ahern, Hendryx, et al., 2011; Hendryx, 2015), which may produce results similar to those of individual-level studies within the context of airshed exposure and birth outcomes. Therefore, our results do not support dismissing county-level findings in this specific context because of the coarseness of the unit of analysis, and future studies of relationships between birth outcomes and surface mining can potentially justify the use of cheaper and more readily available coarse-scale data on the basis of these results.

4.1 Limitations and Future Research

This study has limitations, which we attempted to minimize, related to geocoding, unmeasured potential confounders, airshed exposure calculations, and determination of gestational majority year. First, locational error is present in the individual-level geocoded birth records. Because time and resource constraints and the large sample size prevented us from validating the geocoder's results through fieldwork or other techniques, we presumed that the geocoding results adequately represent the locations of the mothers' residences. We also assumed that the residence recorded on the birth certificate was the mother's residence for the entire pregnancy. Geocoding rural addresses tends to produce more error than geocoding addresses in more densely populated suburban and urban areas, given the variable nature of rural roads. We attempted to minimize locational error in the geocoding process as much as possible.

Second, even though this study takes into account the known sociodemographic and behavioral factors available on birth records, we acknowledge that unknown covariates could have led to residual confounding. While our results are statistically significant, our findings do not fully explain the variation in birth outcomes observed in Central Appalachia. Additionally, some findings regarding the relationship between the included covariates and both PTB and birthweight are unexpected and warrant further exploration in a future study. Future research should consider potential additional covariates such as poverty, prenatal care, stress, gestational hypertension, and substance use during pregnancy.

Third, our study defined exposure in terms of cumulative frequency airshed values as modeled in HYSPLIT4 on 12-km grid cells (National Oceanic and Atmospheric Administration, 2020). The air particle trajectories that we modeled could have underestimated or overestimated the influence of surface mines; we did not conduct fieldwork to measure exposure. Trajectory analysis does not account for the size of each mine or the amount of air pollution it might generate. HYSPLIT4's input data are also modeled and, as with all modeled data, contain errors. We minimized potential error by validating that the frequency values of individual airsheds followed a distance decay curve, with higher frequency values closer to the centroid of their respective surface mine. Future work could explore satellite-derived estimates of particulate matter concentrations and analyze birth outcomes and exposures at a fine, localized scale in order to measure potential contaminants more directly than the broad-scale 12-km grid cells used here allow. Furthermore, by using the mother's home address, we assumed that exposure to these contaminants occurred at home, excluding the likely possibility of exposure away from home. At the coarse county and gridded scales, this possibility is of less concern, but as the analysis becomes more spatially precise, at the Census tract and individual scales, the risk of exposure misclassification grows if the home address is classified as low exposure risk but the woman is actually exposed elsewhere, or vice versa. Thus, a trade-off between spatial precision and accuracy of exposure estimation arises as the scale of analysis changes.
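One possible way to check for the distance decay pattern mentioned above is a rank correlation between distance from the mine centroid and airshed frequency, sketched below. The study does not state which validation method was used, so this is only an assumed approach; the distances and frequency values are made up for illustration, and the helper names are hypothetical.

```python
def ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical grid cells for one airshed (illustrative values only).
distance_km = [0, 12, 24, 36, 48, 60]   # distance from the mine centroid
frequency = [100, 62, 35, 20, 9, 4]     # trajectory frequency value per cell

rho = spearman(distance_km, frequency)  # strongly negative under distance decay
```

A rank-based measure is used here because distance decay implies a monotonic decline but not necessarily a linear one.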

Fourth, this study calculated the gestational majority year as the year in which the majority of gestation occurred. We matched cumulative frequency airshed values to birth records based on gestational majority year. While birth outcomes may be influenced by events that occurred during the minority gestational year, exposure is more likely to have occurred during the gestational majority year because it covers a longer portion of the pregnancy. Future research should consider a fine-scale temporal analysis by examining the trimester with the most critical fetal development and exposure.
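The gestational majority year defined above can be computed as sketched below. This is a minimal illustration rather than the authors' implementation: it assumes that gestation is the span of whole weeks ending on the birth date, and the function name, its parameters, and the tie-breaking rule are hypothetical.

```python
from datetime import date, timedelta


def gestational_majority_year(birth_date, gestation_weeks):
    """Return the calendar year containing the most days of gestation.

    Assumes gestation spans the `gestation_weeks` weeks ending on the birth date.
    """
    start = birth_date - timedelta(weeks=gestation_weeks)
    days_per_year = {}
    # Count each day of gestation toward the calendar year it falls in.
    for offset in range((birth_date - start).days):
        year = (start + timedelta(days=offset)).year
        days_per_year[year] = days_per_year.get(year, 0) + 1
    # An exact tie resolves to the earlier year (an arbitrary assumption).
    return max(sorted(days_per_year), key=days_per_year.get)


# A March 2001 birth after 40 weeks spends most of gestation in 2000.
early_spring = gestational_majority_year(date(2001, 3, 15), 40)
# A September 2001 birth after 38 weeks spends most of gestation in 2001.
late_summer = gestational_majority_year(date(2001, 9, 1), 38)
```

The returned year is the one that would be used to look up the annual cumulative frequency airshed value for a birth record.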

4.2 Conclusions

To our knowledge, this is the first study to explicitly conduct a spatial analysis of surface mining and birth outcomes at the individual level and at other spatial scales such as counties. Achieving individual-level granularity over a broad study area presented challenges, but by leveraging recent improvements in geospatial and statistical techniques, we were able to process and analyze over 200,000 birth records and delineate annual surface mines at a 30-m resolution over 31 years in the 43,407 km² Central Appalachian coalfield region. Our results indicate that surface mining, as reflected by airsheds quantifying the spatial influence of air pollutants released by such activity, is significantly associated with PTBs and birthweight in grams in Central Appalachia; conducting analyses at four different spatial scales did not notably change the associations between surface mining and birth outcomes. Our research, along with other studies published on the topic, could support policies designed to minimize the human health impacts of surface mining.


Acknowledgments

We thank Michael L. Marston for delineating surface mines in Central Appalachia. We are also grateful to Rebecca Weir, Carole Appel, Carol Ward, and Erin McKnight for their feedback and edits. This work was supported by the National Institute of Environmental Health Sciences (R21ES028396).

Conflict of Interest

The authors declare no conflicts of interest relevant to this study.

Data Availability Statement

Annual active mine extent data for Central Appalachia, along with scripts for completing analyses, are available at https://doi.org/10.7294/20346312. Individual-level birth outcome data are restricted by each agency's Institutional Review Board (IRB) and cannot be made accessible to the public or research community. Birth records are available from the Kentucky Department for Public Health, the Tennessee Department of Health, the Virginia Department of Health, and the West Virginia Department of Health and Human Resources. Data preparation, analyses, and figure creation were completed using ArcGIS Pro version 2.6 (ESRI, 2020), R version 3.6.3 (R Core Team, 2020), and NOAA's HYSPLIT version 4 (National Oceanic and Atmospheric Administration, 2020).