A Participatory Science Approach to Expanding Instream Infrastructure Inventories

Over the past decade, remote sensing data have improved in resolution and become more widely available, bringing new opportunities for their use in environmental science and conservation. One potential application is to identify and map instream infrastructure across the world, with important implications for fisheries, hydrology, flooding, and more. To date, databases of instream infrastructure focus on larger dams with reservoirs that are comparatively easy to detect with remotely sensed imagery. Despite its impact on freshwater ecosystems, smaller infrastructure is often overlooked. To overcome these challenges, we require more systematic approaches, such as the Global River Obstruction Database (GROD) presented here, to map instream infrastructure. We present a participatory approach to identify, map, and validate infrastructure and provide an initial data set for the contiguous United States (n = 4,197). We highlight the value of participatory methods that include the public and suggest ways they could be fused with machine learning for future applications.


Introduction
Remote sensing of Earth's surface has grown into an advanced field, providing high spatial resolution data across the globe (Campbell & Wynne, 2011). These remotely sensed data are increasingly available to diverse users via internet-based services (e.g., Google Earth), along with publicly available computational power to analyze large data sets (e.g., Google Earth Engine [GEE]; Gorelick et al., 2017). The combination of higher-resolution data and improved data accessibility is, in part, inspiring new approaches to data analysis; such methods are often referred to as either machine learning (ML) or artificial intelligence (AI). For example, ML and AI algorithms use vast stores of data to "learn" how to identify different objects, including human modifications to Earth's surface, such as instream infrastructure like road culverts, bridges, or dams (e.g., Weil, 2018). Instream infrastructure are barriers to water flow that cause disturbance of aquatic ecosystems (Poff et al., 1997), and efforts to further catalog such structures are needed to help inform global river restoration and conservation initiatives (Neeson et al., 2015; Poff & Hart, 2002).
©2020 The Authors. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
ML and AI approaches are most often used to identify relatively uniform features (e.g., Craciun & Zerubia, 2013; Tayara et al., 2017; Weil, 2018), and they require spatially distributed, high-quality training and validation data, often called "labeled" data. These data are rarely available for instream infrastructure, requiring large, unified efforts to build these labeled data sets. Simultaneously, researchers and members of the public work together through participatory (Miller-Rushing et al., 2012), or citizen, science to collect, aggregate, and process Earth observation data (Dickinson et al., 2012; Mulligan et al., 2020). Here, we refer to such collaborations as participatory science, with an emphasis on research or development projects where researchers and members of the public work together in near real time. These participatory projects are particularly useful for cataloging broad-scale inventories of human-built infrastructure, like the structures and roads identified in OpenStreetMap (Goodchild, 2007), and show potential for training ML or AI methods (e.g., George, 2019).
Data generated through participatory science projects can be coupled with ML or AI methods to identify structures over broader areas, more quickly, and with greater reproducibility (e.g., George, 2019). However, despite some robust catalogs of certain types of infrastructure (viz., larger dams and roads), there are relatively few complete inventories of instream infrastructure (Januchowski-Hartley et al., 2013; Poff et al., 1997). Refining how we identify and inventory instream infrastructure is a global conservation priority (Grill et al., 2017). Mapping the spatial locations of different infrastructure types is a primary requirement for spatially explicit river restoration planning and to identify more cost-effective solutions (Neeson et al., 2015) that also return benefits to species of conservation interest (Hermoso et al., 2018; Milt et al., 2018).
The GRanD (Lehner et al., 2011) and AQUASTAT (Food and Agriculture Organization [FAO], 2016) databases offer global coverage, draw on various sources to catalog large (>15 m high) dams, and have varying levels of spatial accuracy. The approaches used to collate and create GRanD and AQUASTAT are challenged by the variety of methods and criteria used to map and classify instream infrastructure across their component regional databases. Merging existing regional databases can cause over- or underrepresentation of infrastructure, with the latter particularly true for areas of the globe where no comprehensive monitoring effort has been undertaken.
Other existing global inventories were developed by bringing together ML and remotely sensed imagery to identify and map dams and associated reservoirs across the globe. Khandelwal et al. (2019) used ML and Landsat imagery to identify reservoirs associated with dams built between 1984 and 2012. Weil (2018) created an ML pipeline during the 2018 GeoForGood User Summit hackathon to map dams using Sentinel-2 10 m imagery. The approaches used by both Khandelwal et al. (2019) and Weil (2018) were limited to large dams and reservoirs and are not suited for locating smaller obstructions because of their dependency on reservoir presence or on training data from existing dam inventories, which do not include smaller structures. Finally, the GOODD database (Mulligan et al., 2020) depended on members of the public and researchers coming together to digitize dams visible on satellite imagery in Google Earth, with a minimum reservoir length of 500 m and dam wall length of 150 m. Although the GOODD method has challenges with regard to the accuracy and consistency of participants mapping infrastructure, and spatial resolution was limited to that available on Google Earth at the time of mapping (2007-2011), we believe there is value in further developing this participatory approach to map smaller infrastructure. A global database of smaller infrastructure could serve as training data for methods like those used by Khandelwal et al. (2019) and Weil (2018), effectively easing previous limitations to cataloging small instream infrastructure via ML.
Here, we present methods, validation, and initial results for the Global River Obstruction Database (GROD), which is being generated through a participatory science project that uses remotely sensed imagery in GEE. Once completed, GROD will be global in scale and include instream infrastructure of different types and sizes that could act as obstructions to flows of water, material, or species, on rivers defined by the Global River Widths from Landsat (GRWL) database (Allen & Pavelsky, 2018). We present an initial validation of instream infrastructure mapped by GROD participants in two regions, one in the United States and one in France. We also present an initial GROD data set for the contiguous United States. Further, we explore the role of the completed data set in addressing global-scale questions about the effects of different infrastructure on hydrology and ecology. Finally, we discuss how the GROD data and approach can be used to inform and validate future ML methods that expand and improve global mapping of instream infrastructure, and how similar approaches combining participatory science and ML could be used more broadly to expand our knowledge of the Earth system.

GROD Approach
Mapping instream infrastructure for the GROD was initially undertaken by 13 people. These initial GROD participants were recruited through personal connections or by websites such as Twitter (via the hashtag #DamOrNot) and SciStarter. All GROD participants accessed GEE (Gorelick et al., 2017) via a web browser to map infrastructure.
At the start of a mapping session, each participant loaded GEE's basemap satellite imagery, the GRWL river data, and any infrastructure already mapped by that user into GEE using bespoke code for the project. GRWL includes mapped centerlines for >2.1 × 10^6 km of rivers wider than 30 m at mean annual discharge (Allen & Pavelsky, 2018). The bespoke code for the GROD project divides the GEE imagery into 1,039 tiles that are 6°E to 6°W and 3°N to 3°S. Each tile intersects with GRWL centerlines so that there are no empty tiles. Most GEE imagery has submeter spatial resolution, which was determined to be adequate for identifying the majority of instream infrastructure on GRWL rivers.
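The tiling idea can be pictured with a short sketch. It assigns centerline vertices to fixed-size latitude/longitude tiles and keeps only the tiles that contain at least one vertex, so that no tile is empty. This is not the project's GEE code: the tile dimensions (6° of longitude by 3° of latitude here), the function names, and the sample coordinates are all assumptions made for illustration.

```python
import math

def tile_index(lon, lat, width_deg=6.0, height_deg=3.0):
    """Return the (column, row) index of the tile containing a point.

    Tile size is a hypothetical parameter, not a value confirmed by the source.
    """
    n_cols = int(360.0 / width_deg)
    n_rows = int(180.0 / height_deg)
    col = math.floor((lon + 180.0) / width_deg)
    row = math.floor((lat + 90.0) / height_deg)
    # Clamp points lying exactly on the eastern or northern edge.
    return min(col, n_cols - 1), min(row, n_rows - 1)

def nonempty_tiles(centerline_points, width_deg=6.0, height_deg=3.0):
    """Return the set of tiles that contain at least one centerline vertex."""
    return {
        tile_index(lon, lat, width_deg, height_deg)
        for lon, lat in centerline_points
    }

# Made-up (longitude, latitude) vertices standing in for river centerlines.
points = [(-90.1, 35.2), (-89.5, 35.9), (2.3, 48.8)]
tiles = nonempty_tiles(points)
```

Working from the vertices outward, rather than from a full global grid inward, means tiles without rivers are never created in the first place.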
Within an individual tile, participants scrolled along GRWL river centerlines and, using the tools within GEE, mapped a point feature on top of any observed infrastructure (Figure S1 in the supporting information) along the river. Participants classified each mapped infrastructure into one of seven predefined classes, grouping by similar flow alterations and visual characteristics (Figure 1). Our goal with these classes was to strike a balance so that participants could distinguish among classes while infrastructure is grouped in a meaningful way based on likely flow alteration. The classes from maximum to minimum flow alteration were dams (infrastructure that covers the entire river channel, allowing no water to pass except through the structure itself), locks (a subset of dams that are distinguished by their lock mechanism that allows barges and other river traffic to move upstream and downstream), channel dams (any obstruction on a multichannel section of river that does not obstruct all channels), partial dams ≥50% (infrastructure that is impermeable and covers more than half, but not all, of the river channel), partial dams ≤50% (infrastructure that is impermeable and covers less than half of the river channel), low permeable dams (infrastructure that has a small height difference upstream and downstream of the structure [i.e., low head dams] or allows water to pass through), and uncertain (the type of infrastructure is not discernible). All classes of instream infrastructure were defined with examples provided in the tutorial document shared with participants before they commenced mapping. To facilitate the mapping and classification of infrastructure, we also asked all participants to review a decision tree (Figure 2) and a tutorial (see https://globalhydrologylab.github.io/GROD/).
At the end of every mapping session, participants exported all instream infrastructure mapped in GEE to a CSV file that included index, class, and latitude and longitude information for all mapped structures. Once a participant had mapped all infrastructure on GRWL river lines in a tile, the tile was flagged as completed in GEE.
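As a minimal sketch of working with such an export, the snippet below parses one session file and tallies structures by class. The column names (`index`, `class`, `latitude`, `longitude`), the `Structure` record, and the sample rows are assumptions for illustration; the actual export layout may differ.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Structure:
    index: int
    cls: str      # one of the seven GROD classes, e.g. "dam" or "lock"
    lat: float
    lon: float

def read_session(text):
    """Parse the CSV text of one mapping session into Structure records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Structure(
            index=int(row["index"]),
            cls=row["class"],
            lat=float(row["latitude"]),
            lon=float(row["longitude"]),
        )
        for row in reader
    ]

# Made-up rows using the assumed column names.
sample = """index,class,latitude,longitude
0,dam,35.10,-90.05
1,lock,35.20,-90.00
2,dam,35.35,-89.90
"""

structures = read_session(sample)
class_counts = Counter(s.cls for s in structures)
```

Reading each session into typed records makes it straightforward to concatenate sessions from several participants before any comparison with other inventories.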

Validation
We validated instream infrastructure mapped by the 13 participants for four river reaches: two in France and two in the United States (Figure S2), drawing on existing regional-scale inventories of infrastructure for the two areas as well as a participant-generated truth data set (PGTD). These four river reaches were selected because (1) an expert user mapped instream infrastructure in the areas, (2) the areas were known to have high infrastructure density and variety, (3) data were available from regional-scale instream infrastructure inventories (detailed below), and (4) completion times were short enough (~1 hr) to allow for repeatability and comparison. For the validation procedure, participants identified and classified infrastructure over the same four river reaches for comparison purposes. All participants mapped instream infrastructure for France, while only nine completed validation for the United States. The two regional inventories included in our validation analyses were the Southeast Aquatic Connectivity Assessment Project (SEACAP; Martin et al., 2014) for the two river reaches in the United States and the National Repository of Obstacles to Flow (ROE) for the two river reaches in France. We drew on a modified version of the publicly available ROE database (see Januchowski-Hartley et al., 2019) that excluded destroyed, planned, under construction, invalid, or duplicate infrastructure. We created the PGTD data set by creating manual buffers around each unique instream infrastructure location and then removing features in the lowest quartile of interparticipant accuracy, which primarily removed infrastructure classified as "uncertain." This approach resulted in 93 PGTD-derived instream infrastructure locations for the validation (down from an original 149 locations).
10.1029/2020EF001558 Earth's Future
Of the original 149 unique instream infrastructure locations mapped by GROD participants on GRWL river reaches in the two areas used for our validation, only 58 overlapped within manual buffers of obstructions recorded in SEACAP (7 obstructions) or ROE (51 obstructions). Of these 58 instream infrastructure, 3 did not overlap with the PGTD (i.e., they were in the lower quartile of the original 149 instream infrastructure found by GROD participants), resulting in 96 distinct instream infrastructure locations used for comparison in our hit rate analysis.
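The overlap test described above can be pictured as a distance check between a mapped point and each reference location. The sketch below is an illustration only: the paper used manually drawn buffers, whereas the fixed 100 m radius, the haversine distance, the function names, and the coordinates here are all assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, adequate for short distances

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_reference(point, reference, radius_m=100.0):
    """Return the index of the closest reference location within radius_m, or None."""
    best_idx, best_dist = None, radius_m
    for idx, (rlat, rlon) in enumerate(reference):
        dist = haversine_m(point[0], point[1], rlat, rlon)
        if dist <= best_dist:
            best_idx, best_dist = idx, dist
    return best_idx

# Made-up (latitude, longitude) pairs: two inventory records, two mapped points.
reference = [(35.1000, -90.0500), (35.3000, -89.9000)]
mapped = [(35.1005, -90.0500), (35.2000, -90.0000)]
matches = [nearest_reference(p, reference) for p in mapped]
```

Here the first mapped point lies roughly 56 m from the first reference record and is matched to it, while the second lies kilometers from both and is left unmatched. In practice the appropriate radius depends on river width and inventory positional accuracy, which is presumably why buffers were drawn by hand.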

We used the two regional inventories and the PGTD to calculate a "hit rate," or the percent of GROD participants who correctly mapped infrastructure at each location that occurred in the validation data sets. Drawing on all 149 instream infrastructure mapped by the 13 GROD users, we used a confusion matrix analysis to determine overall accuracy (for all mapped locations), as well as sensitivity, specificity, and precision for each of the seven infrastructure types (e.g., dam and lock) in the two validation areas (France and the United States). User predictions can take one of four classifications in a confusion matrix output: a true positive (TP), in which a user predicted an infrastructure class at a location that matched the actual class; a true negative (TN), in which a user predicted that an infrastructure class did not exist at a location and the actual class agreed; a false positive (FP), in which a user-predicted class at a location was different from the actual class; and a false negative (FN), in which a user did not predict an infrastructure class at a location where that class was actually present. The four metrics were calculated as follows: overall accuracy = (TP + TN)/(TP + TN + FP + FN); sensitivity = TP/(TP + FN); specificity = TN/(TN + FP); and precision = TP/(TP + FP).
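The hit rate and the four confusion matrix metrics can be sketched in a few lines. The example treats each class one-vs-rest, which is an assumption about how the per-class counts were tallied; the function names and the toy labels are hypothetical and are not taken from the GROD analysis.

```python
def hit_rates(detections):
    """Percent of participants who mapped each validation location.

    detections maps a location id to one boolean per participant.
    """
    return {
        loc: 100.0 * sum(flags) / len(flags)
        for loc, flags in detections.items()
    }

def ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None

def class_metrics(actual, predicted, cls):
    """One-vs-rest confusion matrix metrics for a single infrastructure class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == cls and p == cls for a, p in pairs)
    tn = sum(a != cls and p != cls for a, p in pairs)
    fp = sum(a != cls and p == cls for a, p in pairs)
    fn = sum(a == cls and p != cls for a, p in pairs)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }

# Toy data: four participants at two locations, then five classified structures.
rates = hit_rates({"A": [True, True, False, True], "B": [True, True, True, True]})
actual = ["dam", "dam", "lock", "low permeable dam", "dam"]
predicted = ["dam", "low permeable dam", "lock", "low permeable dam", "dam"]
dam = class_metrics(actual, predicted, "dam")
```

In this toy case one dam is labeled as a low permeable dam, so sensitivity for dams drops to 2/3 while specificity and precision stay at 1.0, which mirrors the kind of dam versus low permeable dam confusion the validation is designed to expose.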

The validation procedure served as an indicator of our method's performance for identifying and classifying infrastructure and helped refine the method for subsequent iterations of mapping.

GROD Data for U.S. Rivers and Comparison With Existing Data Sets
GROD participants identified, mapped, and classified instream infrastructure on GRWL rivers in the contiguous United States. Participants were trained and familiar with the GROD mapping process prior to contributing to contiguous U.S. data collection. Unlike the validation data set described above, there was no duplicate mapping of infrastructure in this process. Instead, each participant mapped obstructions in a tile and, once completed, moved on to another unfinished tile to continue mapping. In addition to the validation procedure explained above, we also compared the instream infrastructure mapped by GROD participants for GRWL rivers in the contiguous United States with those in the GRanD, GOODD, and SEACAP inventories.

GROD Validation
Our validation showed that GROD participants had perfect identification of instream infrastructure (100% hit rate) when compared with those mapped by SEACAP (n = 7) and a 92% (mean) hit rate when compared with ROE (n = 51) (Table 1). We also found that regardless of the measure (mean or median), combined results from the United States and France had >80% hit rates (Table 1). Comparison of data from GROD participants with the PGTD (n = 36, United States; n = 57, France) produced a lower mean (70%) and median (78%) hit rate than with SEACAP and ROE (82% and 92%, respectively; Table 1).
Overall, users agreed on the obstruction class 70% of the time based on a confusion matrix analysis. Looking more specifically at individual infrastructure classes, we found that dams, channel dams, and low permeable dams had lower (<70%) precision. Notably, we found that dams and channel dams had the lowest sensitivity (58% and 39%, respectively; Table 2), and low permeable dams had relatively low specificity (82%; Table 2). We also found that participants frequently classified dams as low permeable dams and vice versa (Table S1), highlighting the difficulty of assessing dam height from 2-D imagery.

Instream Infrastructure in the United States
Five participants mapped 4,197 instream infrastructure features on 108,993 km of GRWL rivers across 67 tiles in the contiguous United States using the GROD approach (Table 3). We found that GROD participants mapped 15 times more instream infrastructure than documented in GRanD, 13 times more than GOODD (Figure 3), and 3 times more than SEACAP (Figure 4). Of the infrastructure mapped by GROD participants, we found that 811 were classified as dams and that these most closely represented those contained in GRanD (Lehner et al., 2011). GRanD contained 206 of these 811 dams, which accounted for 84% of total overlapping infrastructure between the data sets.

Strengths and Limitations of the GROD Approach
Our results suggest that GROD provides more detail than existing databases of instream infrastructure over the contiguous United States, whether they are global databases like GRanD and GOODD or more regional databases like SEACAP. Our findings highlight the need for a more complete global database of instream infrastructure, even in regions with relatively good records such as the contiguous United States. Given the long history of human modification of riverscapes, the presence of unrecorded infrastructure that may no longer be actively managed is perhaps unsurprising. An approach such as that presented here can help to address this deficiency.
Still, there are limitations to the GROD approach. Lower hit rates in the PGTD comparison, and in the United States compared to France, were largely driven by a higher number of less obvious infrastructure classes being identified, such as those classified as "uncertain" or "partial dams" by GROD participants (Table S2). Structures in SEACAP and ROE were easier to identify, as they generally include larger infrastructure such as locks and dams with reservoirs that are easier to visually discern. Findings from the PGTD comparison illustrate the difficulties associated with locating and discerning smaller infrastructure that are often not included in existing data sets.
Channel dams had low sensitivity, which can be partially explained by the difficulty participants had in identifying whether the area in question is one channel of a multichannel river. Additionally, dams are sometimes situated at the start or end of a channel or branch of the mainstem of the river, and this can make it difficult to determine if the structure should be classified as a "channel dam" or another class. The introduction of multiple steps into classifying channel dams could have been another factor influencing inconsistency between users. Participants also struggled to differentiate between dams and low permeable dams, which underscores differences in visual interpretation of imagery or the GROD decision tree among participants.
The inconsistencies noted above are limitations of the participatory approach used to generate GROD, and of participatory approaches more broadly (Theobald et al., 2015). Although the imagery was of high quality for the validation regions in France and the United States, the high variety and density of infrastructure along validation reaches likely present a near worst-case scenario in terms of interparticipant consistency. Despite these challenges, overall classification agreement remained relatively high in the validation regions (70%).
The initial validation was limited by a relatively small number (n = 13) of participants. Completion times varied, but participants usually finished in about 1 hr. Performing validation on shorter river sections would have reduced completion time and perhaps attracted more validation participants, but having each individual participant classify a diverse range of structures was informative, even if it required longer river sections and completion times.
While the number of participants was low, they were diverse in background and age, and many had never been exposed to the GROD application before participating in the validation effort. Even with variable prior experience, all participants completed the validation and showed reasonable consistency in classification and identification of instream infrastructure. From our experience, the validation results instilled confidence that participants with limited prior experience can contribute to collecting GROD data in future iterations of mapping. Finally, our project is still ongoing, and the initial validation steps will allow us to continue to improve our training and decision tree for subsequent mapping and project iterations.
As the GROD project grows, we anticipate additional participants and the potential for further differences in interpretation of the project guidelines and GEE imagery. At the same time, the GROD approach is focused on identifying instream infrastructure that is not typically included in other global-scale databases. The overall accuracy of participants in mapping different classes of infrastructure suggests that the GROD database will be useful for expanding on existing inventories. As a result, we decided to maintain a variety of different classes, leaving it to end users to group classes that remain difficult to differentiate during post hoc evaluation as needed.
Given the high variety of infrastructure and natural ambiguity in class attribution, some disagreement was expected among participants, but through our validation checks, we have demonstrated that participants can effectively identify infrastructure and agree on their classification in the majority of instances. Our approach to mapping infrastructure is well suited for broad participation and could lead to faster completion of infrastructure identification for the entire globe, along with crowd-sourced verification and validation. As a result, GROD could be an ever-updating database of instream infrastructure as new imagery is released. Critically, these updates will be tracked with version control information that is clear and readily available in figshare, our chosen data repository. Any time GROD is updated, improved, or altered, these changes will be documented in the same figshare link below, but with a new version number highlighting the alterations. This is common practice for dealing with changes and improvements in training data as outlined by the Research Data Alliance (https://www.rd-alliance.org/data-versioning-wg-final-recommendations-and-nextsteps).

Looking to the Future: Fusion of Participatory and ML Approaches
To both expand and improve GROD and similar data sets, we see great potential in merging participatory approaches, which can be accurate but rather time intensive, with ML and AI approaches that are fast but require large training and testing data sets. Training and validation data sets are vital in other fields of image recognition applications, such as facial recognition (Phillips et al., 2005). Most facial recognition software relies on so-called "labeled" data or images where faces and their expressions have been identified by people (Huang et al., 2008). The "labeled" data are used to train ML algorithms that can then encounter an unlabeled image and find the faces in it.
In a similar way, GROD participants have mapped and classified thousands of instream infrastructures. Because it includes a much broader array of riverine obstructions, GROD will be more useful than existing global-scale data sets and more likely to support ML approaches that will succeed in identifying larger (dams) and smaller (partial dams, low permeable dams, or locks) structures (sensu Sheng et al., 2008). Because our classifications are already defined by optical differences, GROD should be well suited to act as training data for ML (e.g., Craciun & Zerubia, 2013; Moranduzzo & Melgani, 2013; Tayara et al., 2017). Participants could validate predictions made by algorithms informed by GROD data and assist in areas of the globe where ML performs poorly. In addition, PLANET data (3 m resolution imagery) have been used to identify all roads and buildings in the world (George, 2019), based on OpenStreetMap training data, demonstrating that such finer-resolution imagery could assist with expanding the inventory of infrastructure captured by GROD, especially for smaller rivers. Finally, we see the potential for ML classification schemes to be applied to infrastructure already identified by GROD participants to create more consistent groupings and classifications.
We support participatory and open approaches to research and see broad possibilities for our approach and code (https://github.com/GlobalHydrologyLab/GROD) to inform and be a part of future fusions between participatory and ML methods. By developing an app-like front end to GEE, we made it possible for anyone in the world (including 13 participants from four different nations initially) to access terabytes worth of imagery, shapefiles, and other information. These data sets, combined, made it relatively straightforward for participants to scroll along rivers and identify infrastructure along the way. While our application focused on rivers, our code is easily modified for participants to hand-draw outlines of disturbances like beetle kill (Edburg et al., 2012), fire (Jones et al., 2016), or hurricane wind fall (Baumann et al., 2014) and to use these outlines to train an algorithm that would expand the scope, accuracy, and speed for identifying such disturbances globally. Finally, our application could be modified to examine any environmentally relevant parameter of interest, and we note the potential to gamify this whole experience as a way to attract hundreds to thousands more participants, like that presented in EyeWire (https://eyewire.org/explore) (Tinati et al., 2017).

Conclusion
At the global scale, our existing understanding of instream infrastructure impacts on sociohydrology and ecology is skewed toward those caused by larger infrastructure such as dams (see Grill et al., 2017; Liermann et al., 2012), but we increasingly understand that smaller infrastructure are more numerous and likely to cause greater cumulative changes to our river ecosystems, changes that remain poorly understood (Couto & Olden, 2018; Csiki & Rhoads, 2010; Januchowski-Hartley et al., 2013; Lange et al., 2019). By mapping infrastructure that does not create large reservoirs behind it, spatially explicit data sets like GROD will enable both finer-resolution analyses and better understanding of impacts caused by instream infrastructure on hydrological flow alterations (Poff & Hart, 2002), geographic range connectivity for species, particularly fishes (sensu Barbarossa et al., 2020), and important fisheries resources (sensu Carvajal-Quintero et al., 2017). Equally, GROD will enable spatially explicit scenarios and potential solutions for restoring hydrological and ecological connectivity through the remediation (fish passages and structure modifications) or removal of existing infrastructure along global river networks (sensu Hermoso et al., 2018; Neeson et al., 2015). To date, GROD includes >4,000 classified instream infrastructure along GRWL rivers in the contiguous United States and expands upon existing global-scale inventories of infrastructure. We view the approach and data generated through GROD as initial steps toward fostering the fusion of participatory and ML approaches. Such joint approaches can turn big remote sensing data sets from raw potential into global information about instream infrastructure and assist us with quantifying and monitoring how our environments are changing over time.
Once completed for the entire globe, GROD will be useful in coupling existing knowledge about river systems with new global surface water hydrology data sets derived from satellites, such as measurements of water surface elevation, inundation extent, and river discharge from the upcoming Surface Water and Ocean Topography satellite mission (Biancamaria et al., 2016).