Volume 121, Issue 11 p. 2901-2916
Research Article
Open Access

Ultrahigh-resolution mapping of peatland microform using ground-based structure from motion with multiview stereo

Jason J. Mercer

Corresponding Author

Jason Mercer and Cherie Westbrook, Centre for Hydrology, Department of Geography and Planning, University of Saskatchewan, Saskatoon, Saskatchewan, Canada

Now at Department of Botany, University of Wyoming, Laramie, Wyoming, USA

Correspondence to: C. J. Westbrook

Cherie J. Westbrook

First published: 12 October 2016


Microform is important in understanding wetland functions and processes. But collecting imagery of and mapping the physical structure of peatlands is often expensive and requires specialized equipment. We assessed the utility of coupling computer vision-based structure from motion with multiview stereo photogrammetry (SfM-MVS) and ground-based photos to map peatland topography. The SfM-MVS technique was tested on an alpine peatland in Banff National Park, Canada, and guidance was provided on minimizing errors. We found that coupling SfM-MVS with ground-based photos taken with a point and shoot camera is a viable and competitive technique for generating ultrahigh-resolution elevations (i.e., <0.01 m, mean absolute error of 0.083 m). In evaluating 100+ viable SfM-MVS data collection and processing scenarios, vegetation was found to considerably influence accuracy. Vegetation class, when accounted for, reduced absolute error by as much as 50%. The logistic flexibility of ground-based SfM-MVS paired with its high resolution, low error, and low cost makes it a research area worth developing as well as a useful addition to the wetland scientists' toolkit.

Key Points

  • Peatland microform can be accurately mapped with structure from motion and ground-based photos
  • Filtering data to remove vegetation improves accuracy of topography estimation
  • Best practices for SfM data collection and processing from ground-based photos are suggested

1 Introduction

Wetlands provide an array of important functions, making them the most valuable terrestrial ecosystem service providers on the planet [Costanza et al., 2014]. For peat-accumulating wetlands (i.e., peatlands), microform is critical to understanding hydrologic and biogeochemical functions as well as the resilience of these functions to changing environmental conditions [Belyea and Clymo, 2001; Waddington et al., 2015] like fire [Benscoter et al., 2015]. For example, developmental differences between hummocks and hollows lead to differences in their hydraulic parameters [Branham and Strack, 2014]. Such differences can influence water table dynamics, shallow groundwater flow patterns [Sumner, 2007; Van der Ploeg et al., 2012; McLaughlin et al., 2014], and carbon sequestration dynamics [Strack et al., 2008] via the provisioning of heterogeneous microenvironments [Frei et al., 2012]. These heterogeneities can also promote greater biodiversity at multiple trophic levels [Pollock et al., 1998]. The problem is that while many peatland biogeochemical, hydrologic, and ecological functions are influenced by peat surface topography, quantifying microform is challenging owing to movement of peat microforms through time [Baird et al., 2016] and a “ground” surface that consists of plants rather than soil [Luscombe et al., 2014].

A diversity of technologies has been employed to quantify peatland microform along a continuum of resolution, extent, precision, and accuracy [Bangen et al., 2014]. One of the simplest methods of measuring peatland microform is the use of measuring tapes [e.g., Nungesser, 2003]. This method is labor intensive and so best implemented at small spatial extents. For larger surveys, total stations (TS) and real-time kinematic global positioning systems (GPS) are frequently used [e.g., Hayashi and van der Kamp, 2000; Werner and Zedler, 2002]. While accuracies between TS and GPS surveys are similar, GPS has the advantage of using an absolute datum, which permits the comparison of peatland positions relative to other features, such as regional groundwater systems. GPS surveys also support a potentially large survey area. Maintaining visual contact between the theodolite and distance meter in TS can be challenging in densely vegetated peatlands. A disadvantage of both TS and GPS is that they are based on point measurements and thus are labor intensive if surveys of a large spatial extent or high point density are required to characterize the landscape. These issues can be overcome with the use of active remote sensing technologies, such as terrestrial laser scanners (TLSs) and light detection and ranging (lidar) systems. TLS and lidar have been used in an increasing number of studies to measure wetland microform [Maxa and Bolstad, 2009; Anderson et al., 2010]. However, a major disadvantage of these methods is the high cost of the equipment and required training. Thus, there is a need for a technique that is able to rapidly measure peatland microform at a high resolution and large extent, yet for relatively low cost.

Structure from motion, a multiview stereo photogrammetry technique [Westoby et al., 2012; Fonstad et al., 2013; Smith et al., 2015], hereafter SfM-MVS, is an emerging technology for bridging the gap between spatial resolution and extent. The technique is leading to an improved ability to tackle sizable research challenges in the biogeosciences, for example, hydrodynamic modeling in rivers [Javernick et al., 2014], erosional processes shaping coastal cliffs [James and Robson, 2012], and biomass carbon storage modeling [Cunliffe et al., 2016]. In some respects, SfM-MVS is similar to traditional stereo photogrammetric techniques in the sense that a passive camera system is used to collect overlapping images that are then matched to one another using tie points and can, using trigonometry, produce elevation information [Smith et al., 2015]. However, the underlying mathematical models are different. In the case of standard photogrammetry, trigonometry underlies the solution to the depth problem via the pairing of image pixels with known ground and camera locations and orientations [Konecny, 2014]. However, the robustness of this technique is dependent on having a high number of ground control points (GCPs) in each photograph [Javernick et al., 2014], which then reduces the practical viability. SfM-MVS, in contrast, is a motion problem that can be solved using factorization and probability to simultaneously determine both camera location/orientation and pixel locations (e.g., ground elevation) within pixel space [Tomasi and Kanade, 1992]. These relative pixel coordinates are transformed into real-world coordinates using a limited number of GCPs or known distances in the photos [Westoby et al., 2012; Fonstad et al., 2013; Javernick et al., 2014]. 
An increasing number of geomorphic studies that have used SfM have obtained photos using unmanned aerial vehicles (UAVs) [e.g., Woodget et al., 2015] and other aerial acquisition methods such as helicopters [Javernick et al., 2014; Dietrich, 2016]. Not only is this approach expensive, but it is also likely to become onerous to scientists and practitioners owing to increasing regulation of UAVs and limitations on where they can be employed, for example, in U.S. National Parks [Madden et al., 2015; Spence and Mengistu, 2016].

The utility of SfM-MVS to render 3-D spatial data for rivers, including submerged and riparian areas, has recently been explored [e.g., Westoby et al., 2012; Javernick et al., 2014; Woodget et al., 2015] with results similar to that of lidar [Fonstad et al., 2013]. These demonstrations of the ability of SfM-MVS to quantify the topography of complex riverine environments suggest that the technology holds promise for rendering 3-D spatial data for peatlands. Our purpose was to explore the utility of an inexpensive point and shoot camera system and SfM-MVS to generate high-resolution spatial information of peatlands. Questions we sought to answer were as follows: (1) Can oblique angle photos captured from the ground be combined with SfM-MVS techniques to generate orthophotos consistent with aerially acquired imagery? (2) Can these same photos and techniques be used to generate microtopographic data of a similar positional quality to that of a real-time kinematic (RTK) GPS survey? (3) How are errors (i.e., accuracy and bias) manifest in the elevation data derived from SfM-MVS techniques? and (4) What data collection and processing steps can be taken to reduce these errors?

2 Materials and Methods

2.1 Study Site

We studied a 1.23 ha alpine peatland in the Helen Creek watershed, Banff National Park, Canada (51.683002°N, 116.409119°W, 2362 m). We chose this peatland as we were already studying its hydrologic functions as part of the Canadian Rockies Hydrologic Observatory, which focuses on investigating hydrologic processes of alpine and subalpine ecosystems and their resilience to environmental changes. Low hummocks and occasional pools characterize the microtopography of the peatland (Figures 1a and 1b). Other significant features include strings and flarks that are most evident during high water conditions, as well as a stream that bisects the wetland. The open, herbaceous-shrub wetland cover is dominated by willows (Salix spp.), sedges (Carex spp.), cotton grass (Eriophorum spp.), and moss. Topography surrounding the area is valley shaped, providing an oblique view of the entire wetland along the lateral edges.

Figure 1. (a) The traditionally acquired aerial orthophoto of the study wetland, showing the five wells used as registration ground control points. (b) The ground-based orthophoto generated by camera scenario 2-D and the 116 camera positions (white circles) used to generate all SfM data. (c) Vegetation classes used to explore the influence of vegetation structure on SfM error.

2.2 Aerial Photo Acquisition

Parks Canada did an aerial photo survey of Banff National Park during the summer and fall of 2014. Helen Creek watershed was photographed 16 September 2014, ~11 days prior to image capture of the ground-based photos used for SfM processing. The ground resolution of the aerial orthophoto was 0.30 m, and it was used to visually compare wetland pattern with the SfM-derived orthophotos.

2.3 RTK GPS Microtopographic Survey

The wetland was surveyed using a Leica Geosystems Viva GS 15 and CS 15 RTK GPS during July–August 2014. We collected 15,286 points at an average density of 1.23 points/m². Approximately 60 person hours were required to complete the survey, discounting travel time to and from the field. To ensure data were correctly aligned in absolute space, the GPS base station point was postprocessed using Natural Resources Canada's Precise Point Positioning service [Natural Resources Canada, 2015]. The total duration of observations for the base station was 148 h, during which time it was not moved. The base station point was then used to correct the rover points in absolute space. The datum and projection used during the analysis were the Canadian Spatial Reference System's North American Datum 1983, Universal Transverse Mercator zone 11. Postprocessing mean elevation error was ±0.029 m. Elevation accuracy of the RTK GPS itself was ±0.015 m. Other positional uncertainties (e.g., pole position) were estimated at ±0.020 m [Wheaton, 2008; Anderson et al., 2010]. Assuming these are the only significant errors and that they are independent, normal, and random [Taylor, 1997; Wheaton, 2008; Bangen et al., 2014], the absolute vertical error of the survey is ±0.038 m.
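The ±0.038 m figure is consistent with combining the three stated components in quadrature, which is the usual treatment for independent random errors. The short sketch below illustrates that arithmetic only; the function and variable names are ours and are not taken from the original analysis.

```python
import math


def combine_independent_errors(*components):
    """Root-sum-of-squares of independent, random error components (same units)."""
    return math.sqrt(sum(c ** 2 for c in components))


# Component values (m) as reported in the text; the names are illustrative.
postprocessing_error = 0.029
rtk_accuracy = 0.015
pole_uncertainty = 0.020

total_vertical_error = combine_independent_errors(
    postprocessing_error, rtk_accuracy, pole_uncertainty
)
print(round(total_vertical_error, 3))  # 0.038
```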

2.4 SfM Data Collection

We took 116 photos under partly cloudy, low wind conditions using a digital, 12.1 megapixel waterproof camera (Canon PowerShot D10). The survey required ~0.75 person hours, which also included measurement of the coordinates of the camera position with the GPS. A standard distance was used from the pole tip to determine the camera offset from the GPS survey, and we later corrected for it. Photos were captured from lateral positions upslope of the wetland from an elevation of between 0 and 12 m above the valley floor in an oval pattern around the study site (Figure 1b), thus incorporating 360° of information. To account for lens distortions during the factorization process, we generated a camera-specific distortion model in Agisoft Lens (St. Petersburg, Russia) as a starting point for the final distortion model, which was field optimized. Carbonneau and Dietrich [2016] argue that the precalibrated distortion model should not be used. Instead, they suggest using the autocalibrated model, available in most SfM software, as it provides a more robust method for determining the lens parameters and results in better accuracy in the final SfM data. The bases of five water table wells (Figure 1a) being monitored as part of the larger study were used as the registration GCPs. GCPs were manually identified in each photo.

2.5 SfM Data Collection and Processing Scenarios

In total, 324 collection and processing scenarios were examined to determine their influence on orthophoto quality and elevation error. The workflow for data collection and processing is presented in Figure 2, and described below. SfM-MVS processing was performed using Agisoft PhotoScan 1.0.4 (St. Petersburg, Russia), while geoprocessing was performed in ArcGIS 10.2 and Python 2.7.3. All data were processed on a laptop running Windows 7 (64 bit) with a 2.60 GHz i5 quad-core CPU, 16 GPUs, and 16 GB of RAM.

Figure 2. Work flow for data collection and processing for the 324 scenarios that were compared to RTK GPS data.
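One way to arrive at 324 scenarios is as a full factorial of the options in the five steps described in the following subsections. The sketch below assumes three elevation statistics (minimum, mean, and maximum) and a postfilter that is either applied or not; that breakdown is our reading of the text, not an explicit statement in it, but it does reproduce the reported count.

```python
from itertools import product

# Options per step. The labels are illustrative; the elevation and postfilter
# levels are assumptions inferred from the text.
camera = ["camera positions only", "camera positions + GCPs", "GCPs only"]
prefilter = ["aggressive", "moderate", "mild"]
resolution_m = [0.05, 0.10, 0.25, 0.50, 1.00, 2.00]
elevation_stat = ["minimum", "mean", "maximum"]  # assumed
postfilter = [False, True]  # assumed: without and with global postfiltering

scenarios = list(
    product(camera, prefilter, resolution_m, elevation_stat, postfilter)
)
print(len(scenarios))  # 324
```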

2.5.1 Step 1—Camera and GCP Location

Field photos were factorized using unconstrained pair selection. Outputs from three camera position scenarios were compared to determine their influence on the resulting elevation data. Camera scenario 1 used only the camera positions to generate the microtopographic information. Camera scenario 2 used both camera positions and GCPs. Camera scenario 3 used only GCPs and no camera position information. Camera scenarios 2 and 3 represent the most and least amount of positional information used to generate SfM-MVS digital elevation models (DEMs), respectively. For camera scenarios 2 and 3, we further optimized the camera model using the five GCPs identified across the suite of photos. We found that camera scenario 1 was horizontally shifted (3.19 m in the x direction, and 4.77 m in the y direction) such that elevations were incomparable with the GPS data. A horizontal shift was used to register camera scenario 1 with the others, allowing us insight into the relative utility of the information from this camera scenario, even if not aligned in absolute space.

2.5.2 Step 2—Prefiltering

PhotoScan provides automatic point cloud filtering to remove outliers, which while attractive to practitioners has had little scientific evaluation. To determine the influence of the PhotoScan filtering schemes on the quality of output point clouds and their derivatives, we applied three prefilter scenarios to each camera scenario. These were, from most to least rigorous: “aggressive,” “moderate,” and “mild.” Additionally, prior to export from Agisoft, a 15% slope, 4 m moving window filter was used on all scenarios to remove only the most obvious point outliers (i.e., large and discontinuous surfaces).

2.5.3 Steps 3 and 4—Spatial Resolution and Elevation

To determine the impact of resolution on resultant errors and differences between SfM-MVS and GPS elevation data, the point cloud was converted to a digital elevation model under six resolution scenarios: 0.05, 0.10, 0.25, 0.50, 1.00, and 2.00 m. Point clouds were decimated (i.e., thinned) and regularized using TopCAT [Brasington et al., 2012], an ArcGIS add-on. We used a minimum of four points to populate a regularized point feature and discarded underpopulated features, which resulted in data loss of <1%. As part of the decimation process, measures of dispersion and statistical moments were calculated for each point, i.e., the minimum and maximum point elevation within each grid cell, the mean of points, relative elevation range, standard deviation, skew, and kurtosis. Detrended statistical moments were also calculated for each point using the central and eight adjacent cells, given that detrending can provide a better measure of the relative variability [Rychkov et al., 2012] where a terrain trend or vegetation may produce a noisy measurement signal.
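The regularization step can be pictured as binning points into square cells and summarizing the elevations in each cell. The sketch below is a simplified stand-in for that idea and is not TopCAT itself: the function name, the dictionary keys, and the use of the population standard deviation are all assumptions, and skew, kurtosis, and detrending are omitted for brevity.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def regularize(points, cell_size, min_points=4):
    """Bin (x, y, z) points into square cells and summarize z per cell.

    Cells holding fewer than `min_points` points are discarded.
    """
    cells = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append(z)

    summary = {}
    for key, zs in cells.items():
        if len(zs) < min_points:
            continue  # underpopulated cell
        summary[key] = {
            "n": len(zs),
            "z_min": min(zs),
            "z_max": max(zs),
            "z_mean": mean(zs),
            "z_range": max(zs) - min(zs),
            "z_sd": pstdev(zs),
        }
    return summary


# Toy data: five points share cell (0, 0); the lone point in cell (3, 3) is dropped.
points = [
    (0.01, 0.01, 1.00), (0.02, 0.03, 1.02), (0.04, 0.01, 0.98),
    (0.03, 0.04, 1.04), (0.02, 0.02, 0.96), (0.16, 0.17, 2.00),
]
grid = regularize(points, cell_size=0.05)
```

Choosing `z_min`, `z_mean`, or `z_max` from each cell then gives one elevation surface per statistic.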

2.5.4 Step 5—Global Postfiltering

To remove any additional obvious outliers (i.e., those caused by vegetation and by unknown sources), we explored the utility of global postfiltering. In this step, two processing scenarios (detrended mean elevation and relative range) were evaluated in tandem, based on the fact that both correlated well with noise in the point cloud. First, a two-sided 1 standard deviation threshold was applied to each data set, which preserves data with a detrended mean elevation close to zero. Second, a one-sided standard deviation threshold (i.e., a range threshold) was applied to log-transformed range values (absolute values) such that larger range values were discarded. This step was sensitive to sample size and when applied removed large proportions of potentially valuable data (i.e., it left large “holes” in the resulting data). To mitigate this issue we instituted a progressive standard deviation filter. A 2 standard deviation threshold was applied to the 0.05 and 0.10 m resolution data, a 1.5 standard deviation threshold was applied to the 0.25 and 0.50 m data, and a 1 standard deviation threshold was applied to the 1.00 and 2.00 m data. This generally resulted in the removal of 3–20% of data, with finer resolution data subject to larger data removal.
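A minimal sketch of how the two global filters might be combined is given below. It assumes that the two-sided band is centered on the sample mean of the detrended elevations and that the one-sided cutoff is the mean plus k standard deviations of the log-transformed range; the text does not spell out either detail, and the function name and dictionary keys are hypothetical.

```python
import math
from statistics import mean, pstdev

# Threshold multipliers (standard deviations) by grid resolution (m), as stated in the text.
RANGE_SD_THRESHOLD = {0.05: 2.0, 0.10: 2.0, 0.25: 1.5, 0.50: 1.5, 1.00: 1.0, 2.00: 1.0}


def postfilter(cells, resolution):
    """Return the cells that pass both global filters.

    `cells` is a list of dicts with hypothetical keys
    "detrended_mean" (m) and "z_range" (m, nonzero).
    """
    k = RANGE_SD_THRESHOLD[resolution]

    # Filter 1 (assumed form): two-sided 1 SD band around the mean detrended elevation.
    d = [c["detrended_mean"] for c in cells]
    d_mu, d_sd = mean(d), pstdev(d)

    # Filter 2 (assumed form): one-sided cutoff on log-transformed range.
    log_r = [math.log(abs(c["z_range"])) for c in cells]
    r_mu, r_sd = mean(log_r), pstdev(log_r)

    kept = []
    for c, lr in zip(cells, log_r):
        in_band = abs(c["detrended_mean"] - d_mu) <= d_sd
        small_range = lr <= r_mu + k * r_sd
        if in_band and small_range:
            kept.append(c)
    return kept


# Toy cells: the fifth has an outlying detrended mean, the sixth an outlying range.
cells = [
    {"detrended_mean": 0.00, "z_range": 0.05},
    {"detrended_mean": 0.01, "z_range": 0.06},
    {"detrended_mean": -0.01, "z_range": 0.05},
    {"detrended_mean": 0.00, "z_range": 0.04},
    {"detrended_mean": 0.30, "z_range": 0.05},
    {"detrended_mean": 0.00, "z_range": 5.00},
]
kept = postfilter(cells, resolution=1.00)
```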

2.5.5 Scenario Comparison and Error Analysis

To identify “best practices” for data collection and processing, we evaluated the utility of each of our five data collection and processing steps in reducing errors in the resulting SfM elevation data. The statistical dispersion metric reported is median absolute deviation (MAD), as it is a robust measure of variance that is less sensitive to outliers compared to standard deviation [Rousseeuw and Croux, 1993]. A scaling factor equivalent to 1 standard deviation when the data are normal was used (both represent the same areas under a curve). In our case the scaling factor was 1.4826. Nonparametric statistics were used as data were not normally distributed. First, a Wilcoxon test was performed on each of the 324 scenarios as a cursory test of scenario viability, i.e., identifying which scenarios produced a SfM point cloud with a similar mean to that of the GPS point cloud. Second, a Fisher's exact test was used to determine which processing steps produced data similar to the GPS survey. Third, differences among scenarios within each processing step were tested using Kruskal-Wallis followed by pairwise comparison (Nemenyi) where Kruskal-Wallis results were significant. Differences at p < 0.05 were considered to be statistically significant.
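For readers unfamiliar with the scaled MAD, the following sketch shows the calculation with the 1.4826 factor and illustrates why it is less sensitive to outliers than the standard deviation. The function name and toy values are ours.

```python
from statistics import median, pstdev

MAD_SCALE = 1.4826  # makes the MAD comparable to 1 standard deviation for normal data


def scaled_mad(values):
    """Median absolute deviation about the median, scaled by 1.4826."""
    values = list(values)
    med = median(values)
    return MAD_SCALE * median(abs(v - med) for v in values)


# Toy data with one gross outlier.
values = [1, 2, 3, 4, 100]
robust_spread = scaled_mad(values)  # driven by the four well-behaved values
classic_spread = pstdev(values)     # inflated by the outlier
```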

While lumped comparisons of spatial data can be valuable in providing a broad understanding of how error is propagated through different data processing options, spatial context can markedly improve this understanding. We thus computed the error, ε_i, between point clouds as the elevation difference between each individual point i in the GPS data and its nearest neighbor from the SfM data. Because there is inherent error associated with a given elevation data set [Lane et al., 2003; Wheaton et al., 2010b; Bangen et al., 2014], we applied a global critical threshold of 0.038 m (see section 2.3 for derivation of that value) to the resultant error surface, allowing us to better understand where significant differences in elevation occurred between the two data sets. We provide only a single SfM scenario as an example but ran this analysis on all 114 viable scenarios identified in the data collection and processing analysis. The results of the error analysis informed our vegetation analysis, described next. All statistical analyses were performed in R 3.1.2 [R Core Team, 2015].
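The point-wise comparison can be sketched as a nearest-neighbor lookup followed by a threshold test. The example below uses a brute-force search that is adequate only for toy data; a spatial index would be needed for real point clouds. The sign convention (SfM minus GPS) and all names are assumptions made for illustration.

```python
import math

CRITICAL_THRESHOLD_M = 0.038  # vertical error of the GPS survey (section 2.3)


def nearest_neighbor_errors(gps_points, sfm_points):
    """Difference each GPS elevation against its horizontally nearest SfM point.

    Returns a list of (x, y, error, is_significant) tuples.
    Assumed sign convention: SfM minus GPS, so positive means SfM is higher.
    """
    results = []
    for gx, gy, gz in gps_points:
        # Nearest SfM point by horizontal (x, y) distance.
        _, _, sz = min(
            sfm_points, key=lambda p: math.hypot(p[0] - gx, p[1] - gy)
        )
        error = sz - gz
        results.append((gx, gy, error, abs(error) > CRITICAL_THRESHOLD_M))
    return results


# Toy coordinates (m); not survey data.
gps = [(0.0, 0.0, 10.00), (1.0, 0.0, 10.00)]
sfm = [(0.01, 0.0, 10.02), (1.01, 0.0, 10.30), (5.0, 5.0, 20.00)]
errors = nearest_neighbor_errors(gps, sfm)
```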

2.5.6 Vegetation Analysis

SfM, as a passive remote sensing technique, is sensitive to physical barriers such as vegetation. We investigated whether elevation error could be reduced through considering vegetation. Ad hoc vegetation classes were mapped in a geographic information system environment (Figure 1c) using field notes, the photos used to derive SfM data, aerial photos, and the corresponding SfM-derived orthophoto (to ensure internal consistency). Mapping occurred at a scale of 1:300. We focus on the physical features of the vegetation, rather than the species, so as to make the results more generalizable to field conditions other researchers might encounter. The vegetation class metric we used was qualitative, representing vegetation height, stem type, and canopy closure. The eight vegetation classes were closed short herbaceous, closed short woody, closed tall herbaceous, closed tall woody, open herbaceous depression, open short herbaceous, open tall herbaceous, and open tall woody. The open herbaceous depression vegetation class was analogous to the open tall herbaceous type, except that it supported open water, which alters the spectral signature of pixels. A post hoc analysis was performed on the most accurate and precise collection and processing scenario identified with statistical analysis (see section 2.5.5) at six spatial resolutions. Bias and accuracy were assessed, as were their statistical differences, as outlined in section 2.5.5.
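Stratifying errors by vegetation class amounts to grouping the point-wise errors and summarizing each group separately. The sketch below reports a median offset (bias) and a scaled MAD about that offset for each class; the function name is hypothetical and the error values are invented toy numbers, not results from this study.

```python
from collections import defaultdict
from statistics import median


def summarize_by_class(records):
    """Group (vegetation_class, error_m) pairs and summarize each class.

    Returns {vegetation_class: (median_offset, scaled_mad)}.
    """
    by_class = defaultdict(list)
    for veg_class, error in records:
        by_class[veg_class].append(error)

    summary = {}
    for veg_class, errs in by_class.items():
        offset = median(errs)  # bias for this class
        mad = 1.4826 * median(abs(e - offset) for e in errs)  # spread about the bias
        summary[veg_class] = (offset, mad)
    return summary


# Invented toy errors (m) for two of the eight classes.
records = [
    ("open short herbaceous", 0.26),
    ("open short herbaceous", 0.28),
    ("open short herbaceous", 0.30),
    ("closed tall woody", 0.40),
    ("closed tall woody", 0.45),
    ("closed tall woody", 0.50),
]
class_summary = summarize_by_class(records)
```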

3 Results

3.1 Orthophoto Comparison

SfM camera scenarios 1–3 generated visually similar orthophotos both among themselves and with the aerially acquired orthophoto (Figures 1a and 1b). While the spectral characteristics (i.e., color scheme) and environmental conditions associated with the collection of the SfM and aerial orthophoto were different, the underlying topographic pattern (e.g., vegetation distribution and stream position) was essentially the same. That is, deviations between the two techniques could be attributed to differences in scale and the underlying elevation information used to geometrically correct the aerial orthophoto. The native resolution of the SfM orthophotos was less than 0.02 m, compared to the 0.30 m of the aerial photo.

3.2 GPS and SfM-MVS Surveys

The median elevation surveyed with the GPS is 2362.37 m (range: 2360.42–2366.09 m) and the MAD is 0.730 m. A 1.00 m resolution DEM derived from the GPS survey (Figure 3a) illustrates the microform of the site. SfM-MVS data were processed in ~24 h per camera scenario, using 12 GPUs. The horizontal and vertical root mean squared errors (RMSE) are 2.06 and 0.052 m for camera scenario 1, 0.03 and 0.084 m for camera scenario 2, and 0.03 and 0.082 m for camera scenario 3, respectively. After prefiltering, the point cloud densities of each scenario, although variable, were ~1 point/cm².

Figure 3. Surface microform of the alpine peatland from the same perspective but under processing scenarios: (a) 1 m DEM from the 15,286 point RTK GPS survey; (b) 0.05 m SfM-MVS-based DEM; (c) 0.25 m SfM-MVS-based DEM; (d) 1.00 m SfM-MVS-based DEM. SfM-MVS data in Figures 3b–3d are from the vegetation scenario. The images show the SfM-MVS technique did a much better job at capturing hydrologically relevant features, such as the pools (middle of the image) and the stream meanders, than did the GPS survey.

3.3 Comparison of SfM-MVS and RTK GPS Elevation Data

The Wilcoxon test indicated that 114 (35%) of the 324 SfM-MVS scenarios generated were not significantly different from the GPS elevation data, meaning they produced similar elevations. Of the scenarios found to be similar to the GPS data, none were associated with camera scenario 1 (camera position only) or with the aggressive prefilter. The camera scenarios using GCPs were statistically different from the non-GCP scenario but were not significantly different from one another. Similar findings occurred with the aggressive prefilter. Additionally, it was found that scenarios using the minimum elevation of a given resolution were less likely to generate data that were similar to those of the GPS, while the mean and maximum values were equally likely to produce elevation values similar to the GPS data. Other processing steps were equally likely to generate SfM-MVS data that were similar or dissimilar to the GPS data.

For those scenarios similar to the GPS data, point-wise individual differences were calculated to examine the bias-corrected errors between the GPS and SfM-MVS values. A visual inspection of the SfM-MVS errors (i.e., quantile-quantile plots and histograms) indicated they approximated a normal distribution. However, in an effort to not artificially constrain the error through the assumption of normality, nonparametric measures of variance and comparative techniques were used, rather than their parametric counterparts (e.g., t test and standard deviation). Figure 4 provides an illustration of observed errors, characterized using MAD for the error bars, as well as the median offset required prior to comparison with the GPS data. The minimum and maximum MAD values observed were 0.09 and 0.13 m, respectively. The minimum and maximum median offsets were 0.23 and 0.54 m, respectively.

Significant differences were found when comparing MAD and median offset values between the different processing scenarios (Table 1). Resolution scenarios were found to differ by ~0.02 m, with the 2.00 m data being significantly less accurate than all but the 1.00 m scenario. The 0.05 and 0.10 m resolution data were the most accurate, producing MAD values of 0.09 m. Similarly, the 2.00 m scenario was most different from the finer resolution data and supported the largest offset of 0.44 m. The smallest offset was associated with the 0.10 m data at 0.34 m. The most accurate elevation scenarios were those that used mean values, while the least accurate used maximum values. There was an increasing median offset from the minimum to maximum elevation scenarios, where the minimum scenario required the smallest offset.

Table 1. The SfM-MVS Collection and Processing Scenarios Found to Be Similar to the RTK GPS Data, but That Generated Statistically Significant Differences in Errors Within Their Subscenarios^a

Scenario  Subscenario  N   Median Absolute Deviation (m)  Median Offset (m)
a         0.05 m       22  0.091***(e,f)                  0.342*(e),***(f)
b         0.10 m       22  0.091***(e,f)                  0.341*(e),**(f)
c         0.25 m       20  0.094***(f)                    0.342***(f)
d         0.50 m       18  0.097***(f)                    0.346
e         1.00 m       16  0.100                          0.402
f         2.00 m       16  0.111                          0.436
g         Minimum      18  0.096                          0.314***(h,i)
h         Mean         48  0.091***(i)                    0.342***(i)
i         Maximum      18  0.097                          0.408

  • ^a Asterisks/letters are used to illustrate which subscenarios are different. Only the first pair that is different is indicated (i.e., the 5 cm resolution subscenario is statistically different from the 1 and 2 m subscenarios, but only the 5 cm subscenario is marked). The median MAD and median offset for all scenarios were 0.093 and 0.347 m, respectively.
  • * p < 0.05
  • ** p < 0.01
  • *** p << 0.01

Since there were a number of SfM-MVS scenarios that produced elevation data similar to that of the GPS survey (Figure 4), the utility of this method for quantifying microform elevation is illustrated through a comparison of histograms between the GPS data and one of the SfM-MVS scenarios, at multiple resolutions (Figure 5). The highlighted scenario utilized both GCP and camera positions (i.e., camera scenario 2), a mild prefilter, our postfilter, and the mean point cloud elevation, hereafter referred to as the “vegetation scenario.” Data from the different resolutions were colocated with the GPS data, to ensure comparable spatial scales (i.e., the SfM-MVS data were spatially joined with the GPS data according to their nearest neighbor). The SfM-MVS data were similar to the GPS data across all resolutions.

Figure 4. Errors for all SfM-MVS scenarios that generated statistically similar data to the RTK GPS survey. Bars represent MAD values (equivalent to ±1 standard deviation).

Figure 5. Histograms of the RTK GPS data and one of the SfM-MVS scenarios (i.e., the vegetation scenario) at all but the 2 m resolution. This graphic illustrates the similarity between the RTK GPS information and the SfM-MVS-derived data.

3.4 Spatial Error Analysis

To understand spatial error, SfM data from the vegetation scenario were compared to the GPS survey, but with limited processing steps (i.e., no median shift, no postfiltering) to better convey the kinds of error one might expect from point clouds (Figure 6) used directly from SfM-MVS software (Figure 7). Results show clear patterns of spatial variation in errors, suggesting the occurrence of systematic error (Figure 7). Errors were higher in the northeastern portion of the alpine peatland and lower (and often negative) in the southwest. The distinct areas of high and low error were often, but not always, associated with vegetation types (see Figure 1c). For example, two noisy “strips” of high error bisected the peatland vertically (center to left) and diagonally (lower center to upper right). Sometimes, where abrupt changes in slope occur such as at the margin of the peatland, errors were also high.

Figure 6. An example point cloud resulting from camera scenario 2 (using both camera and GCP positions) and a mild prefilter. The view is up valley, approximately to the northeast. Red dots are the same GCPs indicated in Figure 1.

Figure 7. An example of the spatial error between the GPS point elevations and the SfM-MVS surface elevations derived from the vegetation scenario. Cooler colors indicate that the SfM-MVS data overpredicted observed elevations, while warmer colors indicate that they underpredicted observed elevations.

3.5 Vegetation Influences on Error

To explore the mixed effect of vegetation and resolution on error, the vegetation scenario was again used, but with postprocessing. Both vegetation and resolution were found to significantly impact error of the SfM-MVS data, as demonstrated by distinct clustering of errors (Figure 8). Generally, the short vegetation classes were associated with the lowest MAD values. This is best represented by the short herbaceous vegetation class, which supported the lowest median MAD (0.061 m) across resolutions. However, the lowest MAD across all vegetation classes and resolutions was 0.058 m, which was associated with the closed tall woody vegetation class at a resolution of 0.10 m. The maximum MAD (0.143 m) was associated with the closed short woody vegetation class at a resolution of 2.00 m. The 2.00 m resolution data also produced the largest median error in almost all cases and tended to be statistically different from finer resolution data. Other resolutions tended to not be significantly different from one another.

Figure 8. The median offset and median absolute deviation of the vegetation scenario, sorted by vegetation class and resolution.

Some clustering was observed in the median offset. Most noticeable is the division of offsets above and below ~0.35 m. These offsets can be roughly organized between those vegetation classes that are closed or short, versus those that are tall or open. This height and openness division is reflected in the maximum (0.469 m) and minimum (0.277 m) median offsets observed. The maximum value was associated with the open herbaceous depression vegetation class and a resolution of 2.00 m. The minimum median offset, on the other hand, was observed in the open short herbaceous vegetation class at both the 0.05 and 0.10 m resolutions. The open short herbaceous vegetation class also generated an average median offset of 0.280 m across resolutions, which was the lowest of all vegetation classes.

4 Discussion

This study was motivated by the need for an inexpensive, accurate, and rapid method for quantifying wetland structure. We demonstrated that by combining photos from a ground-based point and shoot camera, limited ground control points, SfM-MVS techniques, and an average laptop computer, it is possible to generate ultrahigh-resolution and relatively low error representations of wetland microform—a representation that preserves important wetland functional features such as strings, flarks, and stream channels. Based on our systematic investigation of errors and their sources, we provide recommendations for future collection and processing of SfM-MVS data and suggest avenues for future research.

4.1 Generation of Orthophotos Using Ground-Based Images and SfM-MVS

We have shown that it is possible to use ground-based photos to generate accurate wetland orthophotos at spatial resolutions useful to wetland and other earth system scientists. Further, the resolution (0.017 m) and error (0.03 m) of the best SfM-MVS-based orthophotos we produced are generally superior to those generated by traditionally acquired aerial images, as well as those acquired via a number of other platforms. For example, investigations of wetland microtopography in the UK and Australia utilized airplane-based imagery to produce orthophotos with resolutions of 0.02–0.25 m and horizontal accuracies of 0.25 m [Knight et al., 2009; Luscombe et al., 2015]. UAVs have been the focus of much recent research and have been used to map structure of the Florida Everglades wetlands [Zweig et al., 2015] at a resolution and error of 0.05 m and 1 m, respectively. Satellite imagery has also been used to map a variety of wetland types but at resolutions and accuracies of ~1 m [e.g., Maxa and Bolstad, 2009]. Generally, the trade-off between these different platforms is manifest between resolution and spatial extent. That is, increasing camera distance to the ground results in larger extents and coarser resolution data [Brasington et al., 2012; Spence and Mengistu, 2016]. In this context, our results suggest SfM-MVS products provide an even wider relationship between extent and resolution, making these techniques particularly useful for multiresolution studies (see Brasington et al. [2012, Figure 1] for a more explicit treatment of this concept).

4.2 Performance of SfM-MVS Against GPS and Other Position Acquisition Platforms

A number of collection and processing scenarios, offset by a median value, performed very well in capturing the global microtopographic variability of the study site. The average error across all viable scenarios was 0.093 m. However, errors were reduced when resolution, measure of tendency, and vegetation class were considered. If GPS error is assumed to be the only significant inaccuracy contributing to the SfM-MVS data (a reasonable assumption as the two are being compared to one another), then the technique generated a mean absolute error of 0.083 m (range: 0.078–0.120 m) and mean RMSE of 0.116 m (range: 0.093–0.179 m) across the 100+ viable collection and processing scenarios, making SfM-MVS competitive with other techniques used to measure wetland microform (Table 2). Generally, finer resolution scenarios produced less error than their coarser counterparts. This result may be in part explained by the improved spatial congruence of the point scale measurements of the GPS and finer grid sizes. One other important factor influencing the differences in scale-dependent error is the natural microtopographic pattern of the wetland. Although we found that it is better represented at finer scales than coarser ones, other researchers have found that resolutions ranging from 0.5 to 4 m are not markedly different from one another [Moser et al., 2007].
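As an illustration of the error measures reported here, the sketch below removes a median offset from paired SfM-MVS and GPS elevations and then computes the mean absolute error and RMSE of the residuals. It is a simplified example under stated assumptions: the order of operations, the function name, and the sample elevations are ours and are not taken from the study's workflow.

```python
import math
from statistics import median


def error_metrics(sfm_z, gps_z):
    """Return (median offset, MAE, RMSE) in metres for paired elevations.

    The median offset is removed before MAE and RMSE are computed
    (an assumed order of operations for this sketch).
    """
    raw = [s - g for s, g in zip(sfm_z, gps_z)]
    offset = median(raw)
    residuals = [r - offset for r in raw]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return offset, mae, rmse


# Hypothetical paired elevations in metres.
sfm_z = [10.30, 10.50, 10.20, 10.90]
gps_z = [10.00, 10.10, 10.00, 10.40]

offset, mae, rmse = error_metrics(sfm_z, gps_z)
print(f"offset={offset:.3f} m, MAE={mae:.3f} m, RMSE={rmse:.3f} m")
```

Because RMSE weights large residuals more heavily than MAE, it is never smaller than MAE for the same residuals, which is consistent with the ordering of the two values reported above.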

Table 2. A Comparison of Resolutions, Extents, and Absolute Errors Between Studies, Using Various Collection Methods, Illustrating That Ground-Based SfM-MVS Results Are Commensurate With or Better Than a Number of Other Techniques
Wetland Habitat Type, Technique, Resolution (m), Extent (ha), Absolute Error (m) [Bias XY, Bias Z; Accuracy XY, Accuracy Z], Source
Alpine peatland, Canada SfM 0.05–2.00 1.23 0.23–0.54 0.03 0.044–0.138 This study
RTK GPS 0.81 1.23 0.038
Degraded upland peatland, UK TLS 0.01 1.00 × 10⁻² 0.01 0.025 Luscombe et al. [2015]
lidar 0.5 0.42 0.047 0.029
Ombrotrophic bog, UK TLS 0.1 7.85 × 10⁻³ 0.108–0.175 Anderson et al. [2010]
Mangroves, Australia lidar 1 20 0.7 0.15 Griffin et al. [2010]
Mangroves, Australia lidar 1 92 0.45 0.05 Knight et al. [2009]
Northern forested, USA lidar 1 6.34 × 10³ 0.30 0.15 Maxa and Bolstad [2009]
Prairie potholes, Canada TS 1–5 ~0.1 ~0.03 Hayashi and van der Kamp [2000]

The central tendency of mean elevation did not deviate much with scale. Statistical comparison of the mean with other elevation metrics confirms the utility of this tendency, as it produced the lowest overall error. Interestingly, minimum elevation did not perform as well. The three initial point clouds, prior to export from Agisoft for additional processing, displayed a noticeable amount of noise at elevations lower than what could reasonably be considered the surface of the wetland. This “subterranean” noise may be due to a number of factors, including the influence of vegetation, which could produce poor photo alignment prior to factorization. Further research is needed to understand the cause of this noise and ways to correct it.

Spatial error analysis highlighted the existence of systematic error owing to variations in the plant community. Fonstad et al. [2013] found that the presence of plants could generate systematic errors as high and low as 10.54 and −6.56 m, respectively, in river environments. We did observe significant differences between errors when comparing broader vegetation classes. For example, the minimum and maximum absolute accuracy of the general collection and processing scenarios was 0.083 and 0.104 m, respectively. Our systematic errors were surprisingly not as large as those of Fonstad and colleagues, even though the surface of a peatland consists entirely of living and dead plants [Luscombe et al., 2014]. When we accounted for vegetation class, errors decreased by as much as 50%, depending on the spatial resolution. The median offset employed was qualitative, as it accounted for cumulative variations in vegetation height and canopy openness. But the median offset also likely includes other unknown and unaccounted for sources of noise. We think that further exploration of this offset might be a fruitful avenue for future research that will hopefully lead to an ability to differentiate plant heights, such as is possible with lidar.

The findings of this study also indicate that the coupling of a point and shoot camera with SfM-MVS techniques produces DEMs commensurate with those produced via other more expensive and time-consuming technologies, such as lidar and TLS (Table 2). TLS is the only method capable of generating a comparable resolution to that of the data produced here [Luscombe et al., 2015] but requires considerably more ground-based effort (which may even disturb the microform being measured). While TLS data can produce superior results, it should be reiterated that we did not take advantage of the full point cloud, as we were interested in determining if point cloud derivatives inherent to the data structure (i.e., standard deviation) could be used to improve output DEMs. Interestingly, while point densities were similar to TLS data [Brasington et al., 2012], the level of detail we generated occurred over a spatial extent 2 orders of magnitude larger than that of other studies generating higher resolution data in wetlands [Luscombe et al., 2015]. What we have demonstrated then is that SfM-MVS can functionally overlap its active remote sensing counterparts, making it a potentially competitive alternative to TLS when simultaneously accounting for resolution, extent, and cost. An added benefit of SfM-MVS data, as illustrated in Figure 6, is the true color values associated with the data, which can help with interpretation of the point cloud.

4.3 Guiding Principles for Collecting and Processing SfM-MVS Data for Peatland Microform

4.3.1 Collection

Our results illustrate the need for ground control points within the photos when using this technique, at least when using consumer-grade camera systems. It is not surprising that there was no statistical difference between the scenarios using both ground control points and camera positions and ground control points only (camera scenarios 2 and 3), as the underlying algorithms inherent to SfM-MVS simultaneously calculate camera locations and depth. Thus, our results further validate the flexibility of this technique in natural settings and illustrate that only in situ GCPs (as opposed to camera locations) are required to create high-resolution elevation information, even in ecosystems with highly complex microform. This finding could be overridden with ultraprecise cameras, as suggested by James and Robson [2014], but that could negate the cost savings generated by this method.

We used a circular design for the photo survey that incorporates 360° of information, as recommended by Westoby et al. [2012]. Other studies that employed parallel photo capture techniques, like those used in traditional photogrammetry, have found that errors in elevation data tend to produce a “doming” pattern such that errors are systematically different in the middle than along the edges [Rosnell and Honkavaara, 2012; James and Robson, 2014; Javernick et al., 2014]. We did not experience this systematic error, but when encountered it can be corrected for by adding nonparallel photos and more precise camera calibration procedures [James and Robson, 2014].

This study used five widely distributed registration GCPs to align the data in absolute space. A minimum of three points is required to define a plane, which is the absolute minimum that can be used for creating meaningful elevation data. However, it is suggested that a larger number of registration GCPs be used in future studies, with as wide a dispersion of x-y-z information as possible, as this provides the machine vision algorithms more opportunity to further optimize the camera model and locations of the cameras and pixels. Such improved optimization opportunity would likely produce more accurate products than those created here and may even correct our observed bias. Distributing registration GCPs evenly throughout a study area also ensures that the correct rotation in absolute space is achieved [Javernick et al., 2014; Smith et al., 2014, 2015]. Points in a line are scalars of one another, which can produce outputs that are inappropriately rotated. This may have caused our first camera scenario to not align properly with the GPS data, though it was more likely an issue with the camera model.
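The geometric point about collinear GCPs can be checked before fieldwork or processing. The sketch below is a hypothetical helper rather than part of any SfM-MVS package: it tests whether a set of registration GCPs contains at least one well-spread triangle, on the reasoning that three non-collinear points are needed to fix a plane and hence the rotation. The function names and the area threshold (`min_area_m2`) are illustrative assumptions.

```python
import math
from itertools import combinations


def triangle_area(p, q, r):
    """Area (m^2) of the triangle formed by three (x, y, z) points, via the cross product."""
    ux, uy, uz = (q[i] - p[i] for i in range(3))
    vx, vy, vz = (r[i] - p[i] for i in range(3))
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def gcps_constrain_rotation(gcps, min_area_m2=1.0):
    """Return True if some triple of GCPs spans a triangle of at least min_area_m2.

    Collinear (or nearly collinear) GCPs give a near-zero area for every triple,
    leaving rotation about the line they share unconstrained.
    The threshold is an arbitrary value chosen for illustration.
    """
    if len(gcps) < 3:
        return False
    largest = max(triangle_area(*triple) for triple in combinations(gcps, 3))
    return largest >= min_area_m2


# Hypothetical GCP coordinates in metres.
collinear_gcps = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 3, 0)]
spread_gcps = [(0, 0, 0), (50, 0, 1), (0, 50, 2), (50, 50, 0.5), (25, 25, 1)]

print(gcps_constrain_rotation(collinear_gcps))
print(gcps_constrain_rotation(spread_gcps))
```

A check of this kind addresses only the rotation ambiguity; it says nothing about camera model quality, which the text above identifies as the more likely cause of the misalignment in the first camera scenario.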

The reliance on a large number of verification survey points (15,286 in our case) to correct our elevations in absolute space could negate the cost savings of this technique (even if the resulting product has a much higher resolution). To determine the number of verification survey points required to generate the kinds of precision observed in this study, in absolute terms, we explored the relationship between the number of verification points and the stability of the median offset utilized in each scenario that generated viable elevation data.

We found that the median offset stabilized in all cases. In general, fewer than 300 verification points were required to reach stability within 1 cm of our results (Figure 9). We did not explore the spatial dependence (if any) associated with the distribution of these survey points. It may also be that bias (i.e., offset) in our data is due to a limited number of registration GCPs, which could influence both the factorization of the photos and camera model calibration. Future research could increase the number of registration GCPs to provide more insights into the influence of these points on error.

Figure 9. The number of verification survey points needed to produce absolute elevations within 1 cm of the offsets obtained in this study. Data included here are for all the elevation and resolution scenarios that were statistically similar to the RTK GPS data.
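One way to examine how many verification points are needed for a stable offset is to draw the points in random order and track the running median. The sketch below is an assumed procedure, not necessarily the one used to produce Figure 9: it reports the smallest sample size beyond which the running median stays within a tolerance of the full-sample median. The function name, the random ordering, and the synthetic errors are all hypothetical.

```python
import bisect
import random
from statistics import median


def points_to_stability(errors, tolerance_m=0.01, seed=0):
    """Smallest n such that the median of the first m points (for every m >= n)
    is within tolerance_m of the median of all points, for one random ordering."""
    rng = random.Random(seed)
    shuffled = list(errors)
    rng.shuffle(shuffled)
    target = median(shuffled)

    sorted_so_far = []
    needed = 1
    for n, err in enumerate(shuffled, start=1):
        bisect.insort(sorted_so_far, err)  # keep the running sample sorted
        mid = n // 2
        if n % 2:
            running = sorted_so_far[mid]
        else:
            running = (sorted_so_far[mid - 1] + sorted_so_far[mid]) / 2
        if abs(running - target) > tolerance_m:
            needed = n + 1  # stability can only begin after this sample size
    return needed


# Synthetic errors in metres (hypothetical: mean 0.35 m, standard deviation 0.10 m).
data_rng = random.Random(42)
errors = [data_rng.gauss(0.35, 0.10) for _ in range(5000)]

n_needed = points_to_stability(errors, tolerance_m=0.01, seed=0)
print(f"Running median stays within 1 cm after {n_needed} points")
```

The result depends on the random ordering, so repeating the calculation over many seeds would give a distribution of required sample sizes rather than a single value. As noted above, this treatment ignores any spatial dependence in the survey points.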

4.3.2 Processing

A number of filtering schemes were performed to determine their impact on the errors associated with measuring wetland microform. Only Agisoft's “aggressive” prefilter produced poor results. This filter generated large holes in the output data sets that then had to be heavily interpolated, resulting in significant deviations between the SfM-MVS and GPS data. Interestingly, we also found that the filtering scheme we devised produced data that were statistically similar to the data that were not prefiltered. While it is not generally a good idea to blindly use black box methods for processing data, from a practical perspective a mild or moderate amount of filtering may be reasonable in some circumstances. Much opportunity exists in creating filtering schemes that are better suited to these types of data.

We found that the mean point cloud elevation most efficiently captured wetland microform at a number of resolutions. At this early point in the development of this technique, we recommend using the mean or similar measures of central tendency (e.g., median) to describe wetland elevation when mesh decimation is desired, such as the case when error assessment must be performed without independent measures of elevation [Wheaton et al., 2010a]. In future iterations of this research, improvements may occur such that minimum values better represent wetland microtopography, which should provide a way of resolving vegetation height from maximum values. The current implementation of this method, though, suggests other measures of tendency are better suited to represent wetland microform.
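The decimation step described here amounts to binning the point cloud into grid cells and reducing the elevations in each cell to a single value. The following minimal sketch assumes a simple square grid; the function name, cell indexing, and sample points are hypothetical and do not reproduce the processing chain used in this study.

```python
import math
from collections import defaultdict
from statistics import mean, median


def grid_point_cloud(points, cell_size_m, reducer=mean):
    """Bin (x, y, z) points into square cells and reduce each cell's z values.

    Returns a dict mapping (column, row) cell indices to one elevation per cell.
    """
    cells = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size_m), math.floor(y / cell_size_m))
        cells[key].append(z)
    return {key: reducer(zs) for key, zs in cells.items()}


# Hypothetical points in metres; the 0.2 m elevation mimics low-lying noise.
points = [
    (0.1, 0.1, 1.0),
    (0.2, 0.3, 1.2),
    (0.4, 0.4, 0.2),
    (1.5, 0.2, 2.0),
]

dem_mean = grid_point_cloud(points, cell_size_m=1.0, reducer=mean)
dem_median = grid_point_cloud(points, cell_size_m=1.0, reducer=median)
dem_min = grid_point_cloud(points, cell_size_m=1.0, reducer=min)

print(dem_mean[(0, 0)], dem_median[(0, 0)], dem_min[(0, 0)])
```

In this toy cell, a single low point fully determines the minimum but only partly shifts the mean and leaves the median unchanged, which illustrates why noise below the wetland surface would penalize minimum-based surfaces more than those built from measures of central tendency.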

4.3.3 Other Considerations

There are a number of other parameters and circumstances that future research should focus on in an effort to reduce error and extract additional structural information of wetlands. For example, we did not explore the influence of photo number, quality, pixel resolution, or radiometric resolution, or different photo position schemes (e.g., angle of photo relative to the surface of interest). The influence of environmental setting, in particular the influence of wind, lighting, and water, could generate errors during the pixel-matching phase of this technique. For example, wind may move plants, especially leaves, between photos making the location of the pixels associated with those leaves less certain, potentially distorting areas in their vicinity. Changes in lighting during the photo collection phase could disrupt the hue, value, and chroma used to assist in the feature transform [Lowe, 2004], especially on water, where reflections of adjacent topography do not align well at multiple angles (i.e., Fresnel reflection).

4.4 UAVs Are Not Always Necessary

There has been a surge of interest in the use of UAVs to capture structural information of landscapes, including wetlands [e.g., Madden et al., 2015; Spence and Mengistu, 2016; Zweig et al., 2015]. SfM-MVS is a natural ally in this endeavor as it provides a means of extracting additional, and extremely useful, elevation data from photos obtained using UAVs. However, we have shown here that aerial systems are not always necessary in the generation of either of these data. In our case, the natural topography of the area provided the perspective required to generate wetland structural data from oblique photos. Conceivably, other oblique sampling schemes could be utilized in situations with less topographic relief (e.g., ladders and kites).

5 Conclusions

This paper represents a first step toward the edge of a data-driven revolution that will generate new understanding of structure, function, and process at spatial and temporal scales not yet considered in peatlands or other ecosystems. While there is still much to understand related to the generation of SfM-MVS-based point clouds, we have shown here that coupling SfM-MVS with ground-based photos taken with a point and shoot camera is a viable and competitive technique for generating ultrahigh-resolution maps of peatlands. These maps have accuracies and extents comparable to or better than their relatively expensive counterparts, such as lidar and TLS. Further, we tried 300+ different processing combinations, and of those 100+ provided reasonable topographies. But for this environment, we found that a filtering process to remove vegetation provided the best means to map the underlying topography. Applying a vegetation class filter, which considers plant height and canopy openness, improved accuracy by as much as 50%.

There exists considerable potential for using the elevation and spectral information provided by this ground-based SfM-MVS technique to produce products in addition to orthophotos and DEMs that would be useful to the study of wetland ecosystem structure and function. The underlying data structure of the orthophotos is a point cloud located in three-dimensional space, which is then stacked to create a two dimensional image. Further, the underlying machine vision algorithms most commonly utilized by SfM-MVS techniques, such as scale-invariant feature transform [Lowe, 1999, 2004], do not require spectral data from the visible band. Infrared, or other spectral frequencies, could be used alone or with visible light data to generate point clouds that are then used to infer other information or potentially further constrain and improve outputs. Thus, the raw product of SfM-MVS can be considered a three-dimensional point cloud coregistered with its underlying spectral data. With future innovations, such information might be used to independently extract specific pieces of information for peatlands, like moss elevation, vascular structure, or thermal profiles.


This research was supported by grants from the Global Institute for Water Security, the Natural Sciences and Engineering Research Council of Canada (Discovery Grant RGPIN 32837-20), the Canadian Foundation for Innovation, and the Society of Wetland Scientists student research grants program. The thoughtful reviews of James Dietrich and an anonymous reviewer greatly improved this manuscript. We thank Banff National Park for provision of aerial photos and field logistical support. Site photographs and GPS data are available through emailing the corresponding author.