# The scaling exponent of residual nonwetting phase cluster size distributions in porous media

## Abstract

During an imbibition process in two-phase subsurface flow the imbibing phase can displace the nonwetting phase up to an endpoint at which a residual saturation is reached (which cannot be reduced further by additional wetting phase flow due to the complex pore network of the rock and associated strong capillary forces which trap the nonwetting phase). The residual nonwetting phase is split into many disconnected clusters of different sizes. This size distribution is of key importance, for instance, in the context of hydrocarbon recovery, contaminant transport, or CO_{2} geostorage; and it is well established that this size distribution follows a power law. However, there is significant uncertainty associated with the exact value of the distribution exponent *τ*, which mathematically describes the size distribution. To reduce this uncertainty and to better constrain *τ*, we analyzed a representative experimental data set with mathematically rigorous methods, and we demonstrate that *τ* is substantially smaller (≈1.1) than previously suggested. This raises increasing doubt that simple percolation models can accurately predict subsurface fluid flow behavior; and this has serious consequences for subsurface flow processes: hydrocarbon recovery is easier than predicted, but CO_{2} geostorage dissolution trapping capacities are significantly reduced and potential remobilization of residual CO_{2} is more likely than previously believed.

## Key Points

- The residual cluster size distribution exponent tau is substantially smaller (1.1) than previously suggested
- Hydrocarbon recovery is easier than predicted, but CO_{2} geostorage dissolution trapping capacities are significantly reduced
- This raises increasing doubt that percolation models can accurately predict subsurface fluid flow behavior

## 1 Introduction

A phenomenon of key importance in multiphase flow through geological porous media is the formation of a residual phase when one phase is displaced by another immiscible (or partially miscible) phase. The characteristics of such a residual phase determine CO_{2} geostorage capacities (*residual trapping* [*Suekane et al.*, 2008; *Pentland et al.*, 2011; *Iglauer et al.*, 2011]), oil and gas recovery efficiency (this is the volume of hydrocarbon which cannot be produced by conventional means [*Lake*, 2010; *Iglauer et al.*, 2010, 2013]), and contaminant transport and clean-up of nonaqueous liquids from soil (e.g., spilled organic solvents or crude oil) [*Sleep and McClure*, 2001]. Apart from saturation, the most important property of the residual phase is its cluster (droplet) size distribution inside the pore network of the rock.

Many researchers have measured this cluster size distribution, initially by solidifying injected styrene monomers and dissolving rock [*Chatzis et al.*, 1983], and nowadays by X-ray microcomputed tomography imaging [*Iglauer et al.*, 2010, 2011, 2012, 2013, 2014; *Chaudhary et al.*, 2013; *Georgiadis et al.*, 2013; *Geistlinger and Mohammadian*, 2015; *Andrew et al.*, 2014; *Prodanovic et al.*, 2007; *Kumar et al.*, 2009, 2012; *Karpyn et al.*, 2010]. Most researchers reported that the cluster size distribution follows a power law *N*(*s*) ∼ *s*^{−τ}, where *N*(*s*) is the count number of clusters of size *s* and *τ* is the scaling exponent (reported values for *τ* ranged between 1.8 and 2.3). This scaling exponent is physically of high significance: A smaller *τ* implies overall larger residual droplets, which are easier to mobilize [*Herring et al.*, 2013; *Wardlaw and Li*, 1988] and thus lead to higher oil recoveries in the context of hydrocarbon production or solvent recovery from contaminated soil, particularly when chemically enhanced methods are used [e.g., *Iglauer et al.*, 2010]. However, larger drops also pose a higher risk for CO_{2} geostorage schemes, as remobilization of residual CO_{2} is more likely in case of larger drops. Moreover, larger drops also have a smaller surface-to-volume ratio, which leads to smaller CO_{2}-dissolution trapping capacities [*Riaz et al.*, 2006; *Pentland et al.*, 2012] and less mass transfer between the solvent phase and the aqueous phase in case of soil contamination. The *τ* values in the literature, however, have a large uncertainty, mainly due to the methods used to extract *τ*.

We will show here—by using a mathematically rigorous statistical analysis—that *τ* is substantially lower than previously concluded (≈1.1). This raises increasing doubt that simple percolation models can accurately predict subsurface flow behavior. Furthermore, as mentioned above, a smaller *τ* has serious implications: hydrocarbon recovery is more efficient than predicted, but CO_{2} geostorage is more risky.

## 2 Cluster Size Distribution

The rock sample was imaged at high resolution, (3.5 μm)^{3}, and the raw image was filtered with a nonlocal means filter [*Buades et al.*, 2005] and segmented with a watershed algorithm [*Schlüter et al.*, 2014]. Specifically, homogeneous and clean Bentheimer sandstone (porosity 23.0%; brine permeability 1700 mD; medium grain size 180 μm; pore size range from 1 to 340 μm [*Al-Yaseri et al.*, 2015; *Maloney et al.*, 1990]) was completely filled with brine, and oil (n-decane) was injected at a high capillary number (3 × 10^{−4}) so that an initial oil saturation of 68.5% was achieved. This mimics an oil reservoir. Subsequently, we flooded the sandstone plug with brine at a low capillary number of 7 × 10^{−7} (which is representative of subsurface conditions) until no more oil was produced and the residual oil saturation state was achieved. The detailed experimental procedure is described elsewhere [*Iglauer et al.*, 2010, 2012, 2014].

Our experimental data are as follows: We observed *n* = 3240 individual (disconnected) oil clusters, among which different cluster volumes occurred (the cluster volume *s* is given in voxels; note that here 1 voxel = (3.5 μm)^{3} = 42.875 μm^{3}). That means that among the 3240 individually observed clusters there were, for instance, 60 clusters with a volume *s* = 9 voxels, so *N*(9) = 60. The normalized frequencies are *y*(*s*) = *N*(*s*)/*n*. The count number *N*(*s*) of clusters of volume *s* drops rapidly with increasing *s*, and it is thus plausible to assume a power law relation. Another way to visually check the possibility of a power law relation is to generate a logarithmic plot of these frequencies or volume counts, respectively; if such a graph reveals an approximately linear relation, a power law is assumed, see Figure 1 (right). The power law relation is usually expressed as *N*(*s*) ∼ *s*^{−τ}, see above. One key task is to derive the exponent *τ*. In the next sections we will thus estimate *τ* with rigorous and nonrigorous statistical techniques.
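The bookkeeping described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the normalized frequency is taken to be *N*(*s*) divided by the total number of clusters, and the toy volumes are invented rather than taken from the measured data set.

```python
from collections import Counter
import math

VOXEL_EDGE_UM = 3.5  # voxel edge length in micrometres, as stated in the text


def count_numbers(volumes):
    """Return {s: N(s)}, the number of clusters with volume s (in voxels)."""
    return dict(Counter(volumes))


def normalized_frequencies(counts):
    """Divide each count N(s) by the total number of clusters (assumed definition)."""
    n = sum(counts.values())
    return {s: c / n for s, c in counts.items()}


def log_log_points(counts):
    """Return (log10 s, log10 N(s)) pairs for a visual power-law check."""
    return [(math.log10(s), math.log10(c)) for s, c in sorted(counts.items())]


def voxels_to_um3(s):
    """Convert a cluster volume from voxels to cubic micrometres."""
    return s * VOXEL_EDGE_UM ** 3


# Hypothetical toy volumes in voxels, not the measured data set.
toy_volumes = [1, 1, 1, 1, 2, 2, 3, 9, 9, 40]
toy_counts = count_numbers(toy_volumes)
toy_freq = normalized_frequencies(toy_counts)
```

If the points returned by `log_log_points` fall roughly on a straight line, a power law is a plausible candidate, which is the visual check described above and not yet an estimate of *τ*.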

## 3 Statistical Analysis

The power law relation for the count number, *N*, is usually written as *N* ∼ *s*^{−τ}, i.e., a power law multiplied with a scalar. What is more, the range on which the power law holds is specified by its lower bound *s*_{min} and its upper bound *s*_{max}; i.e.,

$$f(s) = \alpha\, s^{-\tau}, \qquad s_{\min} \le s \le s_{\max}. \tag{1}$$

In case of a continuous model, *f*(*s*) in (1) represents a density function and the normalizing constant thus becomes

$$\alpha = \left( \int_{s_{\min}}^{s_{\max}} s^{-\tau}\, \mathrm{d}s \right)^{-1}.$$

As the residual clusters have specific volumes (number of voxels), they are discrete, *s* ∈ ℕ, and in case of a discrete model, the function *f*(*s*) in (1) represents the probabilities and the normalizing constant *α* is thus

$$\alpha = \left( \sum_{s=s_{\min}}^{s_{\max}} s^{-\tau} \right)^{-1}.$$

Now with *ζ*(*τ*) being the (famous) Riemann zeta-function and defining the generalized zeta-function [see, e.g., *Clauset et al.*, 2009] as

$$\zeta(\tau, a) = \sum_{k=0}^{\infty} (k + a)^{-\tau},$$

the normalizing constant of a discrete power law can be written as

$$\alpha = \frac{1}{\zeta(\tau, s_{\min}) - \zeta(\tau, s_{\max} + 1)}.$$

This last equation allows a convenient calculation of *α* by mathematical software packages (those which have implemented an evaluation routine of the Riemann zeta-function) when *s*_{max} is particularly large, or when *s*_{max} = ∞. Further, we denote the cumulative distribution function (CDF) of a power law as *F*(*s*).
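For a bounded range, the normalizing constant of the discrete power law can also be obtained by direct summation, without a zeta-function routine. The following Python sketch illustrates this under the assumption that the bounds are called `s_min` and `s_max` (names chosen here for illustration); for an unbounded range the direct sum is not available and a zeta-function routine would be needed instead.

```python
def normalizing_constant(tau, s_min, s_max):
    """alpha = 1 / sum of s**(-tau) over s_min..s_max (bounded discrete power law)."""
    return 1.0 / sum(s ** -tau for s in range(s_min, s_max + 1))


def pmf(s, tau, s_min, s_max):
    """Probability f(s) = alpha * s**(-tau) inside the range, zero outside."""
    if s < s_min or s > s_max:
        return 0.0
    return normalizing_constant(tau, s_min, s_max) * s ** -tau


def cdf(s, tau, s_min, s_max):
    """F(s) = Prob(volume <= s), obtained by summing the probabilities up to s."""
    alpha = normalizing_constant(tau, s_min, s_max)
    upper = min(s, s_max)
    return alpha * sum(r ** -tau for r in range(s_min, upper + 1))
```

Recomputing the sum on every call keeps the sketch short; for repeated evaluation one would cache the constant.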

If we consider a continuous power law variable and the discrete variable *N* which is derived by rounding its values to the nearest integer number, then, using *F*(*s* ± 0.5), it is straightforward to approximate the probabilities of *N* as follows:

$$\mathrm{Prob}(N = s) = F(s + 0.5) - F(s - 0.5) = \alpha\, s^{-\tau} \cdot \mathrm{factor}.$$

Now if, e.g., *τ* = 1.1 and *s* ≥ 10, we obtain 1 ≤ factor ≤ 1.001, which means that *N*, too, follows (approximately) a power law having the same scaling exponent *τ*: *N* ∼ *s*^{−τ}. We therefore limit our discussion to discrete power laws.
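The bound on the factor quoted above can be checked numerically. The sketch below assumes that the factor is the ratio between the probability mass a continuous power law assigns to the interval [*s* − 0.5, *s* + 0.5] and the pure power law term *s*^{−τ}; the normalizing constant cancels in this ratio. The function name is hypothetical.

```python
def rounding_factor(s, tau):
    """Ratio of the integral of x**(-tau) over [s - 0.5, s + 0.5] to s**(-tau)."""
    if tau == 1.0:
        raise ValueError("tau = 1 needs the logarithmic antiderivative")
    # Antiderivative of x**(-tau) is x**(1 - tau) / (1 - tau).
    integral = ((s + 0.5) ** (1.0 - tau) - (s - 0.5) ** (1.0 - tau)) / (1.0 - tau)
    return integral / s ** -tau


# Factors for tau = 1.1 and cluster volumes s = 10, ..., 1000.
factors = [rounding_factor(s, 1.1) for s in range(10, 1001)]
within_bounds = all(1.0 <= f <= 1.001 for f in factors)
```

The factor is at least 1 because *s*^{−τ} is convex, so its average over an interval is never below its value at the midpoint.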

### 3.1 Summation Methods

Two methods for estimating *τ* have been proposed and have frequently been used in the literature [see *Dias and Wilkinson*, 1986]. We have called them summation methods, because they involve sums of empirical data as approximations of cumulative functions; both methods are in essence linear regressions. The first one is based on an approximation of the cumulative probability *P*(*s*) = Prob(*N* ≥ *s*), i.e.,

$$P(s) = \sum_{r \ge s} f(r).$$

It is argued in *Dias and Wilkinson* [1986, (3.6)] that taking the sum

$$\hat{P}(s) = \sum_{r=s}^{2s-1} y(r)$$

of the observed frequencies *y*(*r*) approximates *P*(*s*). For this reason $\hat{P}(s)$ should scale as *s*^{1 − τ} [*Dias and Wilkinson*, 1986, (3.6) and (3.7)]. The exponent *τ* is then derived by fitting a straight line through a logarithmic plot of $\hat{P}(s)$. However, since the frequencies *y*(*r*) are summed up to just 2*s* − 1 we always have $\hat{P}(s) \le \sum_{r \ge s} y(r)$; hence, this method has a tendency to systematically overestimate *τ*. In addition, the range in which the straight-line fit is applied depends on the researcher's discretion. For instance, with the typical data we are examining here, applying the fit based on $\hat{P}(s)$ in the range of *s* = 38,…,71 leads to one estimate of *τ*, whereas the range *s* = 38,…,81 leads to a different one.
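The mechanics of the first summation method, and its dependence on the chosen fit range, can be sketched as follows. This is a minimal illustration, assuming the window sum runs over *r* = *s*,…,2*s* − 1 as the text indicates; the function names are hypothetical, and the input is an idealized set of frequencies proportional to *s*^{−2.5}, not the measured data.

```python
import math


def window_sum(freq, s):
    """Sum the frequencies y(r) for r = s, ..., 2s - 1."""
    return sum(freq.get(r, 0.0) for r in range(s, 2 * s))


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def tau_from_window_sums(freq, s_lo, s_hi):
    """Fit log(window sum) against log(s); the slope is read as 1 - tau."""
    xs = [math.log(s) for s in range(s_lo, s_hi + 1)]
    ys = [math.log(window_sum(freq, s)) for s in range(s_lo, s_hi + 1)]
    return 1.0 - slope(xs, ys)


# Idealized frequencies proportional to s**(-2.5); hypothetical, not measured data.
ideal = {s: s ** -2.5 for s in range(1, 200)}
tau_a = tau_from_window_sums(ideal, 8, 40)  # one choice of fit range
tau_b = tau_from_window_sums(ideal, 8, 30)  # a different choice of fit range
```

Even with noise-free input the two fit ranges return slightly different estimates, which illustrates why the result depends on the researcher's discretion.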

We discuss the second method from *Dias and Wilkinson* [1986] in more detail, as it has been a standard approach in the literature. It works similarly, but this time the function to be approximated is

$$M(s) = \sum_{r=s}^{s_{\max}} r\, f(r). \tag{3}$$

In *Dias and Wilkinson* [1986] a power law with unbounded domain (upper bound *s*_{max} = ∞) is considered; in such a case (3) exists if *τ* > 2 and it can then be shown that *M*(*s*) scales as *s*^{−τ + 2}, i.e., *M* ∼ *s*^{−τ + 2}. Either for a power law with bounded or unbounded domain, the idea developed in *Dias and Wilkinson* [1986] is to take

$$\hat{M}(s) = \sum_{r \ge s} r\, y(r) \tag{4}$$

as an approximation of *M*(*s*). Then $\hat{M}(s)$ should also scale as *s*^{−τ + 2} [see *Dias and Wilkinson*, 1986, (3.8) and (3.9)]. However, this approach contains the assumption that the scaling exponent is known to be larger than 2, because, as one can see, both functions *M*(*s*) and $\hat{M}(s)$ are monotonically decreasing, whereas clearly *s*^{−τ + 2} can only be a decreasing function for *τ* > 2. Therefore, this method is a priori disconnected from empirical data; it is merely a way of defining *τ* to be larger than 2, rather than estimating it from actual observations. Applying this method to our experimental data, Figure 2 shows (a) the typical graph of applying a linear regression to the logarithm of $\hat{M}(s)$; carrying out this regression in the range of *s* = 1,…,50 yields the estimated scaling parameter $\hat{\tau}$; and (b) the cumulative distribution functions, which reveal that the fit does not match the experimental data.

We illustrate this strong bias further through two examples. In the first example we analyze synthetic data, for which we know the exact scaling exponent, and in the second example we review historical data sets from *Iglauer et al.* [2010].

**Example 1.** *Consider a discrete power law as in (*1*) on a bounded range. We have sampled 3200 data from power laws with various scaling exponents* *τ* *and produced an estimate of* *τ* *via the method based on (*4*) with a range of* *s* = 8,…,40 *for the straight-line fit, see Table* 1.

Original *τ* | Range of Straight-Line Fit | Estimated *τ* | Relative Error |
---|---|---|---|
3.0 | [8,40] | 3.08 | 2.6% |
2.5 | [8,40] | 2.55 | 2.1% |
1.9 | [8,40] | 2.18 | 15.0% |
1.5 | [8,40] | 2.07 | 38.0% |
1.1 | [8,40] | 2.02 | 84.1% |

^{a}We have estimated *τ* via the method based on (4). The results reveal the method's bias toward *τ* > 2.

**Example 2.** *Two sandstones (Clashach and Doddington) have been investigated in Iglauer et al. [*2010*], and a scaling exponent was derived for both samples via the estimation based on (*4*). Plotting the cumulative probabilities, it is obvious that the result is strongly biased, see Figure* 3.

We thus conclude that obtaining an estimate of the scaling parameter by assuming that the sum in (4) scales as *s*^{−τ + 2} is strongly biased. Finally, since both methods from *Dias and Wilkinson* [1986] are linear regressions, we also mention the discussion of regression pitfalls in *Clauset et al.* [2009, Appendix A].
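The built-in bias of the second summation method can be reproduced without any sampling noise. The sketch below assumes that the method sums *r* times the frequency over all *r* ≥ *s* (the form a monotonically decreasing sum requires) and uses exact power law probabilities on a hypothetical range of 1 to 1200; neither the function names nor the range are taken from the source.

```python
import math


def tail_moment(prob, s, s_max):
    """Sum r * prob[r] for r = s, ..., s_max (a decreasing function of s)."""
    return sum(r * prob[r] for r in range(s, s_max + 1))


def fit_tau_second_method(prob, s_lo, s_hi, s_max):
    """Regress log(tail moment) on log(s) and read the slope as 2 - tau."""
    xs = [math.log(s) for s in range(s_lo, s_hi + 1)]
    ys = [math.log(tail_moment(prob, s, s_max)) for s in range(s_lo, s_hi + 1)]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 2.0 - slope


S_MAX = 1200  # hypothetical upper bound chosen for illustration
estimates = {}
for true_tau in (3.0, 1.5, 1.1):
    # Exact probabilities of a discrete power law on 1..S_MAX.
    weights = {s: s ** -true_tau for s in range(1, S_MAX + 1)}
    total = sum(weights.values())
    prob = {s: w / total for s, w in weights.items()}
    estimates[true_tau] = fit_tau_second_method(prob, 8, 40, S_MAX)
```

Because the tail sum can only decrease with *s*, the fitted slope is negative and every entry of `estimates` lies above 2, whatever the true exponent was.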

### 3.2 Maximum Likelihood Estimation

The maximum likelihood estimator (m.l.e.) of *τ* is consistent, and in addition, if an efficient estimator of *τ* exists, the method of maximum likelihood will generate it [*Lindgren*, 1993, chapter 8.6]. In a nutshell, the maximum likelihood method consists of determining that parameter *τ* which maximizes the log-likelihood of the occurrence of the experimental observations [see, e.g., *Lindgren*, 1993, chapter 8.4]. Let the observations *s*_{1} ≤ *s*_{2} ≤ … ≤ *s*_{n} be sorted in ascending order, let *m* be the number of observations in the range *s*_{min} ≤ *s* ≤ *s*_{max}, and let *i*_{m} be the smallest index such that $s_{i_m} \ge s_{\min}$. Then the log-likelihood function for a power law (1) is

$$\mathcal{L}(\tau) = m \ln \alpha - \tau \sum_{i=i_m}^{i_m+m-1} \ln s_i, \tag{5}$$

where the normalizing constant *α* depends on *τ*. Hence, the m.l.e. $\hat{\tau}$ of *τ* is that number which maximizes $\mathcal{L}(\tau)$, and it can be determined by numerically maximizing (5). Note that there are two additional parameters, the lower bound *s*_{min} and the upper bound *s*_{max}, specifying the range in which the data might follow a power law. As we have seen in Figure 1, the tail observations are very scarce due to experimental limitations (the observed space is finite). For this reason one might overestimate the true *τ* (because a larger scaling exponent implies less data in the tail). Thus, the parameters *s*_{min} and *s*_{max} have to be estimated separately.
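The numerical maximization of the log-likelihood can be sketched in plain Python. This is an illustration under stated assumptions: the data are passed as a dictionary of counts per cluster volume, the bounds `s_min` and `s_max` are fixed in advance, and a golden-section search is used as one possible maximizer (the source does not say which numerical routine was applied). The toy counts are idealized and are not the measured data.

```python
import math


def log_likelihood(tau, counts, s_min, s_max):
    """Log-likelihood of a bounded discrete power law; counts maps s -> N(s)."""
    z = sum(s ** -tau for s in range(s_min, s_max + 1))  # 1 / alpha
    m = sum(c for s, c in counts.items() if s_min <= s <= s_max)
    log_sum = sum(c * math.log(s) for s, c in counts.items() if s_min <= s <= s_max)
    return -m * math.log(z) - tau * log_sum


def mle_tau(counts, s_min, s_max, lo=0.5, hi=4.0, tol=1e-6):
    """Golden-section search for the tau that maximizes the log-likelihood."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc = log_likelihood(c, counts, s_min, s_max)
    fd = log_likelihood(d, counts, s_min, s_max)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = log_likelihood(c, counts, s_min, s_max)
        else:  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = log_likelihood(d, counts, s_min, s_max)
    return 0.5 * (a + b)


# Hypothetical counts proportional to s**(-1.1) on [8, 120], not the measured data.
toy = {s: round(1e5 * s ** -1.1) for s in range(8, 121)}
tau_hat = mle_tau(toy, 8, 120)
```

The search is safe here because the log-likelihood of a power law is concave in *τ*, so it has a single maximum inside the bracket.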

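The method proposed in *Clauset et al.* [2009] in order to estimate the lower bound is to take that value which minimizes the Kolmogorov-Smirnov (KS) statistic [see, e.g., *Lindgren*, 1993, p. 480]

$$D_m = \max_{s} \left| S_m(s) - F(s) \right|,$$

where $S_m(s)$ denotes the empirical CDF of the *m* observations in the fitted range and *F*(*s*) the CDF of the fitted power law. It has been shown in *Clauset et al.* [2009] that this method of estimating the lower bound works well. As we also have to cope with a higher cutoff point we proceed as follows: We set the upper cutoff point and then estimate the lower bound and the corresponding scaling exponent through numerically maximizing (5), see Table 2.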

Lower Bound | Upper Cutoff Point | *m* | Estimated *τ* | *p* Value |
---|---|---|---|---|
54 | 1200 | 1041 | 1.3811 | 9.9% |
8 | 298 | 1871 | 1.1953 | 0.0% |
8 | 150 | 1609 | 1.1231 | 10.1% |
8 | 140 | 1580 | 1.1207 | 12.3% |
8 | 130 | 1558 | 1.1040 | 56.6% |
8 | 125 | 1538 | 1.1064 | 43.6% |
8 | 120 | 1528 | 1.0926 | 94.4% |
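The KS statistic used to judge a fit on a given range can be computed as sketched below. The function name and the representation of the data as counts per cluster volume are assumptions made for illustration, and the toy counts are idealized rather than measured. Scanning candidate lower bounds would amount to refitting *τ* and calling this function for each candidate.

```python
def ks_statistic(counts, tau, s_min, s_max):
    """Largest gap between the empirical and the fitted CDF on [s_min, s_max]."""
    m = sum(c for s, c in counts.items() if s_min <= s <= s_max)
    z = sum(s ** -tau for s in range(s_min, s_max + 1))
    emp = fit = d_max = 0.0
    for s in range(s_min, s_max + 1):
        emp += counts.get(s, 0) / m  # empirical CDF at s
        fit += s ** -tau / z         # fitted power law CDF at s
        d_max = max(d_max, abs(emp - fit))
    return d_max


# Hypothetical counts proportional to s**(-1.1) on [8, 120], not the measured data.
toy_counts_ks = {s: round(1e5 * s ** -1.1) for s in range(8, 121)}
d_good = ks_statistic(toy_counts_ks, 1.1, 8, 120)  # exponent matching the data
d_bad = ks_statistic(toy_counts_ks, 2.0, 8, 120)   # exponent near percolation theory
```

For these idealized counts the distance is close to zero at the matching exponent and clearly larger at 2.0, which mirrors the mismatch of the cumulative distributions discussed in Section 3.1.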

The *p* value is calculated following *Clauset et al.* [2009] as well. For the obtained fit the KS statistic *D*_{m} is computed. Subsequently, synthetic data consisting of two subsets are generated: One subset contains data sampled from the original empirical data outside the fitted range, and the second subset is sampled from the fitted distribution inside the fitted range. For this synthetically generated data another power law fit is applied and its KS statistic *D*_{synth} is then compared to *D*_{m}, i.e., the KS statistic for the original data. This procedure is repeated, here 1000 times, and we then calculate the fraction of times for which *D*_{m} is smaller than or equal to *D*_{synth}; i.e.,

$$p = \frac{\#\{ D_m \le D_{\mathrm{synth}} \}}{1000}.$$

A larger ratio *p* also indicates that the data follow a power law with the estimated scaling parameter. A word of caution has to be made, though. It has been shown in *Clauset et al.* [2009] that a decreasing number of observations automatically increases the ratio *p*. In Table 2 we are varying the upper cutoff point, so one might suspect that this is the reason for the increasing *p* value (it remains an open question how to reliably determine the upper cutoff point). However, as one can see, the number of observations for the intervals between [8,150] and [8,120] stays almost constant at roughly *m* ≈ 1500. Finally, we also mention that the *τ*-m.l.e. can be determined without estimating the lower bound, e.g., in the range of [2,1200]. We thus conclude that the estimated *τ* values lie in the range of 1.09 to 1.12 with high confidence, see Figure 4. Consequently, the estimated *τ* is ≈1.1 and dramatically lower than previously suggested, see above.
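The *p* value procedure described in this section can be sketched as a bootstrap loop. The version below is simplified on purpose and should be read as an assumption-laden illustration: all data are taken to lie inside the fitted range, so the subset resampled from the empirical data outside that range is omitted, the bounds are kept fixed instead of being re-estimated, and all function names are hypothetical.

```python
import bisect
import math
import random


def fit_tau(data, s_min, s_max, lo=0.5, hi=4.0, steps=60):
    """Maximize the log-likelihood in tau by ternary search (it is concave)."""
    log_sum, m = sum(math.log(s) for s in data), len(data)
    support = range(s_min, s_max + 1)

    def loglik(tau):
        return -m * math.log(sum(s ** -tau for s in support)) - tau * log_sum

    for _ in range(steps):
        third = (hi - lo) / 3.0
        if loglik(lo + third) < loglik(hi - third):
            lo += third
        else:
            hi -= third
    return 0.5 * (lo + hi)


def cdf_table(tau, s_min, s_max):
    """Cumulative probabilities of the fitted power law on s_min..s_max."""
    weights = [s ** -tau for s in range(s_min, s_max + 1)]
    total, acc, table = sum(weights), 0.0, []
    for w in weights:
        acc += w / total
        table.append(acc)
    return table


def ks_distance(data, tau, s_min, s_max):
    """Largest gap between the empirical CDF of data and the fitted CDF."""
    table, ordered = cdf_table(tau, s_min, s_max), sorted(data)
    return max(abs(bisect.bisect_right(ordered, s) / len(ordered) - table[s - s_min])
               for s in range(s_min, s_max + 1))


def sample(tau, s_min, s_max, size, rng):
    """Draw from the fitted power law by inverting its CDF table."""
    table = cdf_table(tau, s_min, s_max)
    return [s_min + min(bisect.bisect_left(table, rng.random()), len(table) - 1)
            for _ in range(size)]


def bootstrap_p(data, s_min, s_max, repeats=1000, seed=0):
    """Fraction of synthetic data sets whose KS distance is at least the observed one."""
    rng = random.Random(seed)
    tau_hat = fit_tau(data, s_min, s_max)
    d_obs = ks_distance(data, tau_hat, s_min, s_max)
    hits = 0
    for _ in range(repeats):
        synth = sample(tau_hat, s_min, s_max, len(data), rng)
        d_synth = ks_distance(synth, fit_tau(synth, s_min, s_max), s_min, s_max)
        hits += d_obs <= d_synth
    return hits / repeats
```

A call such as `bootstrap_p(volumes, 8, 120)` would return the fraction defined above for a list `volumes` of cluster sizes inside the range; fixing the seed makes the result reproducible, and a smaller `repeats` shortens the run at the cost of a coarser *p* value.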

## 4 Conclusion

The task of estimating the power law exponent associated with a residual cluster size distribution is both intriguing and important.

One aim of this paper was to illustrate very clearly that the method from *Dias and Wilkinson* [1986] based on (4) as an approximation of (3) is strongly biased and will not generate a consistent *τ* estimate, as it simply defines *τ* to be larger than 2 irrespective of the underlying data.

The maximum likelihood method [*Clauset et al.*, 2009], however, is mathematically rigorous; the m.l.e. of *τ* is consistent. We therefore conclude that the high *p* value of 94.4% for the interval [8,120], see Table 2, indicates that for this common data set *τ*≈1.1.

Thus, in summary, we conclude that the scaling exponent of a typical residual cluster size distribution is significantly smaller than previously suggested [*Iglauer et al.*, 2011, 2010, 2013, 2012; *Georgiadis et al.*, 2013; *Andrew et al.*, 2014], particularly by the *summation methods* from *Dias and Wilkinson* [1986]. This raises increasing doubt that simple percolation models can predict multiphase flow through rock with high accuracy (note that percolation theory predicts *τ* = 2.189 [*Larson et al.*, 1977; *Stauffer*, 1979; *Lorenz and Ziff*, 1998]). Furthermore, *τ*≈1.1 implies that oil recovery is easier than previously assumed, while CO_{2} geostorage is more risky.

## Acknowledgments

The authors would like to thank three anonymous referees for very valuable comments. And both authors are particularly indebted to Lukas for his continuing support during the preparation of this paper.