Can Recurrence Quantification Analysis Be Useful in the Interpretation of Airborne Turbulence Measurements?
Abstract
In airborne data or model outputs, clouds are often defined using information about Liquid Water Content (LWC). Unfortunately LWC is not enough to retrieve information about the dynamical boundary of the cloud, that is, volume of turbulent air around the cloud. In this work, we propose an algorithmic approach to this problem based on a method used in time series analysis of dynamical systems, namely Recurrence Plot (RP) and Recurrence Quantification Analysis (RQA). We construct RPs using time series of turbulence kinetic energy, vertical velocity and temperature fluctuations as variables important for cloud dynamics. Then, by studying time series of laminarity (LAM), a variable which is calculated using RPs, we distinguish between turbulent and nonturbulent segments along a horizontal flight leg. By selecting a single threshold of this quantity, we are able to reduce the number of subjective variables and their thresholds used in the definition of the dynamical cloud boundary.
Key Points

A vector characterizing turbulence along the aircraft track based on velocity and temperature fluctuations is constructed

A time series of this vector allows the construction of a Recurrence Plot, a tool used in studies of dynamical systems

A quantity named laminarity, derived from the Recurrence Plot, is used to detect turbulent and nonturbulent regions
Plain Language Summary
Cloud is defined by the presence of liquid (or solid) water in the atmosphere. By agreeing on a certain threshold of liquid water content, one can define a cloud boundary. Cloud processes result in disturbances in the airflow around the cloud (turbulence). Its presence should allow the cloud dynamical boundary to be defined in a similar manner. However, a single simple variable, which could be used in an analogous way is not well defined. In this study we propose a method to define such a variable to distinguish between turbulent and nonturbulent volumes around the cloud (i.e. determining dynamic cloud interface), adopting a method used in the study of dynamical systems. Based on recorded temperature and air velocity fluctuations characteristic for turbulent transport and mixing, a threshold of a variable named “laminarity” is used to define the dynamic cloud boundary on the aircraft trajectory.
1 Introduction
Understanding atmospheric turbulence is crucial to properly describe transport processes in the Atmospheric Boundary Layer (ABL) and in the free atmosphere, mixing processes in clouds, and even rain production (Bodenschatz et al., 2010). Airborne measurements provide direct information on turbulent velocities and associated fluctuations of thermodynamic fields (Wendisch & Brenguier, 2013). Statistical interpretation of these measurements is, however, problematic. Time series collected in the course of airborne measurements represent conditions along a onedimensional, complicated trajectory inside evolving and coupled fields of momentum, temperature, humidity and cloud particles. In contrast to laboratory experiments, data collected in the course of a flight do not contain information on the boundary and initial conditions of the flow, on external forcings and internal feedbacks. There is even no general rule to distinguish to what extent the observed fluctuations can be contributed to turbulence, waves, or inhomogeneities of nonturbulent nature. Thus an objective division of the collected data series into segments representing similar stages/properties of the flow is crucial to construct conditional statistics of measured variables to characterize phenomena along the flight trajectory.
Typically, additional information and assumptions are necessary to construct a reasonable conditional statistics. To explain this in more detail, consider for example, cloud/clear air turbulent mixing. While the cloud microphysical boundary can be defined in a data series by a selected threshold of liquid water content (LWC) or droplet number concentration (DNC) (Malinowski & Zawadzki, 1993), cloud dynamical boundary, that is, the region around a cloud where the flow is affected by dynamical (Ackerman, 1958) and thermodynamical (Radke & Hobbs, 1991) processes related to the cloud, is not unique. In airborne measurements proposing through stratocumulus cloud top (Malinowski et al., 2013) three parameters: turbulence kinetic energy, potential temperature gradient and wind shear, were used to distinguish between the four different layers in the cloud top vicinity. Neither the parameters, nor threshold values could be considered objective, yet the division was algorithmic and allowed to describe many properties of turbulence and turbulent mixing in stratiform clouds. Such divisions are even more difficult in case of convective clouds penetrated horizontally. Heus and Jonker (2008) have shown the existence of shells around clouds where dynamical interactions occur. While cloud modelers who obtain two or threedimensional thermodynamic fields may use such parameters as enstrophy (Mellado, 2017) or more advanced numerical approaches (Nair et al., 2021) to obtain a border between turbulent cloud and nonturbulent environment, a similar approach cannot be adopted to airborne data along a 1D trajectory.
This work describes an effort to construct a robust, objective method to classify time series collected in the course of the flight by “turbulent” (dynamic) properties using a small number of apriori assumptions. The idea is to apply a technique which is widely adopted in nonlinear time series analysis called Recurrence Quantification Analysis (RQA) (Marwan et al., 2007), which can be used to study transitions in the system. It revolves around the concept of recurrence of the dynamical system into the same regions of the phase space, which is one of the fundamental aspects in the analysis of such systems. It has found applications in many fields such as physiology, neuroscience, economy, and earth sciences. There are many examples of the usage of the Recurrence Plot method, but in this study, we mainly refer to a comprehensive review by Marwan et al. (2007), containing the necessary information about the method and an indepth look at its variations and its applications.
In the following analysis, dynamic (turbulent) properties of clouds and clear air in between are characterized by three variables, representing important dynamical processes in clouds: turbulence, characterized by its kinetic energy TKE, mixing, characterized by temperature fluctuations T′ and vertical transport (updrafts and downdrafts) characterized by vertical velocity w′. These variables, normalized according to the RQA approach, form a 3D phase subspace. Recurrences of a vector with these three variables into the same regions of this phase subspace, indicate segments of the time series representing similar dynamic properties of the flow.
To test applicability of this approach we use airborne data collected in the course of the EUREC^{4}A (Elucidating the role of cloudscirculation coupling in climate) experiment (Stevens et al., 2021) with the British Antarctic Survey (BAS) Twin Otter (TO) research aircraft, equipped to measure turbulence, temperature fluctuations, and cloud properties. Cloud mask defined by a threshold of LWC, is added to the comparison, allowing turbulence inside and outside of cloud to be distinguished.
The structure of the article is the following. In Section 2 we describe the data, Section 3 discusses the details of the timedependent RQA method adopted, and in Section 4 the results are presented and discussed. Final conclusions can be found in Section 5.
2 Data Description
EUREC^{4}A (Stevens et al., 2021), was a measurement campaign conducted in late January and early February 2020, with a focus on the study of coupling between atmospheric boundary layer and ocean surface, and the study of marine cumulus clouds and their spatial organization near Barbados. One of the research aircraft present was the British Antarctic Survey's TwinOtter (TO). The aircraft flew at different altitudes performing cloud penetrations, collecting data between the clouds as well as in the atmospheric boundary layer (ABL), even near the ocean surface.
The TO was equipped with Meteorological Airborne Science INstrumentation (MASIN) gathering data at two frequencies: 1 and 50 Hz. The data set contains GPS plane position, pressure altitude, three components of wind velocity from the Best Air Turbulence (BAT) probe (Crawford & Dobosy, 1992; Garman et al., 2006), temperature and H_{2}O molar ratio. There were cloud spectrometers, measuring cloud microphysical properties, as well as an additional instrument to measure temperature fluctuations, the UltraFast Thermometer 2b (UFT2b) from the UFT family (Kumala et al., 2013) with an effective resolution reaching 2000 Hz.
Not all collected time series were suitable for analysis due to instrumental problems. The UFT2b provided useful data from 7 flights, in 5 of these cloud penetrations were recorded. In these flights three 50 Hz velocity components (turbulence data) from the BAT (Best Air Turbulence) probe were available, yet in some segments of flights with deteriorated resolution. Records from BAT and UFT2b were used in this study. These instruments were colocated within 10 cm distance on the nose boom of the aircraft, thus after averaging temperature fluctuations to match the 50 Hz signal from the BAT probe (corresponding to ∼1.3 m spatial resolution at 65 m/s true air speed (TAS)) data can be considered as collected in the same volume.
Additionally, LWC from cloud spectrometers, located under the wings (∼5 m apart) was used to define cloud mask. In the first two flights it was the data from the FastForward Scattering Spectrometer Probe (FFSSP) (Baumgardner et al., 1993; Brenguier et al., 1998) with 50 Hz resolution, and for the remaining three, from Cloud Droplet Probe (CDP) (Lance et al., 2010) with 1 Hz resolution, which is equal to around ∼65 m spatial resolution. A total of 246 clouds defined as segments with LWC > 10^{−3} g/kg at 1 Hz were found. Only a fraction of these data were selected for the preliminary analysis described here: a fragment of a time series collected at constant altitude during flight TO336, starting at around 12:42 UTC. Since the typical TAS of the TO was about 69 m/s, and velocity fluctuations were of order of a few m/s, the frozen flow hypothesis is applicable to data interpretation.
3 Data Analysis
According to Section 1.2.2 of Marwan et al. (2007), visual inspection of patterns appearing in the recurrence matrix makes it possible to distinguish different types of trajectories of the investigated vector in the phase space. In particular, white areas or bands correspond to rapid changes, solitary black points/squares to rare events/heavy oscillations, chessboard like structures to periodic events, and diagonal lines suggest parallel segments of trajectories in phase space. In Figure 1 a change of the recurrence behavior of the time series after 3 s from a slowly changing state to rapid fluctuations is visible indicated by a white band. Additionally, two vertical lines around 4 and 4.5 s correspond to the values of [T, k, w′] similar to that in the first 3 s of the plot, visible as a black area on the matrix. These vertical lines in general indicate a presence of an intermittent state, which occurred previously in the segment. The phase space corresponding to the data segment presented in Figure 1 is presented in the Figure S1 in Supporting Information S1.
Some of the abovementioned properties of each matrix can be characterized by one or more measures proposed by Marwan et al. (2007) in the overview of recurrence quantification analysis (RQA). Time series of such measures can be interpreted in terms of varying properties of turbulence along the trajectory of the aircraft.
Finally, in order to understand the relative importance of turbulence kinetic energy, mixing and updrafts/downdrafts analogous analysis, construction of recurrence matrices and following RQA, was performed for each vector component separately. The example application of the procedure described in the previous section is presented in Figure 2.
The top panel of Figure 2 shows three time series of laminarity calculated for various sizes (512, 256 and 128 points corresponding to ∼704, ∼352 and ∼176 m distance in physical space) of the RP. It can be seen that the black curve representing an RP of size 512 × 512 hardly reflects dynamic disturbances related to small clouds, such as the one at t ≈ 7,350 s. Supposedly ∼700 m averaging, imposed by this choice, “smoothes out” turbulent events of the size comparable to 1–3 integral scales and should be avoided. The blue curve representing LAM calculated for a 128pointlong segment is most sensitive to changes in the environment, but shows strong oscillations, which suggest that so short RP may capture separate events in small scales, which could lead to too detailed division into segments. In general, increasing the size of the matrix smooths out the timedependent RQA, and broadens the peaks of the quantities. This is an effect of capturing a bigger picture around a selected point and related smoothing, as seen for example, at t ≈ 7,250 s, where the black curve does not give insight into the variability of dynamics captured by the red and blue curves corresponding to smaller matrices. Additionally, 256 points/∼352 m corresponds roughly to 2–4 integral scales of turbulence, and distinguishing between turbulent/non turbulent regions on a distance corresponding to a few integral scales seems reasonable.
The bottom panel of Figure 2 shows a plot of LAM time series for the whole vector (lightblue curve) and for its components: T, k and w′. Gray shading indicates the separate use of a cloud mask indicating the presence of cloud water (at 1 Hz, i.e. ∼69 m spatial resolution). It can be seen that considering the whole vector gives the complete picture of turbulent regions within and around the clouds, while laminarity calculated for each component separately shows the relative importance of k, T, and w′, that is, mechanical turbulence, mixing and vertical transport which change along the time series. In example, decrease of laminarity around 7,250 s indicates turbulent region outside the cloud, in which w′ and k contribute more than T to the decreased laminarity estimated from the whole vector, while at ∼7,300 s, inside the cloud, contribution from T dominates, while contributions from the other parameters are small.
4 Results
Figure 3 shows the time series for all three components of the vector, LWC series as the gray cloud mask and LAM calculated for the whole vector, with yellow shaded regions corresponding to LAM below the selected threshold, that is, to the regions where a signature of turbulence is observed. Since the goal is to distinguish between the turbulent and laminar regions, a threshold just below 1 should be the cutoff. After visual inspection the specific value of LAM = 0.95 was chosen. Selection of LAM = 0.99 resulted in selection of the majority of the record as turbulent, while LAM = 0.90 resulted in omission of segments where some clear signatures of turbulence were present. The exact value of the threshold can be discussed, however the goal of the selection is not the detailed turbulence analysis, but selection of the segments on which such analysis should be performed so the detailed number is of secondary importance.
Figure 3 shows that the investigated region is composed of clouds with vast volumes around the clouds affected by turbulence and laminar regions with occasional signatures of turbulence, not related to visible amounts of LWC. All clouds in the analyzed data are turbulent, with clearly visible local minima of LAM values below ∼0.75. The local minima of laminarity reflect coincident input of all the components of the vector, but there is no clear dependence to the actual value of LWC.
Additionally, for each cloud, regions with LAM < 0.95 around the cloud can be considered “turbulent cloud shells”, and borders of these shells can be determined by the LAM threshold: such a division will help in more detailed investigation of cloud dynamics.
To close this analysis we performed initial statistical analysis of turbulence in the investigated time series. Table 1 summarizes the average turbulence kinetic energy, vertical velocity variance and temperature variance in 3 different types of segments along the trajectory: clouds, where the LWC is above threshold, turbulent regions with no clouds, where LAM is above threshold and nonturbulent regions in between. Not surprisingly, these values are very different, reflecting different dynamics along the flight trajectory.
LAM^{>}  LWC  LAM^{<}  

μ_{k}  0.79  2.46  1.51 
σ_{T}  0.32  0.32  0.34 
σ_{w′}  0.58  1.31  0.93 
As we can see the values of the mean TKE are the highest for cloud regions, but they are also significantly higher for regions with LAM below threshold than for LAM above threshold. Standard deviation of vertical velocity follows the same trend as the mean TKE, while the standard deviation of temperature is not significantly different between the regions. However, more detailed inspection of the data indicates, that the standard deviation of the temperature in laminar region comes from large scale smooth temperature variations, while in turbulent regions it comes from small scale variability. Consequently, the mask based on the changes of LAM allows for better division of data in terms of turbulence characteristics.
5 Summary and Outlook
In this study, we proposed a method to algorithmically divide time series of airborne measurements according to dynamic (turbulent) properties of the sampled air, combining the simple signature properties characterizing turbulence, turbulent transport and turbulent mixing. The method is based on recurrence quantification analysis and the assumptions adopted are based on welldocumented physical characteristics of atmospheric turbulence. Since elements of the vector are normalized based on their fluctuations, consecutive application of the method for flight legs at various heights does not require selections of arbitrary thresholds for each leg, allowing for using the same method for all analyzed horizontal segments of the flight. Such approach significantly reduces arbitrary decisions which have to be undertaken in the construction of the conditional statistics. Thus the proposed approach will facilitate future investigations of atmospheric turbulence in, within and around clouds, which requires a more systematic approach. Another advantage of the RQA method is that the constructed vector can be expanded with more measured quantities, such as for example, specific humidity, or other passive scalars. Selection of the variables included in the analysis can vary, depending of data availability and quality as well as research needs.
We used only one metric from the RQA, laminarity, but RQA offers a variety of metrics, therefore it is possible to expand the studies and/or use metrics that are more suitable for other types of analyses. The possibilities in the future with regards to this type of study include imposing an additional condition based on the LWC signal of a better resolution added to the vector, as well as applying the method to data gathered during other campaigns, where the requirements to perform the similar analysis (straight, horizontal flight leg, presence of clouds or abrupt changes) hold.
We have applied this method to an environment with clouds, which have a strong contrast between regions with stronger turbulence and the surrounding environment. The method is not universally applicable, since typical flight patterns in airborne measurements consist of slant profiles and horizontal legs, and the latter can include cloud penetrations or not. A cloud penetration is characterized by abrupt changes in temperature and vertical wind velocity, therefore such events affect normalization, which in turn affects the distribution of points in the phase space, which has a direct connection to how the RP looks, and the values obtained in the RQA. Therefore, the data studied with this method should in principle contain such abrupt events, although their magnitude can vary.
The selected threshold is subjective and its value should not be interpreted universally. On the other hand, all the distances and scales used in the analysis as well as the values and fluctuations present in the analyzed time series are typical and in agreement with published data. Thus, the method proposed here should be easily adaptable to other similar data sets as well as to the interpretation of numerical simulations, in which for example, independent estimates of cloud boundary based on LWC and on dynamic properties of the flow should help not only to characterize properties of turbulence inside and around the cloud, but to enhance our understanding of entrainment, mixing and vertical transport in clouds.
Acknowledgments
The authors would like to thank the reviewers and the editor for their help. We acknowledge funding by Poland’s National Science Centre Grant UMO2018/30/M/ST10/00674 and the Natural Environmental Research Council Grants NE/S015868/1 and NE/S015779/1.