Research

Here is a brief description of my research. All references can be found on the publication page.

I have broad research interests. I am an applied mathematician working on many applied math topics. I also work in directions that belong to the applied sciences and possibly go beyond applied math. My PhD was in mathematics and atmosphere and ocean science. I also work on many engineering problems and some neuroscience topics.

I group my research topics into categories to make them easier to follow, but they are tightly interconnected.

Applied math side: modeling high-dimensional complex systems, stochastic modeling for nonlinear and non-Gaussian systems, data assimilation, uncertainty quantification, rare and extreme events, reduced order models, statistical control, multiscale analysis.
Data science side: machine learning, information theory, causal analysis, statistical inference, data-driven methods.
Atmosphere and ocean science side: El Nino-Southern Oscillation (ENSO), Arctic sea ice, Madden-Julian oscillation, monsoon. The focus is on new dynamical and statistical models, state estimation, prediction, and building new mathematical tools to improve the current understanding of nature.
Engineering, neuroscience, biology, and others: damage models and rare events in materials science, excitable media, biological systems, and COVID-19 modeling.

I have established a general nonlinear stochastic modeling framework, known as the conditional Gaussian nonlinear system (CGNS), for understanding and predicting complex turbulent systems by exploiting solvable conditional statistics. Despite the conditional Gaussianity, the CGNS succeeds in capturing many key nonlinear and non-Gaussian features of nature. Its solvable conditional statistics facilitate the development of rigorous mathematical analysis and efficient numerical algorithms to resolve many theoretical and practical issues associated with complex nonlinear systems.

On the theoretical side, the CGNS enabled the first rigorous quantification of the uncertainty and information barrier in Lagrangian data assimilation. The result provides crucial theoretical guidelines for designing appropriate data assimilation strategies with drifter observations. The CGNS also allows a systematic analysis of model errors, state estimation skill, and various strategies for building approximate models, all of which have practical significance.

On the computational side, the CGNS allows the development of an efficient algorithm for solving a wide class of high-dimensional Fokker-Planck equations. This result has a significant impact on many associated tasks for complex turbulent systems, such as data assimilation, uncertainty quantification, and prediction. The work has been published in PNAS. Recently, a systematic procedure has been incorporated into the CGNS framework that further allows it to serve as a fast preconditioner and a cheap surrogate or reduced-order model for complex nonlinear systems. Due to its broad impact in both applied and computational aspects, the work was selected as an editor’s pick in Chaos.

On the more applied side, the CGNS has become a powerful modeling tool to characterize and forecast many natural phenomena. It has been used to develop the first nonlinear physics-constrained low-order model that accurately reproduces the strong non-Gaussian statistics and predicts extreme events of the monsoon and the Madden-Julian oscillation (MJO). The CGNS extends the predictability of these extremely turbulent phenomena with appropriate uncertainty quantification. These works have been published in top research journals, including Geophys. Res. Lett., and played a significant role in building connections between the applied math and atmosphere and ocean science communities. Other important work based on the CGNS includes stochastic parameterizations, efficient data assimilation, and model identification in the presence of only partial observations.

My significant scientific work also includes developing novel, effective models and methods for solving challenging realistic problems that traditional methods or models cannot easily handle. In a recent work published in npj Clim. Atmos. Sci. (a top Nature partner journal), I built the first multiscale stochastic model that captures El Nino complexity, a critical and active topic with significant societal impacts. Along this direction, I built a more sophisticated model (a set of nonlinear stochastic PDEs) that recovers more detailed spatiotemporal structures of El Nino complexity, allowing it to be used for regional forecasts. The work has been published in the prestigious modeling journal J. Adv. Model. Earth Syst. Based on these models and the community’s strong need to understand El Nino prediction, I have further developed an information-theoretic framework to quantify the forecast uncertainty and predictability of general complex systems and then applied it to El Nino complexity.

Besides El Nino, I have built a new type of model in Lagrangian coordinates, called the discrete element model (DEM), for Arctic sea ice. Based on this model, new efficient algorithms for uncertainty quantification, parameter estimation, and data assimilation have been developed to study sea ice and general particle systems. Recently, I have built the first multiscale model that combines the traditional continuum model with the DEM to improve the modeling of Arctic sea ice. The model facilitates efficient computations and allows the development of a new hybrid data assimilation strategy to better estimate the state of the coupled sea ice, atmosphere, and ocean system. I have also exploited these models and well-designed stochastic methods to bridge gaps in the observation network by recovering blurred satellite images and inferring the multiscale features of the unobserved ocean flows. These works provide extremely useful data sets for understanding Arctic dynamics and, more generally, multi-decadal variability.

In addition, I have established an intracounty model for studying COVID-19 infection with human mobility. The work, published in PNAS, has had a strong impact and provides crucial suggestions for designing regionalization-based policies to mitigate the spread of COVID-19.

Besides modeling and model-related methodology development, I have also worked on critical practical topics where data plays a significant role. Recently, I designed a new strategy for launching drifter observations, which is the first work in this direction that accounts for the essential contribution of uncertainty in strategy development. Information theory was utilized to rigorously quantify such uncertainty. I have also combined physical models with machine learning to improve modeling, data assimilation, and forecasting. In particular, a physics-informed auto-learning framework was recently developed to derive conceptual models for natural phenomena. In addition, I have built systematic strategies for stochastic reduced-order models with wide applications in various areas.

Several of my research papers have had significant broader impacts. In addition to the CGNS work that was selected as an editor’s pick, my work on discovering the Atlantic zonal mode (published in Geophys. Res. Lett.) was covered by more than 20 media outlets as a new weather model for spotting massive incoming monsoons. My intracounty model for studying COVID-19 infection has also been widely covered by media outlets.

Finally, I wish to highlight that I have worked with my graduate students and postdocs on publishing papers in top research journals, including PNAS, J. Adv. Model. Earth Syst., J. Comput. Phys., Physica D, and SIAM journals, where my students and postdocs were the first or corresponding authors. I have provided my students and postdocs with research opportunities in cutting-edge applied math and interdisciplinary topics and mentored them systematically to work toward new findings.

Below is a small portion of my research work. These topics aim to provide examples that cover different angles of my research. See my publication page for more details!

Stochastic Dynamical Multiscale Models for El Nino Complexity Capturing Key Non-Gaussian Statistics

El Nino-Southern Oscillation (ENSO) exhibits diverse characteristics in spatial pattern, peak intensity, and temporal evolution. We developed a three-region multiscale stochastic model to show that the observed ENSO complexity can be explained by combining intraseasonal, interannual, and decadal processes. The model starts with a deterministic three-region system for the interannual variability. Two stochastic processes, representing the intraseasonal and decadal variations, are then incorporated. The model reproduces not only the general properties of the observed ENSO events but also the complexity in patterns (e.g., Central Pacific vs. Eastern Pacific events), intensity (e.g., the 10-20 year recurrence of extreme El Ninos), and temporal evolution (e.g., more multi-year La Ninas than multi-year El Ninos). While conventional conceptual models were typically used to understand the dynamics behind the common properties of ENSO, this model offers a powerful tool to understand and predict the ENSO complexity that challenges our understanding of the 21st-century ENSO. Following this direction, we have developed an intermediate coupled model for ENSO complexity, which involves more dynamical properties. We are now developing coupled stochastic dynamical multiscale ENSO-MJO models to capture richer features of the coupled system!

We have also developed an auto-learning framework that lets the computer automatically and systematically derive ENSO (and other) conceptual models using different state variables and dimensions. This allows us to understand the key components and mechanisms of ENSO (and other phenomena) and discover a minimum model. The framework can be combined with data assimilation when only partial observations are available or when prior knowledge is limited (in which case the latent variables are essential).

These are the first such dynamical-statistical models, and they differ from traditional modeling approaches. They have important applications in improving forecasts, state estimation, and the representation of non-Gaussian features of these phenomena. See my publication page for more details.

These models also serve as the bridge to connect applied math and ocean science communities.

CGKN: A Deep Learning Framework for Modeling Complex Dynamical Systems and Efficient Data Assimilation

Conditional Gaussian Koopman Network (CGKN) is a set of nonlinear stochastic neural differential equations, aiming to enhance the skill of machine learning (ML) models for both forward problems (forecast) and inverse problems (data assimilation (DA), inference, etc.).
A DA loss with analytic formulae is incorporated into the ML training to help improve the interdependence between state variables. This avoids the sampling errors of ensemble DA methods, which prevent those methods from being used in ML training. Information theory can be used in the DA loss to highlight the role of uncertainty in state estimation. A nonlinear version of the Koopman operator is incorporated into the neural differential equations to compensate for approximation errors and to facilitate the development of efficient nonlinear DA schemes (e.g., with analytic formulae) in the latent space, improving the state estimation of intermittency and extreme events.

Modeling, Data Assimilation, and Prediction of Arctic Sea Using Sea Ice Floes + Eddy Identification

Sea ice is a complex medium composed of discrete interacting elements of various sizes and thicknesses (floes), and at sufficiently small length scales it cannot be approximated as a continuous medium, as is routinely done at large scales. While Eulerian data assimilation is a relatively mature field, techniques for the assimilation of satellite-derived Lagrangian trajectories of sea ice floes remain poorly explored. In a series of works, we developed simple DEM models (and used more complicated versions from our collaborators) and new efficient Lagrangian data assimilation schemes for recovering the unobserved ocean field, dynamical interpolation of missing floe trajectories, parameter estimation, and superfloe parameterizations of sea ice. We have also developed new probabilistic methods for eddy identification! These methods are validated on synthetic data and are also applied to massive real observational data! These works should be of great interest not only to the sea ice community but also to many other computational and applied math research areas.

Attribution of Heterogeneous Stress Distributions in Low-Grain Polycrystals under Conditions Leading to Extreme Damage Events

In polycrystalline metals, grain boundaries, microstructural deformation bands, and interactions with dislocations determine site selection for damage nucleation and must be quantified. The statistical distribution of these features, together with the local heterogeneous stress conditions (also determined by material structure), will dictate when, where, and how fast local failure events take place. Computationally inexpensive physics-assisted statistical models are needed to reveal key microstructural features as effective predictors for damage and to rapidly forecast these extreme events. The new model is developed based on the following procedure. (a) Leverage physical knowledge to hypothesize a broad set of microstructural factors influencing stress conditions. (b) Apply causal inference to reveal the predominant features causing extreme damaging events with physical explanations. (c) Exploit a conditional Gaussian mixture model to quantify the uncertainty not readily explained by these features. Example: BCC tantalum, from single- to octu-crystal configurations. The causal inference successfully identifies essential physics in the model, which allows the model to provide an effective forecast with appropriate uncertainty quantification (UQ).

Information Theory and Causal Inference: Optimal Design of Deploying New Observations

Deploying Lagrangian drifters that facilitate the state estimation of the underlying flow field within a future time interval is practically important. However, the uncertainty in estimating the flow field prevents using standard deterministic approaches for designing strategies and applying trajectory-wise skill scores to evaluate performance. In this work, an information measurement is developed to quantitatively assess the information gain in the estimated flow field from deploying an additional set of drifters. This information measurement is derived by exploiting causal inference. It is characterized by the inferred probability density function of the flow field, which naturally accounts for the uncertainty. Although the information measurement is an ideal theoretical metric, using it as the direct cost makes the optimization problem computationally expensive. To this end, an effective surrogate cost function is developed. It is highly efficient to compute while capturing the essential features of the information measurement when solving the optimization problem. Based on these properties, a practical strategy for deploying drifter observations to improve future state estimation is designed. Due to the forecast uncertainty, the approach exploits the expected value of spatial maps of the surrogate cost associated with different forecast realizations to seek the optimal solution. Numerical experiments justify the effectiveness of the surrogate cost. The proposed strategy significantly outperforms random deployment of the drifters. It is also shown that, under certain conditions, the drifters determined by the expected surrogate cost remain skillful for the state estimation of a single forecast realization of the flow field, as in reality.
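To give a rough feel for how an information gain can be computed from probability density functions, here is a minimal sketch that evaluates the relative entropy between two Gaussian distributions and splits it into a "signal" part (mean shift) and a "dispersion" part (covariance change). This is an assumption made for illustration: it is neither the information measurement nor the surrogate cost from the work above, and the function name `gaussian_relative_entropy` is hypothetical.

```python
import numpy as np

def gaussian_relative_entropy(mean_p, cov_p, mean_q, cov_q):
    """Relative entropy KL(p || q) between Gaussians p and q.

    Returns (signal, dispersion); their sum is the total relative entropy.
    Illustrative only: p could be a state estimate with extra drifters and
    q the estimate without them (an assumption, not the paper's setup).
    """
    mean_p = np.atleast_1d(np.asarray(mean_p, dtype=float))
    mean_q = np.atleast_1d(np.asarray(mean_q, dtype=float))
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    n = mean_p.size
    diff = mean_p - mean_q
    # Signal: information carried by the shift in the mean.
    signal = 0.5 * diff @ np.linalg.solve(cov_q, diff)
    # Dispersion: information carried by the change in the covariance.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))
    dispersion = 0.5 * (trace_term - n - (logdet_p - logdet_q))
    return signal, dispersion

# Example: a sharper estimate (variance 1) versus a broader one (variance 4).
signal, dispersion = gaussian_relative_entropy([1.0], [[1.0]], [0.0], [[4.0]])
```

A larger total value indicates that the two estimates differ more, either in their means or in their uncertainty.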

Machine Learning + Data Assimilation + Uncertainty Quantification

We aim to combine machine learning, data assimilation, and uncertainty quantification, which covers several topics. On the one hand, we focus on using machine learning to improve the forecast model in data assimilation. This is not done simply by replacing the physics-based model with a neural network. Instead, machine learning assists the physics-based model in facilitating efficient forecast and analysis algorithms. On the other hand, we design strategies to build end-to-end machine learning data assimilation approaches, where the learned map provides essential information for quantifying the posterior uncertainty. The methods are applied to challenging geophysical modeling problems with high dimensionality, strong non-Gaussian features, and rich dynamical and statistical properties. We have also developed various data assimilation methods, including data assimilation with constraints.

Lagrangian-Eulerian Multiscale Data Assimilation (LEMDA), Parallel Computing and Hybrid Strategy

Lagrangian trajectories are widely used as observations for recovering the underlying flow field via Lagrangian data assimilation (DA). However, the strong nonlinearity in the observational process and the high dimensionality of the problems often cause challenges in applying standard Lagrangian DA. In this paper, a Lagrangian-Eulerian multiscale DA (LEMDA) framework is developed. It starts with exploiting the Boltzmann kinetic description of the particle dynamics to derive a set of continuum equations, which characterize the statistical quantities of particle motions at fixed grids and serve as Eulerian observations. Despite the nonlinearity in the continuum equations and the processes of Lagrangian observations, the time evolutions of the posterior distribution from LEMDA can be written down using closed analytic formulae. This offers an exact and efficient way of carrying out DA, which avoids using ensemble approximations and the associated tunings. The analytically solvable properties also facilitate the derivation of an effective reduced-order Lagrangian DA scheme that further enhances computational efficiency. The Lagrangian DA within the framework has advantages when a moderate number of particles is used, while the Eulerian DA can effectively save computational costs when the number of particle observations becomes large. The Eulerian DA is also valuable when particles collide, such as using sea ice floe trajectories as observations. LEMDA naturally applies to multiscale turbulent flow fields, where the Eulerian DA recovers the large-scale structures, and the Lagrangian DA efficiently resolves the small-scale features in each grid cell via parallel computing.

Intracounty Modeling of COVID-19 Infection with Human Mobility

The COVID-19 pandemic is a global threat presenting health, economic, and social challenges that continue to escalate. Metapopulation epidemic modeling studies in the susceptible-exposed-infectious-removed (SEIR) style have played important roles in informing public health policy making to mitigate the spread of COVID-19. These models typically rely on a key assumption on the homogeneity of the population. This assumption certainly cannot be expected to hold true in real situations; various geographic, socioeconomic, and cultural environments affect the behaviors that drive the spread of COVID-19 in different communities. What’s more, variation of intracounty environments creates spatial heterogeneity of transmission in different regions. To address this issue, we develop a human mobility flow-augmented stochastic SEIR-style epidemic modeling framework with the ability to distinguish different regions and their corresponding behaviors. This modeling framework is then combined with data assimilation and machine learning techniques to reconstruct the historical growth trajectories of COVID-19 confirmed cases in two counties in Wisconsin. The associations between the spread of COVID-19 and business foot traffic, race and ethnicity, and age structure are then investigated. The results reveal that, in a college town (Dane County), the most important heterogeneity is age structure, while, in a large city area (Milwaukee County), racial and ethnic heterogeneity becomes more apparent. Scenario studies further indicate a strong response of the spread rate to various reopening policies, which suggests that policy makers may need to take these heterogeneities into account very carefully when designing policies for mitigating the ongoing spread of COVID-19 and reopening.
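As a minimal sketch of the general idea (not the model from the paper), the code below simulates a stochastic SEIR system over a few regions coupled through a mobility matrix. Everything here is an assumption for illustration: the function name `simulate_seir`, the chain-binomial time stepping, the parameter values, and the simplified coupling in which residents of region i are exposed to the infection prevalence of the regions they visit.

```python
import numpy as np

def simulate_seir(beta, mobility, pop, days=120, sigma=1/5, gamma=1/7, seed=0):
    """Chain-binomial SEIR over several regions coupled by mobility.

    beta     : per-region transmission rates (heterogeneous behaviors)
    mobility : mobility[i, j] = fraction of time residents of region i
               spend in region j (rows sum to 1); hypothetical input
    pop      : per-region population sizes
    Returns an array of shape (days + 1, 4, n_regions) with S, E, I, R.
    """
    rng = np.random.default_rng(seed)
    n = len(pop)
    S = np.array(pop, dtype=np.int64)
    E = np.zeros(n, dtype=np.int64)
    I = np.zeros(n, dtype=np.int64)
    R = np.zeros(n, dtype=np.int64)
    S[0] -= 10  # seed ten infections in region 0
    I[0] += 10
    history = [np.stack([S, E, I, R])]
    for _ in range(days):
        N = S + E + I + R
        # Prevalence experienced by residents of each region via mobility.
        prevalence = mobility @ (I / N)
        p_infect = 1.0 - np.exp(-beta * prevalence)
        new_E = rng.binomial(S, p_infect)               # S -> E
        new_I = rng.binomial(E, 1.0 - np.exp(-sigma))   # E -> I
        new_R = rng.binomial(I, 1.0 - np.exp(-gamma))   # I -> R
        S = S - new_E
        E = E + new_E - new_I
        I = I + new_I - new_R
        R = R + new_R
        history.append(np.stack([S, E, I, R]))
    return np.array(history)

# Two hypothetical regions with different transmission rates.
hist = simulate_seir(beta=np.array([0.6, 0.3]),
                     mobility=np.array([[0.9, 0.1], [0.2, 0.8]]),
                     pop=[50000, 80000])
```

In a data-driven setting, the transmission rates would not be fixed constants as above but would be estimated from case counts, for example through data assimilation.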

(I want to highlight that the stochastic parameterization tools I used and developed in many other works play a crucial role in this model!)

Conditional Gaussian Nonlinear Systems (CGNS): Effective Modeling, Multiscale Data Assimilation, Stochastic Parameterizations, and Neural Differential Equations

The conditional Gaussian nonlinear system (CGNS) is a class of nonlinear and non-Gaussian stochastic differential equations (SDEs) with wide applications in various disciplines. Recently, the framework has been combined with neural ODEs.
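For concreteness, the general form of a CGNS as it commonly appears in the literature can be sketched as follows. The notation is an assumption introduced here for illustration: X denotes the observed variables, Y the unobserved ones, the coefficients A_0, A_1, a_0, a_1 and the noise matrices may depend nonlinearly on X and on time, and the two noise sources W_X and W_Y are taken to be independent.

```latex
\begin{aligned}
d\mathbf{X} &= \big[\mathbf{A}_0(t,\mathbf{X}) + \mathbf{A}_1(t,\mathbf{X})\,\mathbf{Y}\big]\,dt
             + \boldsymbol{\Sigma}_X(t,\mathbf{X})\,d\mathbf{W}_X, \\
d\mathbf{Y} &= \big[\mathbf{a}_0(t,\mathbf{X}) + \mathbf{a}_1(t,\mathbf{X})\,\mathbf{Y}\big]\,dt
             + \boldsymbol{\Sigma}_Y(t,\mathbf{X})\,d\mathbf{W}_Y.
\end{aligned}
```

Given one observed trajectory of X, the conditional distribution of Y is Gaussian, with mean \mu and covariance R that satisfy closed equations:

```latex
\begin{aligned}
d\boldsymbol{\mu} &= \big(\mathbf{a}_0 + \mathbf{a}_1\boldsymbol{\mu}\big)\,dt
  + \mathbf{R}\mathbf{A}_1^{*}\big(\boldsymbol{\Sigma}_X\boldsymbol{\Sigma}_X^{*}\big)^{-1}
    \big[d\mathbf{X} - \big(\mathbf{A}_0 + \mathbf{A}_1\boldsymbol{\mu}\big)\,dt\big], \\
d\mathbf{R} &= \big[\mathbf{a}_1\mathbf{R} + \mathbf{R}\mathbf{a}_1^{*}
  + \boldsymbol{\Sigma}_Y\boldsymbol{\Sigma}_Y^{*}
  - \mathbf{R}\mathbf{A}_1^{*}\big(\boldsymbol{\Sigma}_X\boldsymbol{\Sigma}_X^{*}\big)^{-1}
    \mathbf{A}_1\mathbf{R}\big]\,dt.
\end{aligned}
```

The joint system is nonlinear and its statistics are generally non-Gaussian, because the coefficients depend on X; only the conditional distribution of Y given X is Gaussian.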

Many complex nonlinear dynamical systems fit into this modeling framework. Some well-known classes of models are physics-constrained nonlinear stochastic models (for example, the noisy versions of Lorenz models, low-order models of Charney-DeVore flows, and a paradigm model for topographic mean flow interaction), stochastically coupled reaction-diffusion models in neuroscience and ecology (for example, stochastically coupled FitzHugh-Nagumo models and stochastically coupled SIR epidemic models), and multiscale models for geophysical flows (for example, the Boussinesq equations with noise and the stochastically forced rotating shallow water equations). This modeling framework has been exploited to develop realistic systems for the Madden-Julian oscillation and Arctic sea ice.

In addition to modeling many natural phenomena, the CGNS framework and its closed analytic data assimilation formulae have been applied to study many theoretical and practical problems. The framework has been utilized to develop a nonlinear Lagrangian data assimilation algorithm, allowing rigorous analysis to study model error and uncertainty. The analytically solvable data assimilation scheme has been applied to the state estimation and the prediction of intermittent time series for the monsoon and other natural phenomena. Notably, the efficient data assimilation procedure also helps develop a rapid algorithm to solve the high-dimensional Fokker-Planck equation. The classical Kalman-Bucy filter is the simplest special example of the CGNS.
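To illustrate the simplest special case mentioned above, here is a minimal sketch of a scalar Kalman-Bucy filter, discretized with the Euler-Maruyama scheme. The coefficient values, the function name `kalman_bucy_demo`, and the choice of time step are assumptions made only for this example; in a genuine CGNS the coefficients would depend nonlinearly on the observed variable.

```python
import numpy as np

def kalman_bucy_demo(T=50.0, dt=1e-3, seed=0):
    """Filter a hidden scalar process y from increments of an observed x.

    Model (hypothetical constant coefficients, for illustration only):
        dx = A1 * y dt        + sig_x dW_x   (observed)
        dy = (a0 + a1 * y) dt + sig_y dW_y   (hidden)
    Returns the true hidden path, the filter mean, and the filter variance.
    """
    A1, a0, a1 = 1.0, 0.0, -0.5
    sig_x, sig_y = 0.5, 1.0
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    y = np.zeros(n + 1)    # true hidden state
    mu = np.zeros(n + 1)   # posterior mean
    R = np.zeros(n + 1)    # posterior variance
    R[0] = 1.0
    for k in range(n):
        dWx, dWy = rng.normal(0.0, np.sqrt(dt), size=2)
        dx = A1 * y[k] * dt + sig_x * dWx                 # observed increment
        y[k + 1] = y[k] + (a0 + a1 * y[k]) * dt + sig_y * dWy
        gain = R[k] * A1 / sig_x**2
        # Mean: model drift plus gain times the innovation.
        mu[k + 1] = mu[k] + (a0 + a1 * mu[k]) * dt + gain * (dx - A1 * mu[k] * dt)
        # Variance: deterministic Riccati equation.
        R[k + 1] = R[k] + (2 * a1 * R[k] + sig_y**2 - gain * A1 * R[k]) * dt
    return y, mu, R

y_true, mu, R = kalman_bucy_demo()
```

With constant coefficients the variance equation does not depend on the data and settles to a fixed value; in the general CGNS setting it depends on the observed trajectory, which is how the filter can track intermittent behavior.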

It is also worth highlighting that the ideas of the CGNS modeling framework and the associated data assimilation procedure have been applied to a much wider range of problems. Examples include developing cheap solvable forecast models in dynamic stochastic superresolution, building stochastic superparameterizations for geophysical turbulence, and designing efficient multiscale data assimilation schemes. All these facts indicate that the CGNS provides a useful building block for many practical methods.

Efficient Statistically Accurate Algorithms for Solving High-Dimensional Fokker-Planck Equation

Solving the Fokker-Planck equation for high-dimensional complex turbulent dynamical systems with highly intermittent non-Gaussian features is an important and practical issue. We have developed efficient statistically accurate algorithms for solving both the transient and the equilibrium solutions of Fokker-Planck equations associated with high-dimensional nonlinear dynamical systems with conditional Gaussian structures. These systems are highly nonlinear and have strong non-Gaussian features, including intermittency and rare/extreme events. The algorithms rely on a hybrid strategy: an extremely efficient parametric method based on data assimilation in the high-dimensional part of the phase space is combined with a kernel method in the low-dimensional part.

Both numerical tests and rigorous analysis demonstrate that the efficient statistically accurate algorithms are able to overcome the curse of dimensionality. It is also shown with mathematical rigour that the algorithms are robust over long times provided that the system is controllable and stochastically stable.
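One way to picture the hybrid strategy is the following mixture representation, written here as a sketch with notation that is assumed for illustration. Split the state into a low-dimensional part u_I and a high-dimensional part u_II with conditional Gaussian structure, and simulate L sample trajectories of u_I. For each trajectory i, the closed data assimilation formulae give a conditional mean \mu^i(t) and covariance R^i(t) for u_II, and a kernel K_H with bandwidth H is placed on u_I:

```latex
p\big(\mathbf{u}_{\mathrm{I}}, \mathbf{u}_{\mathrm{II}}, t\big) \;\approx\;
\frac{1}{L}\sum_{i=1}^{L}
K_{H}\big(\mathbf{u}_{\mathrm{I}} - \mathbf{u}_{\mathrm{I}}^{i}(t)\big)\,
\mathcal{N}\big(\mathbf{u}_{\mathrm{II}};\, \boldsymbol{\mu}^{i}(t),\, \mathbf{R}^{i}(t)\big).
```

Because each Gaussian component covers a finite portion of the high-dimensional space, far fewer samples are needed than in a plain Monte Carlo approach that uses point masses.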

The simplest version of our method can handle systems with dimension O(10) using only L = O(100) sample trajectories (left panel below). In light of a judicious block decomposition (and statistical symmetry if applicable), we are able to extend the method to systems with much larger dimensions, e.g., O(1000) or more (right panel below).

These algorithms will be very useful in understanding prediction, extreme events and causality issues.

Predicting Madden-Julian Oscillation (MJO) Through Physics-Constrained Nonlinear Stochastic Models

The dominant mode of tropical intraseasonal variability is the Madden-Julian Oscillation (MJO), which is a slow-moving planetary-scale envelope of convection propagating eastward, typically from the Indian Ocean through the Western Pacific. The MJO affects tropical precipitation, the frequency of tropical cyclones, and extratropical weather patterns. Understanding and predicting the MJO is a central problem in contemporary meteorology with large societal impacts.

The prediction of large-scale MJO is achieved in two steps:

Step 1. A recent advanced nonlinear time series technique, Nonlinear Laplacian Spectral Analysis (NLSA), is applied to the cloudiness data (with ~50,000 dimensions in space and ~70,000 data points in time) to define two spatial modes associated with the boreal winter MJO. NLSA requires no ad hoc detrending or spatial-temporal filtering of the full data set and captures both intermittency and low-frequency variability. The resulting time series for the two spatial modes of the MJO are highly intermittent with large variation in amplitude from year to year. The two large-scale MJO-like cloud patterns coincide in time with the two boreal winter MJOs observed during TOGA-COARE in 1992-1993 (see the movie below).

Step 2. Physics-constrained nonlinear low-order stochastic models are developed.

The model contains two observed MJO variables and two hidden variables that characterize the strong intermittency and random phases of the MJO indices. The model involves correlated multiplicative noise defined through energy-conserving nonlinear interaction. The model simulations capture the non-Gaussian features of the observations nearly perfectly.

The special structure of the model allows an efficient data assimilation algorithm to determine the initial values of the two hidden variables, which facilitates the ensemble prediction scheme. The skillful prediction results extend the forecast range using low-order models and determine the predictability limits of the MJO indices. In addition to the ensemble mean prediction, the ensemble spread is an accurate indicator of forecast uncertainty at long lead times.

The framework is also applied to predicting the large-scale features of the monsoon. Recently, we also developed an effective and practical spatiotemporal reconstruction algorithm, which overcomes a difficulty in most data decomposition techniques with lagged embedding: they require extra information from the future, beyond the predicted range of the indices. The predicted spatiotemporal patterns often have skill comparable to that of the indices.

Data Assimilation with Noisy Lagrangian Tracers: Information Barrier, Model Error, and Practical Strategies

Lagrangian tracers are drifters and floaters that follow the movement of a fluid parcel. Data assimilation with Lagrangian tracers is an important inverse problem that aims at recovering the underlying velocity field from tracer observations. Combining the information in the underlying dynamics and the observations serves to reduce error and uncertainty.

Due to the complexity and highly nonlinear nature of Lagrangian data assimilation, there was little systematic analysis based on rigorous theory. Recently, we developed an analytically tractable nonlinear filtering framework for Lagrangian data assimilation, which allows the study of random incompressible and compressible flow fields with full mathematical rigor.
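A common way to write this inverse problem, given here as a sketch with assumed notation, is to let each of the L tracers follow the velocity field with a small random perturbation, and to expand the velocity field in a finite set K of Fourier modes with time-dependent coefficients:

```latex
\begin{aligned}
d\mathbf{X}_{l} &= \mathbf{v}\big(\mathbf{X}_{l}, t\big)\,dt + \sigma_{x}\,d\mathbf{W}_{l},
  \qquad l = 1, \dots, L, \\
\mathbf{v}(\mathbf{x}, t) &= \sum_{\mathbf{k} \in K} \hat{v}_{\mathbf{k}}(t)\,
  e^{i\mathbf{k}\cdot\mathbf{x}}\,\mathbf{r}_{\mathbf{k}}.
\end{aligned}
```

The tracer positions X_l are observed and the Fourier coefficients are not. The positions enter the observation equation through the exponential, which is the source of the strong nonlinearity. If each coefficient is modeled by a linear Gaussian process, the coupled system is conditionally Gaussian given the tracer trajectories.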