
Lagrangian Coherent Structure identification for machine learning


ocean-transport/lcs-ml


Simulation and Identification of Lagrangian Coherent Structures

1. Background & Motivation

The ocean is an energetic and turbulent environment with motions ranging from scales of a few centimeters to thousands of kilometers. Interactions across these spatial scales are very important in setting up the large-scale circulation of the ocean, as well as transporting and mixing tracer fields (e.g., heat and salinity).

Ocean turbulence is dominated by mesoscale motions, which tend to self-organize into coherent vortices hundreds of kilometers across. These Lagrangian Coherent Structures (LCSs) trap and transport fluid over long distances and potentially play an important role in regulating climate. Mesoscale eddies are notoriously difficult to parametrize in coarse-resolution ocean models, which makes their overall contribution to the climate uncertain. Furthermore, identifying LCSs requires careful calculations of vorticity along Lagrangian particle paths. The goals of this project are to simulate and identify LCSs using the Lagrangian-Averaged Vorticity Deviation (LAVD) method. A new labeled dataset of LCSs, PV, and strain fields is produced as a training dataset for machine learning applications.

2. Model Initialization

We use pyqg to simulate a two-layer quasigeostrophic (QG) turbulent system driven by eastward mean shear. We configure the model to mimic the dynamics of the Southern Ocean, following the set-up of Zhang et al. (2020). The model is run on a doubly periodic domain that is 1200 km on each side, with a horizontal resolution of 512 x 512 grid points. The parameters of the simulation are stored in a config.yml file.

The model is spun up from an initial random state, and after some time, coherent vortices begin to self-organize. Below are four snapshots of the upper-layer potential vorticity anomaly evolving during the beginning of the initial spin-up.

The model is spun up for 50 years and saved at monthly (30-day) intervals. There are no leap years, so each year has 365 days.
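As a quick sanity check on the numbers quoted so far, the following sketch works out the grid spacing and the number of complete 30-day save intervals in the spin-up. The variable names are illustrative only and are not taken from config.yml, and whether the initial state is also written to disk is not stated here, so only complete intervals are counted.

```python
# Values quoted in this README; the variable names are hypothetical.
DOMAIN_LENGTH_M = 1200e3      # 1200 km domain side, in meters
NX = 512                      # grid points per side
SPIN_UP_YEARS = 50
DAYS_PER_YEAR = 365           # no leap years
SAVE_INTERVAL_DAYS = 30       # "monthly" output

# Grid spacing of the doubly periodic domain.
dx_m = DOMAIN_LENGTH_M / NX

# Length of the spin-up and the number of complete save intervals it contains.
total_days = SPIN_UP_YEARS * DAYS_PER_YEAR
n_intervals = total_days // SAVE_INTERVAL_DAYS
leftover_days = total_days % SAVE_INTERVAL_DAYS

print(f"grid spacing: {dx_m} m")
print(f"{n_intervals} complete 30-day intervals, {leftover_days} days left over")
```

The grid spacing comes out to about 2.3 km, and because 365 is not a multiple of 30, the 30-day outputs drift relative to the model's calendar years.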

As the model gets spun up, the mean eddy kinetic energy (EKE) increases until it reaches an equilibrated state. When the EKE plateaus the model is considered to be in a stable state. The time series of EKE in each layer levels off around 50 years.
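A minimal sketch of how the layer-mean EKE and a plateau check could be computed, assuming eddy velocity arrays `u` and `v` of shape `(layer, y, x)`. The functions `layer_mean_eke` and `has_plateaued`, the window length, and the tolerance are hypothetical illustrations, not the criterion used in this repository.

```python
import numpy as np


def layer_mean_eke(u, v):
    """Domain-mean eddy kinetic energy per layer, 0.5 * <u^2 + v^2>.

    Assumes u and v are eddy velocities with shape (..., ny, nx).
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 0.5 * (u**2 + v**2).mean(axis=(-2, -1))


def has_plateaued(eke_series, window=12, rtol=0.05):
    """Hypothetical plateau test: compare the means of the last two windows.

    Returns True when the mean EKE over the most recent `window` samples
    differs from the mean over the preceding `window` samples by at most
    `rtol` in relative terms.
    """
    e = np.asarray(eke_series, dtype=float)
    if e.size < 2 * window:
        return False
    recent = e[-window:].mean()
    previous = e[-2 * window:-window].mean()
    return bool(abs(recent - previous) <= rtol * abs(previous))
```

With monthly output, `window=12` compares roughly one model year against the previous one; a stricter or looser tolerance may be more appropriate for a given run.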

The equilibrated model state is saved at year 50 and used to initialize the large ensemble of Lagrangian particles.

3. Large Ensemble

The Large Ensemble is initialized with the year-50 PV anomaly field from the equilibrated state, using the same model configuration. Each ensemble member is made slightly different by perturbing the PV anomaly at a single grid cell near the middle of the domain. This small perturbation is enough for the members to diverge. The perturbation for each member is set by a unique seed number, which makes every ensemble member fully reproducible.
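The seeding idea can be sketched as follows. This is an illustration under stated assumptions rather than the repository's implementation: the function name `perturb_pv`, the Gaussian draw, the amplitude, the choice of the upper layer, and the exact center cell are all hypothetical.

```python
import numpy as np


def perturb_pv(q, seed, amplitude=1e-7):
    """Return a copy of the PV anomaly with one seeded single-cell perturbation.

    q         : array of shape (nz, ny, nx), PV anomaly of the equilibrated state
    seed      : integer unique to the ensemble member
    amplitude : hypothetical perturbation scale (same units as q)
    """
    rng = np.random.default_rng(seed)      # same seed -> same perturbation
    q_pert = np.array(q, dtype=float, copy=True)
    ny, nx = q_pert.shape[-2:]
    j, i = ny // 2, nx // 2                # a cell near the middle of the domain
    q_pert[0, j, i] += amplitude * rng.standard_normal()  # upper layer only (assumption)
    return q_pert
```

Because the random generator is built from the seed alone, rerunning a member with the same seed reproduces its initial condition exactly, while different seeds give different perturbations of the same cell.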

A total of 1,048,576 Lagrangian particles (a 1024 x 1024 array) are seeded at every half grid point and advanced using the gridded u and v velocity fields. In addition to the X and Y position, two scalar quantities are computed along particle paths. They are relative vorticity,

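$$\zeta = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y},$$

and strain magnitude,

$$\sigma = \sqrt{\sigma_n^2 + \sigma_s^2},$$

where $\sigma_n$ and $\sigma_s$ are the normal and shear strain respectively, such that

$$\sigma_n = \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}$$

and

$$\sigma_s = \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}.$$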

We evaluate the above derivatives in spectral space and then use the inverse FFT to transform back to physical (grid) space. On a doubly periodic domain, spectral differentiation is both accurate and computationally efficient.
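A minimal NumPy sketch of this spectral calculation, using the standard definitions of relative vorticity (dv/dx - du/dy), normal strain (du/dx - dv/dy), and shear strain (dv/dx + du/dy). It assumes velocity arrays indexed `[y, x]` on a square doubly periodic domain of side `L`. The function name `vorticity_and_strain` is hypothetical, and pyqg has its own spectral machinery that this sketch does not use.

```python
import numpy as np


def vorticity_and_strain(u, v, L):
    """Relative vorticity and strain magnitude via FFT-based derivatives.

    u, v : 2-D arrays indexed [y, x] on a doubly periodic square domain
    L    : domain side length (same length unit as the velocities)
    """
    ny, nx = u.shape
    # Angular wavenumbers along x (axis 1) and y (axis 0).
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=L / nx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=L / ny)
    KX, KY = np.meshgrid(kx, ky)           # both of shape (ny, nx)

    uh = np.fft.fft2(u)
    vh = np.fft.fft2(v)

    def ddx(fh):
        # Differentiate in x: multiply by i*kx, then return to grid space.
        return np.real(np.fft.ifft2(1j * KX * fh))

    def ddy(fh):
        # Differentiate in y: multiply by i*ky, then return to grid space.
        return np.real(np.fft.ifft2(1j * KY * fh))

    ux, uy = ddx(uh), ddy(uh)
    vx, vy = ddx(vh), ddy(vh)

    zeta = vx - uy                          # relative vorticity
    sigma_n = ux - vy                       # normal strain
    sigma_s = vx + uy                       # shear strain
    sigma = np.sqrt(sigma_n**2 + sigma_s**2)
    return zeta, sigma
```

For a smooth periodic field such as `u = sin(y)`, `v = 0`, the result matches the analytic answer (vorticity `-cos(y)`, strain magnitude `|cos(y)|`) to round-off, which is the main appeal of spectral derivatives on this kind of grid.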

The stream function is also saved for the model domain at each time integration.
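For context, the LAVD of a particle is the time integral, along its path, of the absolute difference between its relative vorticity and the domain-mean vorticity at the same instant. The sketch below approximates that integral with the trapezoidal rule from vorticity sampled along particle paths. The function name `lavd` and the array layout are assumptions, not this repository's API. In a doubly periodic domain the domain-mean relative vorticity is zero, so the default `zeta_mean=0.0` is a natural choice here.

```python
import numpy as np


def lavd(zeta_along_path, times, zeta_mean=0.0):
    """Lagrangian-Averaged Vorticity Deviation by trapezoidal integration.

    zeta_along_path : array (nt, nparticles), vorticity sampled along each path
    times           : array (nt,), sample times
    zeta_mean       : scalar or array (nt,), domain-mean vorticity at each time
    """
    zeta = np.asarray(zeta_along_path, dtype=float)
    t = np.asarray(times, dtype=float)
    # Broadcast the mean to one value per time, then align with the particle axis.
    mean = np.broadcast_to(np.asarray(zeta_mean, dtype=float), t.shape)[:, None]
    deviation = np.abs(zeta - mean)
    dt = np.diff(t)[:, None]
    # Trapezoidal rule along the time axis, one value per particle.
    return np.sum(0.5 * (deviation[1:] + deviation[:-1]) * dt, axis=0)
```

Reshaping the result back onto the particles' initial positions gives an LAVD map over the domain, from which vortex candidates are usually extracted in a separate step.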

4. Getting Started

Run and/or modify python scripts:

  1. Fork the lcs-ml repository to your GitHub account.

  2. Clone your fork locally using git.

git clone git@github.com:YOUR_GITHUB_USERNAME/lcs-ml.git

  3. Connect to the upstream repository.

cd lcs-ml
git remote add upstream git@github.com:ocean-transport/lcs-ml.git

  4. To fix a bug or add a feature, create your own branch off "main".

git checkout -b new-branch-name main

  5. Set up a conda environment with all the necessary dependencies.

conda env create -f environment.yml
conda activate lcs-ml

If you need some help with Git, follow this quick start guide.

  6. To rerun the control simulation, submit the batch script via Slurm. You will need to modify the config and output file directories accordingly.

sbatch spin_up.sh

  7. To generate the ensemble, submit this batch script as a job array, again modifying file paths to match your directory. In this example, we are generating 40,000 ensemble members.

sbatch --array=1-40000 ensemble_generator.sh

To check the progress of sbatch jobs, you can either check the auto-generated slurm-******.out file or type

squeue -u [$USER_NAME]
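When a script runs inside a Slurm job array, Slurm exposes the task's index in the `SLURM_ARRAY_TASK_ID` environment variable. One way to give each ensemble member a unique, reproducible seed is to read that index, as in the sketch below. Using the array index as the seed is an assumption made for illustration; the helper `member_seed` is hypothetical and ensemble_generator.sh may pass the seed differently.

```python
import os


def member_seed(environ=os.environ):
    """Return an integer seed taken from the Slurm job-array index.

    Raises RuntimeError when the script is not running inside a job array.
    """
    task_id = environ.get("SLURM_ARRAY_TASK_ID")
    if task_id is None:
        raise RuntimeError(
            "SLURM_ARRAY_TASK_ID is not set; submit with `sbatch --array=...`"
        )
    return int(task_id)
```

With `sbatch --array=1-40000`, each task would then see a distinct value between 1 and 40000.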

Access the labeled dataset:

On Ginsburg

  • Each 90-day ensemble member: /burg/abernathey/users/hillary/lcs/pyqg_ensemble/####.zarr
  • Entire 50-year spin up: /burg/abernathey/users/hillary/lcs/spin_up/spin_up.zarr
  • 50-year snapshot: /burg/abernathey/users/hillary/lcs/spin_up/QG_steady_50.nc

On the Cloud

  • This ensemble will also be made available on the cloud via Pangeo Forge. Details will be posted here.

Load data from Ginsburg
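A minimal sketch for opening these stores with xarray, assuming you are on Ginsburg with read access to the paths listed above and have xarray (with zarr support) in your environment. The helper `open_store` is hypothetical. Because the member file names are given only as `####.zarr`, the usage note lists the directory instead of guessing the numbering format.

```python
from pathlib import Path

# Paths quoted in this README (valid only on Ginsburg).
ROOT = Path("/burg/abernathey/users/hillary/lcs")
SPIN_UP_ZARR = ROOT / "spin_up" / "spin_up.zarr"
STEADY_STATE_NC = ROOT / "spin_up" / "QG_steady_50.nc"
ENSEMBLE_DIR = ROOT / "pyqg_ensemble"


def open_store(path):
    """Lazily open a Zarr store or a NetCDF file as an xarray Dataset."""
    import xarray as xr  # imported here so the paths above work without xarray

    path = Path(path)
    if path.suffix == ".zarr":
        return xr.open_zarr(str(path))
    return xr.open_dataset(str(path))
```

For example, `ds = open_store(SPIN_UP_ZARR)` opens the full spin-up, and `sorted(ENSEMBLE_DIR.glob("*.zarr"))` lists the available ensemble members so that one can be passed to `open_store`.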

Original Project Roadmap

image
