Example Use Cases#

Computing Turbulence Percentage#

This example shows how to generate plots such as Fig. 1 to explore the climatological distribution of turbulence.

userguide/_static/multi_diagnostic_f3d_ti1_on_200_light.png — Fig. 1 Probability of encountering light turbulence during the months December, January, February from 2018 to 2024 at 200 hPa for the three-dimensional frontogenesis (F3D) and turbulence index 1 (TI1) diagnostics.#

The first step in the process is acquiring the calibration ECMWF ERA5 data from CDS. The rojak command below requests 6-hourly data from the 1st and 15th of every month from 1980 to 1989 and places it in the folder met_data/era5/calibration_data. It uses the default configuration for clear air turbulence (CAT) to specify which variables to request, the product type, which pressure levels and the data format.

Attention

This step will download 42GB of data. Moreover, depending on the CDS queue, this may take several hours.

Listing 1 retrieve-calibration-data-command#

$ rojak data meteorology retrieve -s era5 -y 1980 -y 1981 -y 1982 -y 1983 -y 1984 -y 1985 -y 1986 -y 1987 -y 1988 -y 1989 -m -1 -d 1 -d 15 -n pressure-level -p 175 -p 200 -p 225 --default-name cat -o met_data/era5/calibration_data

Hint

If the above command fails due to the CDS API, check out the getting started guide on Meteorological Data: ERA5.

The calibration data is used to compute the threshold values to determine whether light turbulence is present for a given turbulence diagnostic. The next step is to request the data for the evaluation dataset. In Fig. 1, data from the boreal winter (i.e. December, January, and February) of 2018 to 2024 was used.

Attention

This step will download 85GB of data. Moreover, depending on the CDS queue, this may take several hours to a day.

Listing 2 retrieve-evaluation-data-command#

$ rojak data meteorology retrieve -s era5 -y 2018 -y 2019 -y 2020 -y 2021 -y 2022 -y 2023 -y 2024 -m 12 -m 1 -m 2 -d -1 -n pressure-level -p 175 -p 200 -p 225 --default-name cat -o met_data/era5/evaluation_data
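The repeated -y, -m, -d and -p flags in the two retrieve commands can be tedious to type by hand. The following is a minimal sketch, not part of rojak, that assembles the same argument lists in Python. The helper name build_retrieve_command is hypothetical, and only flags that appear in the listings above are used.

```python
import shlex


def build_retrieve_command(years, months, days, output_dir, pressure_levels=(175, 200, 225)):
    """Assemble the argument list for `rojak data meteorology retrieve`.

    Hypothetical helper: it only repeats the flags shown in the listings above.
    """
    args = ["rojak", "data", "meteorology", "retrieve", "-s", "era5"]
    # Each year, month and day is passed as its own repeated flag
    for flag, values in (("-y", years), ("-m", months), ("-d", days)):
        for value in values:
            args += [flag, str(value)]
    args += ["-n", "pressure-level"]
    for level in pressure_levels:
        args += ["-p", str(level)]
    args += ["--default-name", "cat", "-o", output_dir]
    return args


# Calibration data: 1st and 15th of every month (-m -1) from 1980 to 1989
calibration_command = build_retrieve_command(
    years=range(1980, 1990),
    months=[-1],
    days=[1, 15],
    output_dir="met_data/era5/calibration_data",
)

# Evaluation data: every day (-d -1) of December, January and February, 2018 to 2024
evaluation_command = build_retrieve_command(
    years=range(2018, 2025),
    months=[12, 1, 2],
    days=[-1],
    output_dir="met_data/era5/evaluation_data",
)

print(shlex.join(calibration_command))
print(shlex.join(evaluation_command))
```

The printed strings can be pasted into a shell, or the lists can be passed directly to subprocess.run(calibration_command, check=True) to launch the download from Python.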

With these datasets, we can now use rojak to run the analyses. Listing 3 is a YAML file that controls the analysis performed by rojak. In this configuration file,

Line 1: Defines the configuration for the input data. Here, it only specifies the spatial domain of the analysis, which in this case is the entire globe.
Line 11: Starts the settings for the CAT analysis. This corresponds to rojak.orchestrator.configuration.TurbulenceConfig.
Line 12: Within the turbulence config, the first setting is the chunking (see Dask docs on Chunks), which specifies the size of each Dask array chunk. For simplicity, it has been set to the size of the spatial domain.
Line 16: Lists the turbulence diagnostics on which the analysis is to be performed. Each item in the list must be a string from the enum.StrEnum class rojak.orchestrator.configuration.TurbulenceDiagnostics.
Line 19: Defines which phases to run and corresponds to the class rojak.orchestrator.configuration.TurbulencePhases.
Line 20: Specifies what should occur during the calibration phase and, if applicable, where the calibration data is stored. This corresponds to the rojak.orchestrator.configuration.TurbulenceCalibrationPhases class.
Line 21: Contains the required configuration for the calibration phase, such as calibration_data_dir, the directory where the calibration data is stored. This corresponds to rojak.orchestrator.configuration.TurbulenceCalibrationConfig. As the thresholds phase is specified in line 29, the percentile thresholds for each turbulence intensity need to be provided.
Line 29: Specifies which phases to run during the calibration phase. This corresponds to the enum.StrEnum rojak.orchestrator.configuration.TurbulenceCalibrationPhaseOption.
Line 32: Contains the required configuration for the evaluation phase. This corresponds to rojak.orchestrator.configuration.TurbulenceEvaluationConfig.
Line 33: Specifies which phases to run during the evaluation phase. This corresponds to the enum.StrEnum rojak.orchestrator.configuration.TurbulenceEvaluationPhaseOption. As "probabilities" has been specified, the thresholds computed for the turbulence diagnostics during the calibration phase will be used. The option "edr" performs the mapping to EDR values using the distribution computed during the calibration phase.

Listing 3 turbulence-probability-config.yaml#

data_config:
    spatial_domain:
        maximum_latitude: 90.0
        maximum_longitude: 180.0
        minimum_latitude: -90.0
        minimum_longitude: -180.0
image_format: eps
name: eighties
output_dir: output
plots_dir: plots
turbulence_config:
    chunks:
        pressure_level: 3
        latitude: 721
        longitude: 1440
    diagnostics:
        - f3d
        - ti1
    phases:
        calibration_phases:
            calibration_config:
                calibration_data_dir: met_data/era5/calibration_data
                percentile_thresholds:
                    light: 97.0
                    light_to_moderate: 99.1
                    moderate: 99.6
                    moderate_to_severe: 99.8
                    severe: 99.9
            phases:
                - thresholds
                - histogram
        evaluation_phases:
            phases:
                - probabilities
                - edr
            evaluation_config:
                evaluation_data_dir: met_data/era5/evaluation_data
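
To illustrate what the percentile_thresholds in Listing 3 mean conceptually, the sketch below derives a "light" threshold as the 97th percentile of a calibration sample and then measures how often an evaluation sample exceeds it. This is not rojak's implementation: the data is synthetic, the helper names percentile and exceedance_probability are hypothetical, and a real analysis operates on gridded diagnostics rather than flat lists.

```python
import random


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def exceedance_probability(values, threshold):
    """Fraction of values strictly greater than the threshold."""
    return sum(value > threshold for value in values) / len(values)


# Synthetic stand-ins for a diagnostic's calibration and evaluation values
rng = random.Random(0)
calibration = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]
evaluation = [rng.lognormvariate(0.1, 1.0) for _ in range(10_000)]

# "light: 97.0" means the threshold is the 97th percentile of the calibration data
light_threshold = percentile(calibration, 97.0)

# By construction, about 3% of the calibration values exceed the threshold
calibration_probability = exceedance_probability(calibration, light_threshold)

# The evaluation data is then compared against the same calibration threshold
evaluation_probability = exceedance_probability(evaluation, light_threshold)

print(f"light threshold: {light_threshold:.3f}")
print(f"calibration exceedance: {calibration_probability:.4f}")
print(f"evaluation exceedance: {evaluation_probability:.4f}")
```

The key design point is that the threshold is fixed from the calibration period, so the exceedance fraction in the evaluation period can differ from the nominal 3%.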

This configuration can be launched using the command below:

$ rojak run turbulence-probability-config.yaml

To monitor the progress of the process through the Dask Dashboard Diagnostics, go to http://localhost:8787/status.

Note

This configuration was run on an HPC system with 920GB of memory. It may not require that much, but it is likely to require a minimum of 42GB of memory. Moreover, as the default behaviour of Dask (see Configuration) is to write to /tmp, its ability to spill to disk may be limited to the amount of memory in your system.
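For a rough feel of the chunk sizes used in Listing 3, the sketch below reproduces the 721 by 1440 grid from the global domain and estimates the size of one chunk for a single time step. It assumes ERA5's regular 0.25 degree grid, and the bytes per element are assumptions too, since the listing does not state the data type or the chunk extent along time. The helper names are hypothetical.

```python
def grid_points(minimum, maximum, resolution, include_endpoint):
    """Number of grid points spanning [minimum, maximum] at the given spacing."""
    intervals = round((maximum - minimum) / resolution)
    return intervals + 1 if include_endpoint else intervals


def mebibytes(n_elements, bytes_per_element):
    """Size in MiB of n_elements values of the given width."""
    return n_elements * bytes_per_element / 2**20


RESOLUTION_DEG = 0.25  # assumption: ERA5's regular 0.25 degree grid

# Latitude includes both poles; longitude wraps around, so 180 is not repeated
n_latitude = grid_points(-90.0, 90.0, RESOLUTION_DEG, include_endpoint=True)
n_longitude = grid_points(-180.0, 180.0, RESOLUTION_DEG, include_endpoint=False)
n_pressure_level = 3  # 175, 200 and 225 hPa

elements_per_time_step = n_pressure_level * n_latitude * n_longitude

float32_mib = mebibytes(elements_per_time_step, 4)  # assumption: 32-bit floats
float64_mib = mebibytes(elements_per_time_step, 8)  # assumption: 64-bit floats

print(f"grid: {n_latitude} x {n_longitude}")
print(f"per time step: {float32_mib:.1f} MiB (float32), {float64_mib:.1f} MiB (float64)")
```

These figures cover a single variable at a single time step, so the total footprint grows with the number of time steps, input variables and intermediate arrays held in memory at once.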

EDR Snapshot#

By default, rojak does not produce snapshot plots. They can be produced in a few different ways. The method shown below is the least involved and thus leaves the least room for customisation.

userguide/_static/multi_edr_f3d_ti1.png — Fig. 2 6-hour forecast of eddy dissipation rate (EDR) at 200 hPa for the three-dimensional frontogenesis (F3D) and turbulence index 1 (TI1) on the 1st of December 2024 at 00:00#

Instead of performing the analysis globally as in the previous section, the domain can be limited (as in Fig. 2) by specifying it as in lines 2-6 of Listing 4.

Listing 4 edr-snapshot-config.yaml#

data_config:
    spatial_domain:
        maximum_latitude: 70
        maximum_longitude: 60
        minimum_latitude: 0
        minimum_longitude: -150
image_format: png
name: eighties
output_dir: output
plots_dir: plots
turbulence_config:
    chunks:
        pressure_level: 3
        latitude: 721
        longitude: 1440
    diagnostics:
        - f3d
        - ti1
    phases:
        calibration_phases:
            calibration_config:
                calibration_data_dir: met_data/era5/calibration_data
            phases:
                - histogram
        evaluation_phases:
            phases:
                - edr
            evaluation_config:
                evaluation_data_dir: met_data/era5/evaluation_data

Note

The configuration in Listing 4 assumes that the steps in Listing 1 and Listing 2 have been performed.

The script in Listing 5 uses rojak.orchestrator.turbulence.TurbulenceLauncher to execute the configuration in Listing 4. It then uses the result of the evaluation stage to plot the EDR values for the first time step at 200 hPa.

Listing 5 edr-snapshot.py#

import sys
from pathlib import Path
from typing import TYPE_CHECKING

import cartopy.crs as ccrs
import xarray as xr
from dask.distributed import Client

from rojak.orchestrator.configuration import Context as ConfigContext
from rojak.orchestrator.configuration import TurbulenceEvaluationPhaseOption
from rojak.orchestrator.turbulence import TurbulenceLauncher
from rojak.plot.turbulence_plotter import (
    chain_diagnostic_names,
    create_configurable_multi_diagnostic_plot,
)
from rojak.plot.utilities import (
    StandardColourMaps,
    get_a_default_cmap,
)

if TYPE_CHECKING:
    from rojak.orchestrator.turbulence import EvaluationStageResult

if __name__ == "__main__":
    # Start dask to run in distributed manner
    client: Client = Client()

    # Load config file passed on as first argument
    config_file = Path(sys.argv[1])
    assert config_file.exists()
    assert config_file.is_file()
    assert config_file.suffix in {".yaml", ".yml"}

    # Deserialize data stored in yaml file
    context = ConfigContext.from_yaml(config_file)

    # Launch the turbulence analysis to get the result from the evaluation stage
    eval_stage_result: "None | EvaluationStageResult" = TurbulenceLauncher(context).launch()
    assert eval_stage_result is not None

    # Verify that EDR was computed, if it wasn't check input config
    assert TurbulenceEvaluationPhaseOption.EDR in eval_stage_result.phase_outcomes

    # Get computed EDR from the evaluation stage
    edr = eval_stage_result.phase_outcomes[TurbulenceEvaluationPhaseOption.EDR].result
    names = [str(item) for item in eval_stage_result.suite.diagnostic_names()]

    # Plot the first time step at 200hPa
    create_configurable_multi_diagnostic_plot(
        xr.Dataset(
            data_vars={name: diagnostic.isel(time=0).sel(pressure_level=200) for name, diagnostic in edr.items()}
        ),
        names,
        str(context.plots_dir / f"multi_edr_{chain_diagnostic_names(names)}.{context.image_format}"),
        column="diagnostics",
        plot_kwargs={
            "subplot_kws": {
                "projection": ccrs.LambertConformal(
                    central_longitude=(-45),
                    central_latitude=35,
                )
            },
            "cbar_kwargs": {
                "label": "EDR",
                "orientation": "horizontal",
                "spacing": "uniform",
                "pad": 0.02,
                "shrink": 0.6,
                "extend": "max",
            },
            "vmin": 0,
            "vmax": 0.8,
            "col_wrap": min(3, len(names)),
            "cmap": get_a_default_cmap(StandardColourMaps.TURBULENCE_PROBABILITY, resample_to=8),
        },
        savefig_kwargs={"bbox_inches": "tight"},
    )

    # Close dask client
    client.close()

In the environment where rojak is installed, the following command runs the Python script with the config to produce the image in Fig. 2.

$ python edr-snapshot.py edr-snapshot-config.yaml
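Listing 5 validates its command-line argument with assert statements, which Python skips when it is run with the -O flag. One possible alternative, sketched below with the standard library only, is to let argparse perform the same checks and report a usage error instead. The function names config_path and parse_args are hypothetical and are not part of rojak.

```python
import argparse
from pathlib import Path


def config_path(value: str) -> Path:
    """Convert a command-line string into a Path to an existing YAML file."""
    path = Path(value)
    if not path.is_file():
        raise argparse.ArgumentTypeError(f"{value} is not an existing file")
    if path.suffix not in {".yaml", ".yml"}:
        raise argparse.ArgumentTypeError(f"{value} does not end in .yaml or .yml")
    return path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the single positional argument: the configuration file."""
    parser = argparse.ArgumentParser(description="Plot an EDR snapshot from a rojak configuration")
    parser.add_argument("config_file", type=config_path, help="path to the YAML configuration file")
    # argv=None makes argparse read sys.argv[1:], as a script normally would
    return parser.parse_args(argv)
```

Inside the script, config_file = parse_args().config_file would then take the place of the three assertions on the path.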
