Example Use Cases

Computing Turbulence Percentage

This example shows how to generate plots like Fig. 1, which explore the climatological distribution of turbulence.

userguide/_static/multi_diagnostic_f3d_ti1_on_200_light.png

Fig. 1 Probability of encountering light turbulence during December, January and February from 2018 to 2024 at 200 hPa for the three-dimensional frontogenesis (F3D) and turbulence index 1 (TI1) diagnostics.

The first step is to acquire the ECMWF ERA5 calibration data from CDS. The rojak command below requests 6-hourly data for the 1st and 15th of every month from 1980 to 1989 and places it in the folder met_data/era5/calibration_data. It uses the default configuration for clear air turbulence (CAT) to specify the variables to request, the product type, the pressure levels and the data format.

Attention

This step will download 42GB of data. Moreover, depending on the CDS queue, this may take several hours.

Listing 1 retrieve-calibration-data-command
$ rojak data meteorology retrieve -s era5 -y 1980 -y 1981 -y 1982 -y 1983 -y 1984 -y 1985 -y 1986 -y 1987 -y 1988 -y 1989 -m -1 -d 1 -d 15 -n pressure-level -p 175 -p 200 -p 225 --default-name cat -o met_data/era5/calibration_data

Hint

If the above command fails due to the CDS API, check out the getting started guide on Meteorological Data: ERA5.
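Typing ten -y flags by hand is error-prone, so the argument list can also be assembled programmatically. The sketch below is a minimal illustration that uses only the flags shown in Listing 1; the helper build_retrieve_command and its parameters are hypothetical and not part of rojak.

```python
import shlex


def build_retrieve_command(years, months, days, pressure_levels, output_dir):
    """Assemble the argument list for `rojak data meteorology retrieve`.

    Only flags that appear in Listing 1 are used. This helper is a
    hypothetical convenience and is not provided by rojak.
    """
    args = ["rojak", "data", "meteorology", "retrieve", "-s", "era5"]
    for year in years:
        args += ["-y", str(year)]
    for month in months:
        args += ["-m", str(month)]
    for day in days:
        args += ["-d", str(day)]
    args += ["-n", "pressure-level"]
    for level in pressure_levels:
        args += ["-p", str(level)]
    args += ["--default-name", "cat", "-o", output_dir]
    return args


calibration_args = build_retrieve_command(
    years=range(1980, 1990),
    months=[-1],  # -1 is the value Listing 1 passes to request every month
    days=[1, 15],
    pressure_levels=[175, 200, 225],
    output_dir="met_data/era5/calibration_data",
)
command = shlex.join(calibration_args)
print(command)
```

The printed string can be pasted into a shell, or calibration_args can be handed to subprocess.run(calibration_args, check=True) to launch the download directly.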

The calibration data is used to compute the threshold values to determine whether light turbulence is present for a given turbulence diagnostic. The next step is to request the data for the evaluation dataset. In Fig. 1, data from the boreal winter (i.e. December, January, and February) of 2018 to 2024 was used.

Attention

This step will download 85GB of data. Moreover, depending on the CDS queue, this may take several hours to a day.

Listing 2 retrieve-evaluation-data-command
$ rojak data meteorology retrieve -s era5 -y 2018 -y 2019 -y 2020 -y 2021 -y 2022 -y 2023 -y 2024 -m 12 -m 1 -m 2 -d -1 -n pressure-level -p 175 -p 200 -p 225 --default-name cat -o met_data/era5/evaluation_data
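To get a feel for how much data the evaluation request covers, the number of time steps can be counted from the flags in Listing 2. This is only a rough sketch: it assumes that -d -1 selects every day of the month (by analogy with -m -1 in Listing 1) and that the evaluation data is 6-hourly like the calibration data.

```python
import calendar

YEARS = range(2018, 2025)  # -y 2018 ... -y 2024
MONTHS = (12, 1, 2)  # -m 12 -m 1 -m 2
STEPS_PER_DAY = 4  # assumption: 6-hourly data, as for the calibration request


def days_requested(years, months):
    """Count the calendar days covered when every day of each month is requested."""
    return sum(calendar.monthrange(year, month)[1] for year in years for month in months)


n_days = days_requested(YEARS, MONTHS)
n_time_steps = n_days * STEPS_PER_DAY
print(f"{n_days} days, {n_time_steps} time steps")
```

Under these assumptions the request spans 632 days, or 2528 time steps. Because the years and months are combined per calendar year, the request includes January and February 2018 as well as December 2024.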

With these datasets, we can now use rojak to run the analyses. Listing 3 is a YAML file that controls the analysis performed by rojak. In this configuration file:

  1. Line 1: Defines the configuration for the input data. Here, it only specifies the spatial domain of the analysis, which in this case is the entire globe.

  2. Line 11: This is the start of the settings for the CAT analysis. This corresponds to the rojak.orchestrator.configuration.TurbulenceConfig.

  3. Line 12: Within the turbulence config, the first setting that needs to be set is the chunking (see Dask docs on Chunks). This specifies the size of each Dask array chunk. For simplicity, it has been set to the size of the spatial domain.

  4. Line 16: A list of the turbulence diagnostics on which the analysis is to be performed. Each item in the list must be a string from the enum.StrEnum class rojak.orchestrator.configuration.TurbulenceDiagnostics.

  5. Line 19: Defines which phases to run and corresponds to the class rojak.orchestrator.configuration.TurbulencePhases.

  6. Line 20: Specifies what should occur during the calibration phase and, if applicable, where the calibration data is stored. This corresponds to the rojak.orchestrator.configuration.TurbulenceCalibrationPhases class.

  7. Line 21: Contains the required configuration for the calibration phase, such as where the data is stored in calibration_data_dir. This corresponds to the rojak.orchestrator.configuration.TurbulenceCalibrationConfig. As the thresholds phase is specified under line 29, the percentile thresholds for each turbulence intensity need to be provided.

  8. Line 29: Specifies which phases to run during the calibration phase. This corresponds to the enum.StrEnum rojak.orchestrator.configuration.TurbulenceCalibrationPhaseOption.

  9. Line 32: Contains the required configuration for the evaluation phase. This corresponds to the rojak.orchestrator.configuration.TurbulenceEvaluationConfig.

  10. Line 33: Specifies which phases to run during the evaluation phase. This corresponds to the enum.StrEnum rojak.orchestrator.configuration.TurbulenceEvaluationPhaseOption. As "probabilities" has been specified, the thresholds computed for the turbulence diagnostics during the calibration phase are used. The option "edr" maps the diagnostics to EDR values using the distribution computed during the calibration phase.

Listing 3 turbulence-probability-config.yaml
 1data_config:
 2    spatial_domain:
 3        maximum_latitude: 90.0
 4        maximum_longitude: 180.0
 5        minimum_latitude: -90.0
 6        minimum_longitude: -180.0
 7image_format: eps
 8name: eighties
 9output_dir: output
10plots_dir: plots
11turbulence_config:
12    chunks:
13        pressure_level: 3
14        latitude: 721
15        longitude: 1440
16    diagnostics:
17        - f3d
18        - ti1
19    phases:
20        calibration_phases:
21            calibration_config:
22                calibration_data_dir: met_data/era5/calibration_data
23                percentile_thresholds:
24                    light: 97.0
25                    light_to_moderate: 99.1
26                    moderate: 99.6
27                    moderate_to_severe: 99.8
28                    severe: 99.9
29            phases:
30                - thresholds
31                - histogram
32        evaluation_phases:
33            phases:
34                - probabilities
35                - edr
36            evaluation_config:
37                evaluation_data_dir: met_data/era5/evaluation_data
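The relationship between percentile_thresholds and the "probabilities" phase in Listing 3 can be illustrated with a small pure-Python sketch. It assumes that a diagnostic indicates light turbulence when its value meets or exceeds the 97th percentile of the calibration data. The synthetic samples and the percentile helper are purely illustrative and are not rojak's implementation.

```python
import random


def percentile(values, q):
    """Return the q-th percentile (0 to 100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Synthetic stand-ins for one diagnostic; real values would come from ERA5.
rng = random.Random(0)
calibration_values = [rng.lognormvariate(0.0, 1.0) for _ in range(20_000)]
evaluation_values = [rng.lognormvariate(0.0, 1.0) for _ in range(5_000)]

# Calibration: the 97th percentile marks the onset of "light" turbulence,
# mirroring `light: 97.0` under percentile_thresholds.
light_threshold = percentile(calibration_values, 97.0)

# Evaluation: fraction of samples at or above the calibrated threshold.
n_exceeding = sum(value >= light_threshold for value in evaluation_values)
probability_light = n_exceeding / len(evaluation_values)
print(f"threshold={light_threshold:.3f}, probability={probability_light:.2%}")
```

In the real analysis the fraction would be taken over time at each grid point, which is what produces a map such as Fig. 1 rather than a single number.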

This configuration can be launched using the command below:

$ rojak run turbulence-probability-config.yaml

To monitor the progress of the process through the Dask Dashboard Diagnostics, go to http://localhost:8787/status.

Note

This configuration was run on an HPC system with 920GB of memory. It may not require that much, but it is likely to need at least 42GB. Moreover, as the default behaviour of Dask (see Configuration) is to write to /tmp, its ability to spill to disk may be limited to the amount of memory in your system.

EDR Snapshot
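When tuning memory use, it can help to know how large a single chunk is. The sketch below estimates the size of one chunk of one variable from the chunks entry in Listing 3. The element sizes are assumptions, since the data type rojak uses is not stated here, and the figure covers a single time step because the chunks entry lists no time dimension.

```python
from math import prod

# Chunk shape from lines 12-15 of Listing 3.
CHUNK_SHAPE = {"pressure_level": 3, "latitude": 721, "longitude": 1440}


def chunk_mebibytes(chunk_shape, bytes_per_value):
    """Size in MiB of one chunk of a single variable at a single time step."""
    return prod(chunk_shape.values()) * bytes_per_value / 2**20


# Assumed element sizes: 4 bytes (float32) and 8 bytes (float64).
for dtype_name, itemsize in {"float32": 4, "float64": 8}.items():
    size = chunk_mebibytes(CHUNK_SHAPE, itemsize)
    print(f"{dtype_name}: {size:.1f} MiB per chunk")
```

Multiplying this figure by the number of variables, time steps and intermediate arrays held at once gives a rough lower bound on the working memory a worker needs.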

By default, rojak does not produce snapshot plots. These can be produced in a few different ways. The method shown below is the least involved and thus has the least room for customisation.

userguide/_static/multi_edr_f3d_ti1.png

Fig. 2 6-hour forecast of eddy dissipation rate (EDR) at 200 hPa for the three-dimensional frontogenesis (F3D) and turbulence index 1 (TI1) on the 1st of December 2024 at 00:00

Instead of performing the analysis globally as in the previous section, the domain can be limited (as in Fig. 2) by specifying it as in lines 2-6 of Listing 4.

Listing 4 edr-snapshot-config.yaml
 1data_config:
 2    spatial_domain:
 3        maximum_latitude: 70
 4        maximum_longitude: 60
 5        minimum_latitude: 0
 6        minimum_longitude: -150
 7image_format: png
 8name: eighties
 9output_dir: output
10plots_dir: plots
11turbulence_config:
12    chunks:
13        pressure_level: 3
14        latitude: 721
15        longitude: 1440
16    diagnostics:
17        - f3d
18        - ti1
19    phases:
20        calibration_phases:
21            calibration_config:
22                calibration_data_dir: met_data/era5/calibration_data
23            phases:
24                - histogram
25        evaluation_phases:
26            phases:
27                - edr
28            evaluation_config:
29                evaluation_data_dir: met_data/era5/evaluation_data

Note

The configuration in Listing 4 assumes that the steps in Listing 1 and Listing 2 have been performed.
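To see how much smaller the limited domain is, the number of grid points along each axis can be estimated from the bounds in lines 2-6 of Listing 4. This sketch assumes a regular 0.25 degree grid, inferred from the 721 latitude points in the global configuration, and assumes that both bounds are inclusive. Neither assumption is confirmed by rojak's documentation here.

```python
GRID_SPACING = 0.25  # degrees; assumption inferred from the 721 x 1440 global grid


def axis_points(minimum, maximum, spacing=GRID_SPACING):
    """Number of grid points between two bounds, counting both end points."""
    return round((maximum - minimum) / spacing) + 1


# Bounds from lines 2-6 of Listing 4.
n_latitude = axis_points(0, 70)
n_longitude = axis_points(-150, 60)
print(f"latitude: {n_latitude}, longitude: {n_longitude}")
```

If these counts hold, the chunk sizes in Listing 4 (721 by 1440) exceed the subdomain in both horizontal dimensions, so each chunk would presumably span the whole subdomain.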

The script in Listing 5 uses the rojak.orchestrator.turbulence.TurbulenceLauncher to execute the configuration in Listing 4. It then uses the outcome from the evaluation stage to plot the EDR values in the first time step at 200 hPa.

Listing 5 edr-snapshot.py
import sys
from pathlib import Path
from typing import TYPE_CHECKING

import cartopy.crs as ccrs
import xarray as xr
from dask.distributed import Client

from rojak.orchestrator.configuration import Context as ConfigContext
from rojak.orchestrator.configuration import TurbulenceEvaluationPhaseOption
from rojak.orchestrator.turbulence import TurbulenceLauncher
from rojak.plot.turbulence_plotter import (
    chain_diagnostic_names,
    create_configurable_multi_diagnostic_plot,
)
from rojak.plot.utilities import (
    StandardColourMaps,
    get_a_default_cmap,
)

if TYPE_CHECKING:
    from rojak.orchestrator.turbulence import EvaluationStageResult

if __name__ == "__main__":
    # Start dask to run in distributed manner
    client: Client = Client()

    # Load config file passed on as first argument
    config_file = Path(sys.argv[1])
    assert config_file.exists()
    assert config_file.is_file()
    assert config_file.suffix in {".yaml", ".yml"}

    # Deserialize data stored in yaml file
    context = ConfigContext.from_yaml(config_file)

    # Launch the turbulence analysis to get the result from the evaluation stage
    eval_stage_result: "None | EvaluationStageResult" = TurbulenceLauncher(context).launch()
    assert eval_stage_result is not None

    # Verify that EDR was computed, if it wasn't check input config
    assert TurbulenceEvaluationPhaseOption.EDR in eval_stage_result.phase_outcomes

    # Get computed EDR from the evaluation stage
    edr = eval_stage_result.phase_outcomes[TurbulenceEvaluationPhaseOption.EDR].result
    names = [str(item) for item in eval_stage_result.suite.diagnostic_names()]

    # Plot the first time step at 200hPa
    create_configurable_multi_diagnostic_plot(
        xr.Dataset(
            data_vars={name: diagnostic.isel(time=0).sel(pressure_level=200) for name, diagnostic in edr.items()}
        ),
        names,
        str(context.plots_dir / f"multi_edr_{chain_diagnostic_names(names)}.{context.image_format}"),
        column="diagnostics",
        plot_kwargs={
            "subplot_kws": {
                "projection": ccrs.LambertConformal(
                    central_longitude=(-45),
                    central_latitude=35,
                )
            },
            "cbar_kwargs": {
                "label": "EDR",
                "orientation": "horizontal",
                "spacing": "uniform",
                "pad": 0.02,
                "shrink": 0.6,
                "extend": "max",
            },
            "vmin": 0,
            "vmax": 0.8,
            "col_wrap": min(3, len(names)),
            "cmap": get_a_default_cmap(StandardColourMaps.TURBULENCE_PROBABILITY, resample_to=8),
        },
        savefig_kwargs={"bbox_inches": "tight"},
    )

    # Close dask client
    client.close()

In the environment where rojak is installed, invoking the following command runs the Python script with the config to produce the image in Fig. 2.

$ python edr-snapshot.py edr-snapshot-config.yaml
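Listing 5 checks its command-line argument with assert statements, which Python skips when run with the -O flag and which fail without a descriptive message. As a possible alternative, the sketch below validates the argument explicitly. The helper resolve_config_path is hypothetical and not part of rojak.

```python
from pathlib import Path


def resolve_config_path(argv):
    """Return the YAML config path given on the command line, or raise a clear error.

    Hypothetical helper; it performs the same checks as the asserts in Listing 5.
    """
    if len(argv) != 2:
        raise SystemExit(f"usage: {Path(argv[0]).name} CONFIG.yaml")
    config_file = Path(argv[1])
    if not config_file.is_file():
        raise FileNotFoundError(f"config file not found: {config_file}")
    if config_file.suffix not in {".yaml", ".yml"}:
        raise ValueError(f"expected a .yaml or .yml file, got: {config_file.name}")
    return config_file
```

Inside the script, config_file = resolve_config_path(sys.argv) would then replace the four lines that load and check the path.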