Ensembles Lessons

Multimodel Ensembles

Principles

As we mentioned in an earlier lesson, one way to generate an ensemble is to vary the model physics. Adding models with different physics can generate a better spread of plausible weather forecasts. This has led to efforts to combine different ensembles into multimodel ensemble forecast systems.

At present in the U.S., we have two different multimodel ensembles:

      NAEFS (North American Ensemble Forecast System) combines NCEP's GEFS (Global Ensemble Forecast System) and the Canadian Ensemble Forecast System (CEFS).
      NUOPC (National Unified Operational Prediction Capability) presently combines the GEFS and the Navy's Ensemble Forecast System (NEFS), based on NOGAPS (Navy Operational Global Atmospheric Prediction System) from FNMOC (Fleet Numerical Meteorology and Oceanography Command).

The eventual goal for NUOPC is to combine GEFS, CEFS, and NEFS into one 60-member ensemble. Preliminary verification statistics show that, compared to NAEFS, the resulting ensemble shows noticeable improvement for surface variables but only a small improvement for the upper atmosphere.

Verification

Ensemble Mean Error (RMSE, solid) and Spread (dashed) for 2m Temperature, North America, Fall 2012 for 3 EPSs (NCEP GEFS, CMC CEFS, and FNMOC NEFS)

In this graphic, solid lines show the root mean squared error (RMSE) of the ensemble mean and dashed lines show the spread for North American 2-meter temperatures in fall 2012.

When looking at this diagram, keep in mind a "rule of thumb" for EFSs: the standard deviation or spread should approximately equal the RMSE of the ensemble mean. If the SD is less than the RMSE, too few extreme events will be forecast; if SD is more than the RMSE, too many extreme events will be forecast. We can see here that spread is substantially less than RMSE for all 3 ensembles.
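The rule of thumb above can be checked numerically. The following is a minimal sketch using only the Python standard library and synthetic data. The function names `spread_and_rmse` and `synthetic_cases`, the member count, and all numeric values are assumptions made for illustration; they are not taken from the verification shown in the graphic.

```python
import math
import random
import statistics


def spread_and_rmse(forecasts, observations):
    """Return (spread, RMSE of the ensemble mean) over a set of cases.

    forecasts: list of per-case lists of member values
    observations: list of verifying values, one per case
    """
    squared_errors, variances = [], []
    for members, obs in zip(forecasts, observations):
        squared_errors.append((statistics.fmean(members) - obs) ** 2)
        variances.append(statistics.variance(members))  # sample variance
    rmse = math.sqrt(statistics.fmean(squared_errors))
    # Spread: square root of the case-averaged ensemble variance
    spread = math.sqrt(statistics.fmean(variances))
    return spread, rmse


def synthetic_cases(member_sd, n_cases=2000, n_members=20, seed=0):
    """Hypothetical data: truth scatters with SD 1.0 around a known signal."""
    rng = random.Random(seed)
    forecasts, observations = [], []
    for _ in range(n_cases):
        signal = rng.gauss(0.0, 3.0)
        observations.append(signal + rng.gauss(0.0, 1.0))
        forecasts.append(
            [signal + rng.gauss(0.0, member_sd) for _ in range(n_members)]
        )
    return forecasts, observations


# member_sd=1.0 matches the scatter of the truth; 0.5 is too narrow
for label, member_sd in [("calibrated", 1.0), ("underdispersive", 0.5)]:
    spread, rmse = spread_and_rmse(*synthetic_cases(member_sd))
    print(f"{label}: spread={spread:.2f} rmse={rmse:.2f} "
          f"ratio={spread / rmse:.2f}")
```

In this sketch, a spread-to-RMSE ratio near 1 indicates a well-calibrated ensemble, while a ratio well below 1 corresponds to the underdispersive situation described above, where spread is substantially less than RMSE.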

Ensemble Mean Error (RMSE, solid) and Spread (dashed) for 2m Temperature, North America, Fall 2012 for NCEP GEFS, NAEFS (GEFS + CEFS), and Experimental NUOPC (GEFS + CEFS + NEFS)

Now compare the individual ensembles to multimodel ensembles. Here we see GEFS (for reference), NAEFS (GEFS + CEFS), and an experimental NUOPC that combines all 3 ensembles (GEFS, CEFS, and NEFS).

Which ensemble exhibits the lowest RMSE? (Choose the best answer.)

The correct answer is (c) Experimental NUOPC. This is particularly true at short lead times. Near the end of the period, NAEFS and NUOPC converge.

Which ensemble exhibits the "best" spread? (Choose the best answer.)

The correct answer is (c) Experimental NUOPC. Note how the spread closely tracks the RMSE for NUOPC. NAEFS, with 40 members, performs better than any of the individual ensembles, but not as well as the combined ensemble with 60 members.

Ensemble Mean Error (i.e., Bias, solid) and Mean Absolute Error (dashed) for 2m Temperature, North America, Fall 2012 for 3 EPSs (NCEP GEFS, CMC CEFS, and FNMOC NEFS)

Another graphic for 2-meter temperatures over North America for fall 2012 shows the mean absolute error (MAE) as dashed lines and the ensemble mean error, or bias, as solid lines. The MAE for all three single-model ensembles is approximately equal. However, the biases tell a different story. CEFS and NEFS mean error increases with lead time in a positive direction, while GEFS mean error becomes increasingly negative.
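To see how two forecast systems can share the same MAE while having opposite biases, consider the following minimal sketch. The temperature values and the names `warm_model` and `cold_model` are hypothetical and chosen only to make the arithmetic easy to follow; they do not represent GEFS, CEFS, or NEFS data.

```python
def mean_error(forecasts, observations):
    """Bias: average of (forecast - observation), keeping the sign."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


def mean_absolute_error(forecasts, observations):
    """MAE: average magnitude of the error, ignoring the sign."""
    errors = [abs(f - o) for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


# Hypothetical 2-meter temperatures (degrees C) for four verification cases
observations = [10.0, 12.0, 8.0, 15.0]
warm_model = [o + 1.5 for o in observations]  # always 1.5 degrees too warm
cold_model = [o - 1.5 for o in observations]  # always 1.5 degrees too cold

for name, forecasts in [("warm", warm_model), ("cold", cold_model)]:
    print(name,
          "bias:", mean_error(forecasts, observations),
          "MAE:", mean_absolute_error(forecasts, observations))
```

Because MAE discards the sign of each error, it cannot distinguish a warm bias from a cold one. The mean error keeps the sign, which is why the two measures are plotted together.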

Ensemble Mean Error (i.e., Bias, solid) and Mean Absolute Error (dashed) for 2m Temperature, North America, Fall 2012 for NCEP GEFS, NAEFS (GEFS + CEFS), and Experimental NUOPC (GEFS + CEFS + NEFS)

Here we see the power of combining different models into one EFS. The CEFS high temperature bias combined with GEFS low temperature bias results in nearly no mean bias in NAEFS. The 3-model ensemble has a slightly warmer bias that increases with time. The absolute error in the multimodel ensembles remains similar to that in the single-model ensembles.
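The cancellation of opposing biases can be illustrated with a small sketch. It assumes two hypothetical ensembles with equal member counts, a warm one and a cold one, that share the same case-to-case random error. All values and names (`warm`, `cold`, `pooled`, `shared_error`) are invented for illustration and are not drawn from the NAEFS or NUOPC verification.

```python
def ensemble_means(cases):
    """Ensemble mean for each case, given a list of per-case member lists."""
    return [sum(members) / len(members) for members in cases]


def bias_and_mae(cases, observations):
    """Return (mean error, mean absolute error) of the ensemble mean."""
    errors = [m - o for m, o in zip(ensemble_means(cases), observations)]
    bias = sum(errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return bias, mae


# Hypothetical observations and a random error shared by both ensembles
observations = [10.0, 12.0, 8.0, 15.0]
shared_error = [2.0, -2.0, 1.0, -1.0]

# Two members per case, centered 0.5 degrees too warm or too cold
warm = [[o + e + 0.5 + d for d in (-1.0, 1.0)]
        for o, e in zip(observations, shared_error)]
cold = [[o + e - 0.5 + d for d in (-1.0, 1.0)]
        for o, e in zip(observations, shared_error)]

# Multimodel ensemble: pool the members of both ensembles for each case
pooled = [w + c for w, c in zip(warm, cold)]

for name, cases in [("warm", warm), ("cold", cold), ("pooled", pooled)]:
    bias, mae = bias_and_mae(cases, observations)
    print(f"{name}: bias={bias:+.2f} MAE={mae:.2f}")
```

In this constructed case, pooling removes the systematic bias because the two offsets are equal and opposite, but the MAE is unchanged because the shared random error is still present in every member. Real ensembles will not cancel this neatly.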

Keep in mind that on individual days, the NAEFS and NUOPC ensemble means may show significant random errors, even with the overall improvement.

Overall, preliminary verification studies have shown that adding the FNMOC ensemble to the current NAEFS (NCEP + CMC) to create a 3-model ensemble adds value for most forecast variables. This is particularly true for surface variables, such as the 2-meter temperature shown here. However, there is little improvement for upper-atmosphere variables (for example, 500-mb heights). Researchers observed some forecast degradation relative to NAEFS at short lead times, most likely related to the large spread in NEFS.