Documentation of the Changes in the COSMO-Model
Version 4.25

26.09.2012

Version 4.25 is, hopefully, one of the last big steps towards COSMO-Model 5.0. Several larger modifications have been implemented, e.g. the new tracer module, the option of asynchronous NetCDF I/O, and a modified and optimized asynchronous GRIB(1) I/O.

1. Implementation of new Tracer Module
2. Implementation of 2-moment microphysics - Interfaces
3. Optimizations and cleanup in the asynchronous GRIB IO module: mpe_io2.f90
4. Implementation of new asynchronous NetCDF I/O strategy
5. Implementation of another option for the global communication in the output
6. Update of the Multi-Layer Snow Model
7. Usage of the field soiltyp
8. Limiting the ntag-value for the boundary exchange
9. IFS Convection scheme for the CLM
10. New Grib values
11. Bug Fixes and Technical Changes
12. Changes to the Namelists
13. Changes of Results

1. Implementation of new Tracer Module

(by Anne Roches, Oliver Fuhrer)

A generic handling of tracers has been implemented in the COSMO-Model. It is already used for the microphysics tracers qv, qc, qi, qr, qs and qg.

A detailed documentation of how to use this new tracer module will be published soon as a COSMO Technical Report.

2. Implementation of 2-moment microphysics - Interfaces

(by Uli Blahak)

Only the interfaces to the 2-moment scheme (together with some other necessary changes) have been implemented in the COSMO-Model, using ifdef TWOMOM_SB (two-moment scheme from Seifert and Beheng). If this macro is not defined during compilation, the model does not change at all.

To get the source code and further documentation of the code, please contact the colleagues at DWD.

3. Optimizations and cleanup in the asynchronous GRIB IO module: mpe_io2.f90

(by Florian Prill)

The module mpe_io.f90 serves a number of GRIB1 I/O related purposes in the COSMO-Model.

Write operation: Writing can be done in one of two modes:
- the traditional mode, where I/O is done by the first compute processes,
- a truly asynchronous mode, where dedicated I/O processes receive data from the compute processes and write it without blocking the computations.
Read operation: The reading of GRIB files is inherently sequential, so it is always carried out by a single I/O process. In principle, reading is a non-blocking operation for the compute PEs. However, it can act as a barrier when the input data are actually required for the next compute step.
Writing ready-files: The mpe_io implementation supports the use of ready-files, i.e. small files indicating the completion of an output step. Ready-files are used within DWD's NWP suite to handle interdependencies of programs.
Database support: mpe_io.f90 could be configured so that read and write operations were done directly from / to DWD's relational Oracle database. This feature has never been used because of performance problems.

Several changes have now been made to the asynchronous GRIB1 I/O module mpe_io.f90, and some of its interfaces have been modified as a result. We therefore decided to give the module a new name: it is now called mpe_io2.f90 (other models at DWD still use mpe_io.f90).

The modifications are:

Removed database support: The Namelist group /DATABASE/ in INPUT_IO is no longer necessary.
Added a "pre-fetching mode": Pre-fetching strives to avoid blocking of the compute PEs due to reading boundary data. In this mode, boundary data are read ahead of time: when the forthcoming I/O operation will be the input of a GRIB file, this read is performed simultaneously with the preceding compute steps.
Ready-files (bug correction): Ready-files are now written only if all files of an output step have been written.

New Namelist parameter in /IOCTL/:

Group	Name	Meaning	Default
`/IOCTL/`	`lprefetch_io`	Enables reading of boundary files ahead of time: when the forthcoming I/O operation is the reading of a GRIB file, it is carried out simultaneously with the preceding compute steps. Prefetching can only be enabled with true asynchronous I/O (`lasync_io=.TRUE.`).	`.FALSE.`
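
The idea behind pre-fetching can be illustrated with a small Python sketch. This is only an analogy: the model uses dedicated MPI I/O processes, whereas the sketch uses a single background thread, and the names `read_boundary`, `compute` and `run` are hypothetical stand-ins that do not exist in the COSMO code.

```python
from concurrent.futures import ThreadPoolExecutor


def read_boundary(step):
    """Hypothetical stand-in for reading one GRIB boundary file."""
    return f"boundary data for step {step}"


def compute(step, boundary):
    """Hypothetical stand-in for the compute steps that use the boundary data."""
    return f"step {step} used {boundary}"


def run(n_steps):
    results = []
    # One worker plays the role of the single reading I/O process.
    with ThreadPoolExecutor(max_workers=1) as io_pe:
        pending = io_pe.submit(read_boundary, 0)
        for step in range(n_steps):
            # Blocks only if the read has not finished yet.
            boundary = pending.result()
            if step + 1 < n_steps:
                # Pre-fetch: start the next read before computing this step.
                pending = io_pe.submit(read_boundary, step + 1)
            results.append(compute(step, boundary))
    return results


print(run(3))
```

The read for step n+1 is started before the computation of step n, so the computation waits only when the data are needed and not yet available.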

4. Implementation of new asynchronous NetCDF I/O strategy

(by Carlos Osuna)

An asynchronous solution for the output of NetCDF files has been implemented. Through namelist configuration, asynchronous I/O for NetCDF can be enabled by reserving dedicated I/O PEs, which receive output levels from the compute PEs and write them (asynchronously) to disk.

The switch lasync_io is used to enable the asynchronous mode. Two new namelist switches in the group /RUNCTL/ control the configuration of the asynchronous NetCDF I/O:

Group	Name	Meaning	Default
`/RUNCTL/`	`num_asynio_comm`	To choose the number of asynchronous I/O communicators for NetCDF. With several communicators it is possible to parallelize the output over the files to be written (the `GRIBOUT` namelists).	`0`
`/RUNCTL/`	`num_iope_percomm`	To choose the number of asynchronous I/O processes per communicator for NetCDF I/O. With several processes per communicator it is possible to do a parallel writing of single files. This is only possible, if the parallel NetCDF library is available and the code has been compiled with the preprocessor directive `-DPNETCDF`	`0`

The number of asynchronous I/O PEs is then computed as nc_asyn_io = num_asynio_comm * num_iope_percomm.

If both namelist switches are > 0 and lasync_io=.TRUE., asynchronous NetCDF I/O is selected as the I/O strategy. Note that the code for sequential NetCDF I/O (lasync_io=.FALSE.) has not been modified.

NOTE:

Asynchronous NetCDF and asynchronous GRIB are NOT compatible. Therefore, if lasync_io=.TRUE., all output files have to be either in NetCDF or in GRIB format!
num_iope_percomm > 1 is only allowed if parallel NetCDF is available. Then, the code has to be compiled with the preprocessor flag -DPNETCDF.

Using num_iope_percomm=1 and nc_asyn_io=1 may give good performance results in most of the cases where the amount of data being written is moderate.
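
The rules above can be summarised in a short Python sketch. It is an illustration under stated assumptions, not the model's code: the function name `async_netcdf_layout` and the argument `pnetcdf_available` (standing for a build with -DPNETCDF) are hypothetical, and raising an error for an invalid combination is an assumption about how such a check might behave.

```python
def async_netcdf_layout(lasync_io, num_asynio_comm, num_iope_percomm,
                        pnetcdf_available=False):
    """Return (nc_asyn_io, use_async_netcdf) for the given switches."""
    # More than one I/O process per communicator requires parallel NetCDF.
    if num_iope_percomm > 1 and not pnetcdf_available:
        raise ValueError("num_iope_percomm > 1 requires parallel NetCDF")

    # Number of asynchronous I/O PEs, as given in the text.
    nc_asyn_io = num_asynio_comm * num_iope_percomm

    # Asynchronous NetCDF I/O is selected only if both switches are > 0
    # and lasync_io is set.
    use_async_netcdf = (lasync_io and num_asynio_comm > 0
                        and num_iope_percomm > 0)
    return nc_asyn_io, use_async_netcdf


print(async_netcdf_layout(True, 1, 1))
print(async_netcdf_layout(True, 2, 4, pnetcdf_available=True))
```

With one communicator and one process per communicator, a single dedicated I/O PE is reserved; with 2 communicators of 4 processes each, 8 PEs are reserved.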

5. Implementation of another option for the global communication in the output

(by Oliver Fuhrer)

A switchable gathering of vertical levels (2D fields) on the compute PEs has been implemented. Up to now, a separate communication was started for every level. The new option allows nproc levels to be gathered at the same time (where nproc is the number of processors used).

To choose the desired option, a new Namelist switch itype_gather has been implemented in the group /IOCTL/:

Group	Name	Meaning	Default
`/IOCTL/`	`itype_gather`	To choose the type of gathering output fields: 1: gather the fields by an extra communication per level (default); 2: gather fields by one communication for nproc levels (with MPI_ALLTOALLV)	`1`

Tests using debug options gave bit-reproducible results for three runs: a run with the original code, a run with the new code using itype_gather=1, and a run with the new code using itype_gather=2. For a full COSMO-2 24h forecast on a Cray, this optimization decreases the total simulation time by 8%.

On the NEC SX-9, using only 16 CPUs, no significant difference between the two options can be seen (Uli Schaettler).
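
As a rough illustration of why the second option can reduce communication overhead, the following Python sketch counts the gather operations needed for one 3D field. It assumes that with itype_gather=2 exactly nproc levels are handled per communication, so that ceil(nlev / nproc) communications are needed; the function name `gather_rounds` and this simple cost model are assumptions, not taken from the model code.

```python
import math


def gather_rounds(nlev, nproc, itype_gather):
    """Number of gather communications for a field with nlev levels."""
    if itype_gather == 1:
        # One communication per level.
        return nlev
    if itype_gather == 2:
        # Assumed model: nproc levels are gathered per communication.
        return math.ceil(nlev / nproc)
    raise ValueError("itype_gather must be 1 or 2")


print(gather_rounds(50, 16, 1))
print(gather_rounds(50, 16, 2))
```

For a hypothetical field with 50 levels on 16 processors, this gives 50 communications for the first option and 4 for the second.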

6. Update of the Multi-Layer Snow Model

(by Ekaterina Machulskaya)

To avoid numerical instability caused by very low snow surface temperatures in the case of very small snow heights, a minimal critical value of the snow height has been introduced, which depends on the time step and on the heat fluxes through the upper and lower boundaries of the snowpack. If the snow height is smaller than this critical value, the multi-layer snow model switches to an algorithm similar to the single-layer snow model.

A correction of the outgoing long-wave radiation flux at the surface has been introduced, which accounts for the low frequency of calls to the radiation routine. The call of the subroutine "normalize" has been eliminated in order to reduce computational costs. Its functionality has been transferred to the subroutine "terra_multlay".

7. Usage of the field soiltyp

(by Oliver Fuhrer)

The field soiltyp is a REAL field, but it contains values that should be INTEGERs. Several routines compare it with integer values. All these comparisons are now implemented with NINT (soiltyp).
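
The following Python sketch shows why rounding before the comparison is safer. The helper `nint` mimics the Fortran intrinsic NINT (nearest integer, halves rounded away from zero); the value 2.9999998 is a made-up example of a REAL field value that is slightly off its intended integer.

```python
import math


def nint(x):
    """Nearest integer, with halves rounded away from zero (like Fortran NINT)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


soiltyp = 2.9999998  # hypothetical value that is meant to be soil type 3

# A direct comparison of the real value with an integer fails ...
print(soiltyp == 3)
# ... while the comparison of the rounded value succeeds.
print(nint(soiltyp) == 3)
```

Python's built-in `round` is not used here because it rounds halves to the nearest even integer, which differs from NINT.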

8. Limiting the ntag-value for the boundary exchange

All calls to exchg_boundaries contain a tag for the identification of the message. In most calls this tag is connected to the time step of the model and in case of climate runs this could be a very large number.

Some MPI implementations strictly limit the maximal value a tag can have, and the model runs into problems in this case. Therefore, in all calls the time step ntstep has been replaced by a variable nexch_tag, which is set to MOD (ntstep, 24*3600/INT(dt)).
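
The effect of this formula can be checked with a small Python sketch. The function name `exchange_tag` is hypothetical; for non-negative ntstep and positive dt, the Python operators `%` and `//` give the same result as the Fortran MOD and integer division used above. The time step of 20 s is only an example value.

```python
def exchange_tag(ntstep, dt):
    """Tag value MOD(ntstep, 24*3600/INT(dt)) for non-negative ntstep."""
    steps_per_day = 24 * 3600 // int(dt)  # integer division, as in Fortran
    return ntstep % steps_per_day


dt = 20.0  # example time step in seconds, giving 4320 steps per day
print(exchange_tag(100, dt))
print(exchange_tag(10_000_000, dt))
```

The tag now stays below the number of time steps per day (here 4320), however long the climate run is.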

9. IFS Convection scheme for the CLM

An interface to the IFS convection scheme has been implemented in organize_physics using conditional compilation: ifdef CLM.

The source code of the scheme itself is not distributed with the official COSMO versions, but has to be obtained from the CLM community. If the code is compiled with -DCLM, the IFS convection scheme can be activated by setting itype_conv=2.

If itype_conv=2 is chosen, the averaging of the convective forcing cannot be activated: lconf_avg must be .FALSE. in this case, and if it is set to .TRUE., the model resets it to .FALSE. automatically.
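
The reset rule can be written down as a small Python sketch. The function name `apply_convection_constraints` and the returned list of messages are hypothetical; the text only states that the switch is reset, not whether or how the model reports it.

```python
def apply_convection_constraints(itype_conv, lconf_avg):
    """Return (lconf_avg, messages) after applying the itype_conv=2 rule."""
    messages = []
    if itype_conv == 2 and lconf_avg:
        # Averaging of the convective forcing is not possible with the
        # IFS scheme, so the switch is reset.
        messages.append("lconf_avg reset to .FALSE. because itype_conv=2")
        lconf_avg = False
    return lconf_avg, messages


print(apply_convection_constraints(2, True))
print(apply_convection_constraints(0, True))
```

For any other value of itype_conv, the sketch leaves lconf_avg unchanged.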

10. New Grib values

Official Grib1 numbers have been assigned to the variables of the multi-layer snow model. Up to now, these variables used element numbers from table 250, which is used at MeteoSwiss, together with the new vertical level type 211. This level type is now officially accepted by WMO. The element numbers are now chosen according to the variable, i.e. T_SNOW_MUL has the same element number as T_SNOW, but a different level type. In addition, all variable names get the suffix '_M' to indicate the multi-layer variable.

For the 2-Moment scheme, one new field for the total vertically integrated hail content has been introduced.

Name	Meaning	iee	itabtyp	ilevtyp
`T_SNOW_M`	Snow temperature	203	201	211
`H_SNOW_M`	Snow depth in m (per layer)	66	2	211
`RHO_SNOW_M`	Snow density in kg / (m**3)	133	201	211
`W_SNOW_M`	Water equivalent of accumulated snow	65	2	211
`WLIQ_SNOW_M`	Liquid water content in the snow	137	201	211
`TQH`	Total hail content vertically integrated	136	201	211

11. Bug Fixes and Technical Changes

fast_waves_sc.f90:
Format change in namelist-parameter itype_bbc_w: The 'new' nomenclature now has the format 1ed instead of only ed. Otherwise e.g. the case 04 is not properly recognized, but confused with the case 4 of the 'old' nomenclature.
io_utilities.f90, Subroutine make_fn:
When using a time step dt that does not fit into the I/O time increment (i.e. the I/O step is NOT a multiple of dt), the file name was not determined correctly. Some modifications have been made to adjust the file name to the correct time for I/O.
organize_physics.f90:
Prohibit concurrent use of nradcoarse > 1 and nboundlines > 3, which is not working.
Added plausibility checks regarding lrad and itype_aerosols in combination with periodic BCs.
organize_dynamics.f90:
Adapted format of lines in YUSPECIF, so that all variables can properly be written
src_advection_rk.f90:
Better computation of wcon in SR advection_pd (with already advected w) (Michael Baldauf)
New T- and p- advection scheme (internal switch iztype_tppadv=2) only works for odd-order upwind advection schemes (iadv_order=1,3,5), so in case of even advection order, switch back to the old scheme and issue a warning instead of aborting the run (Ulrich Blahak).
src_conv_tiedtke.f90:
Implemented some type conversions (to avoid special compiler warnings)
src_lheat_nudge.f90:
Bug fix for variable min0_lhn (should be known at each processor)
Changes to enable subhourly model starting points (LETKF approach):
- use full information of new get_utc_date routine
- additional call of SR lhn_sumrad if first observation is within current hour (lhm1)
- modifications of obs_time, next_obs_time
Correction of array size of diagnostic variables with respect to local processors.
src_obs_processing.f90:
Patch to enable runs with AOF files: without it, the model tries to reopen the files YUREJECT and YUOBSDR in SR src_obs_cdfin_print.f90, which causes a runtime error.
src_output.f90, Subroutine makepds:
The same situation as in make_fn occurs when determining the reference time for an analysis, when using a dt that does not fit into the output increment. The reference time is adapted accordingly now.
src_input.f90:
- Checks, if files are existing: After every call to make_fn, the extension .nc has to be added, in order to properly check the NetCDF files.
- In the make_fn-call for the restart file, ydirini has been replaced (again) by ydir_restart
src_gridpoints.f90, organize_diagnosis.f90:
- Corrected unit of height of convective clouds in (short) meteograph output
- Increased length of station names for grid point output
src_radiation.f90:
src_setup.f90:
Adapted format of lines in YUSPECIF, so that all variables can properly be written
src_sfcana.f90:
The control of writing the surface analysis fields has been adapted to deal with a time step, which does not fit into a full hour. Now, all surface analysis fields should be written.
time_utilities.f90, Subroutine collect_timings:
The output format for the timings has been widened, so that even very long times should now be displayed properly.

12. Changes to the Namelists

New Namelist variables have been introduced in the following groups:

Group	Name	Meaning	Default
`/RUNCTL/`	`num_asynio_comm`	To choose the number of asynchronous I/O communicators for NetCDF. With several communicators it is possible to parallelize the output over the files to be written (the `GRIBOUT` namelists).	`0`
`/RUNCTL/`	`num_iope_percomm`	To choose the number of asynchronous I/O processes per communicator for NetCDF I/O. With several processes per communicator it is possible to do a parallel writing of single files. This is only possible, if the parallel NetCDF library is available and the code has been compiled with the preprocessor directive `-DPNETCDF`	`0`
`/IOCTL/`	`lprefetch_io`	Enables reading of boundary files ahead of time: when the forthcoming I/O operation is the reading of a GRIB file, it is carried out simultaneously with the preceding compute steps. Prefetching can only be enabled with true asynchronous I/O (`lasync_io=.TRUE.`).	`.FALSE.`
`/IOCTL/`	`itype_gather`	To choose the type of gathering output fields: 1: gather the fields by an extra communication per level (default); 2: gather fields by one communication for nproc levels (with MPI_ALLTOALLV)	`1`

13. Changes of Results

The following changes influence the results:

The implementation of the new tracer module does not change the results, with the exception of the NEC SX-9. Here, very small changes are observed, which might be due to different compiler optimizations.
The implementation of the 2-moment microphysics does not change the results, if the new scheme is not activated.
Bug Fixes in src_radiation.f90
- Wrong index js used for averaging values to fesft input for nradcoarse > 1. This changes the results for COSMO_DE.
- The other modifications cause numerical differences in COSMO_EU and COSMO_DE.
Bug Fix in src_input.f90
Calculation of qrs and rho was missing for initial data. Before, these quantities were used with value 0.0. Now they have a reasonable initialization.
This changes the results for COSMO_DE and COSMO_EU.
Bug Fix in src_advection_rk.f90
Computation of wcon in SR advection_pd: use the already advected w to compute wcon.
Modification in meteo_utilities.f90
Computation of clwc(i,j,k) resp. ql by using a new formula provided by M. Raschendorfer. The old formula led to problems for special settings of q_crit and clc_diag and could produce negative values for ql.
