
Hints Regarding Domain Creation

Grid Dimensions

Choosing valid domain extents can be tricky. There are a few rules that need to be followed, which are explained next.

Limitations on nx and ny

When choosing the extents of the domains, keep in mind that the resulting number of grid points in x and y direction (nx and ny) must match the processor grid that results from your choice of the number of cores. What does that mean? When assigning 24 processor elements (cores, PEs) to a domain, the domain will be split into 24 subdomains, one per processor. With 24 cores, the resulting processor grid will be a 6 by 4 grid (whose dimensions are called npex and npey). Therefore, nx and ny must be chosen such that (nx+1)/npex and (ny+1)/npey are integers. A bit trickier: for the multigrid pressure solver, these quotients should additionally contain a factor 2^n with n >= 2, i.e. the subdomain sizes should be divisible by 4 or more. A practical approach: choose basic domain extents, check the resulting number of grid points at the chosen resolution, and then optimize the core count per domain based on its factorization into npex and npey.

With wrong nx and ny values, the multigrid solver cannot work correctly and may become unstable.

There is a Python script, palm_gf, that simplifies the grid-finding process. What will mostly work best is a processor grid with 2^n values in x and y direction (32 cores = 4x8, 64 cores = 8x8, 16 cores = 4x4, 8 cores = 2x4, ...). So use nx and ny that are powers of two, and core numbers that are powers of two as well.
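As a quick sanity check besides palm_gf, the divisibility rules above can be expressed in a few lines of Python. This is only a minimal sketch (not the palm_gf script itself); the requirement of at least two factors of 2 per subdomain reflects the 2^n rule of thumb stated above and should be adapted to your multigrid settings.

def check_grid(nx, ny, npex, npey, min_factors_of_two=2):
    """Return a list of problems for the given grid / processor-grid combination."""
    problems = []
    for name, n, npe in (('x', nx, npex), ('y', ny, npey)):
        if (n + 1) % npe != 0:
            problems.append(f'(n{name}+1) = {n + 1} is not divisible by npe{name} = {npe}')
            continue
        sub = (n + 1) // npe          # grid points per subdomain in this direction
        twos = 0
        while sub % 2 == 0:           # count how often the subdomain size can be halved
            sub //= 2
            twos += 1
        if twos < min_factors_of_two:
            problems.append(f'subdomain size {(n + 1) // npe} in {name} contains only {twos} factor(s) of 2')
    return problems

# Example: 24 cores arranged as a 6 x 4 processor grid with nx = ny = 191
print(check_grid(191, 191, 6, 4) or 'looks fine')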

To get maximum usage out of speedflyer, use for example the following number of cores (not exhaustive):

| # Domains | Domain 1 | Domain 2 | Domain 3 | Domain 4 | Domain 5 |
| --- | --- | --- | --- | --- | --- |
| 1 | 64 | | | | |
| 2 | 64 | 32 | | | |
| 3 | 32 | 32 | 32 | | |
| 4 | 32 | 32 | 16 | 16 | |
| 5 | 32 | 32 | 16 | 8 | 8 |

Dynamic Driver Generation with INIFOR

PALM can be forced with COSMO data. PALM is shipped with a routine called INIFOR, which interpolates boundary conditions from hourly COSMO-DE data for the lateral and top boundaries and, furthermore, creates an initial-condition volume field for the domain. Also, geostrophic wind components are extracted from the COSMO pressure field, as well as a surface pressure value.

With its orientation towards data available in Germany, INIFOR is intended to be used with COSMO-DE data, which is available in 2 km resolution and with a specific naming of the variables. Swiss COSMO data, as it has been made available to the ZHAW so far, is in a different format and in 1 km resolution. Apart from this, the variable naming does not seem to be consistent anyway, which requires some modifications to the COSMO files before they can be used in INIFOR. To convert them into an INIFOR-readable format, scripts for the different conventions have been written (the cosmo2inifor routines). Their operation is detailed below. After conversion, INIFOR can be run; this process is also outlined below.

cosmo2inifor

cosmo2inifor is a bash script that converts COSMO-1 data into an INIFOR-appropriate format. Several revisions exist, which support different variable conventions:

| Name | Use with |
| --- | --- |
| cosmo2inifor_revC | lafYYYYMMDDHH.nc files in Swiss format |
| cosmo2inifor_revD | a specific type of file we dealt with at the ZHAW (MT) |
| cosmo2inifor_revE | basically revC, but at some point a HEIGHT coordinate was introduced instead of HHL |

Should you encounter a COSMO output format that cannot be processed with one of the above routines, a specifically tailored routine needs to be set up; it can be derived from the above ones.

These routines make use of

  • nco (4.8.1): ncks, ncrename, ncatted, ncap2
  • netcdf (4.6.1): ncdump

Using cosmo2inifor

  1. On the ZAV cluster (Twins), these modules can be loaded with the command module load nco/4.8.1 netcdf/4.6.1-gnu.

  2. Put the files that are to be converted into a directory together with the cosmo2inifor routine. If you simulate a time span of 1200Z-1600Z, you need the files with the results of hours 12, 13, 14, 15 and 16.

  3. Run the script with ./cosmo2inifor_revX. It will extract an hhl.nc, a soil.nc and, for each hour, a segYYYYMMDDHH-flow.nc and a segYYYYMMDDHH-soil.nc file.

Special case _revD: As this routine is designed to work with a specific Meteotest format that the ZHAW received on a one-time-only basis, there are some more restrictions on its use. For instance, the HHL and soil type information is available in a separate file instead of being included in the 00Z files. Put the files containing HHL and soiltype (meteotest_..._hhl/soiltyp_...00.nc) into a subdirectory called hhlsoil. The file names are hardcoded at the moment, so change them in the cosmo2inifor_revD script (hhlfile and soilfile).

INIFOR

INIFOR will interpolate (bilinearly, as of April 2020) boundary conditions from the COSMO data. It requires a soil.nc, an hhl.nc and, for each hour, a -flow.nc and a -soil.nc file, which all need to be available in a single location. It also requires a namelist containing information about the parent domain (the one that will be forced). It can look like the following:

&inipar nx = 127, ny = 127, nz = 96,
        dx = 32.0, dy = 32.0, dz = 32.0,
    /
&d3par    end_time = 86400.0
    /

Under &inipar, the grid information of the parent is given. Under &d3par, the end time is given. This namelist file is generated automatically when running make_static.py, which takes all the necessary information from the corresponding namelist for the make_static routine. Coordinates are taken from the static driver, so it has to be ensured that this information is correct. Furthermore, even though the static file's origin_time attribute is unused for now, it will be used in the future, so it should be set correctly as well. It has the format YYYY-MM-DD HH:MM:SS +ZZ.

Load required modules to run PALM before running INIFOR. INIFOR is executed as follows in the bash shell:

inifor -p <path> -d <YYYYMMDDHH (in UTC)> -i <init-mode> -n <namelist> --input-prefix <input-prefix> -t <static-driver> -o <output> -a <averaging-angle>

Hints / Troubleshooting:

  • Date format: YYYYMMDDHH. The start time stamp is used; INIFOR knows how far to process from the end_time parameter in the namelist.
  • input-prefix is seg if the files were processed with cosmo2inifor. The default is laf.
  • The averaging angle -a might not work if the value is below 0.2. So far, no real changes were visible when going from an averaging angle of 0.1 to 0.01.
  • If a segmentation fault occurs during processing, create a copy of .palm.config.twins18 and recompile it with palmbuild -c twins18(copy). This seems to recompile INIFOR too; usually it works again after that. Negative averaging angles also lead to a segmentation fault.

VAPOR Visualization

VAPOR is a visualization tool for scientific 3D data. From VAPOR 3 onwards, netCDF files can be imported easily if they conform to the CF-1.7 standard. This requires an "axis" attribute for each coordinate variable (e.g. axis: X for the x variable, and analogously for xu, y, yv, zu_3d, zw_3d, zs_3d). This was missing in PALM until recently; the axis attribute is now added automatically (from March 2020 onwards). Before, it could be added manually, e.g. with ncatted -O -a axis,x,o,c,"X" <file>.

In order to have terrain in the 3D file, this information needs to be copied from the static driver to the 3D file. This can be achieved with the following command:

ncks -A -v <var> <static-file> <3d_output_file>

The variable would be zt in that case. In order to also include lad, copy that variable as well and make sure to assign the axis: Z attribute to the lad variable. The LAD can be displayed as grid volumes in VAPOR, which is not beautiful but at least gives a hint about tree positions. In the same way, buildings_3d can be included in the visualization.

Bear in mind that this is currently only easily doable for netCDF-3 output on the PALM side (netcdf_output_format == 3).
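If you prefer Python over NCO for this step, the same attribute editing and variable copying can be sketched with xarray (the file names below are placeholders, adapt them to your case):

import xarray as xr

static = xr.open_dataset('example_static')       # placeholder name of the static driver
out3d = xr.open_dataset('example_3d.000.nc')     # placeholder name of the 3D output file

# Add the CF axis attributes to the coordinate variables, if they are missing.
for coord, axis in [('x', 'X'), ('xu', 'X'), ('y', 'Y'), ('yv', 'Y'),
                    ('zu_3d', 'Z'), ('zw_3d', 'Z'), ('zs_3d', 'Z')]:
    if coord in out3d.coords:
        out3d[coord].attrs['axis'] = axis

# Copy the terrain height (and, if desired, lad or buildings_3d) from the static driver.
out3d['zt'] = static['zt']

out3d.to_netcdf('example_3d_with_terrain.nc')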

VAPOR works on the basis of VDC files. NetCDF files can be converted into VDC files with a series of VAPOR command-line tools; however, I have not managed to complete the whole process. It should be doable with the following commands, in this order: cfvdccreate <outputfile> <file.vdc>, which creates a header file, and then cf2vdc <outputfile> <file.vdc>, which should populate the file with data from the nc file.


PALM Simulation Lessons Learned

The following questions are written down here for a reason. They are important questions to ask yourself when simulating with PALM.

  • Are the static driver time stamps, end_times and skip_time_data_output parameters in the namelists correct?
  • Are you simulating the correct day?
  • Have you supplied -a "d3#/r restart" when you want to continue your run afterwards? If not, it's all lost.
  • Have you checked that output variables are really only supplied once, especially the _av ones? (These arrays are set up after skip_time_data_output; if you skip large parts of your simulation, an error is raised and the simulation is lost.)
  • A simulation can be started with palmrun ...... -v > logfile &. With this, the output is piped into a logfile and, with &, the command is executed in a subshell. This means that one can log out of the cluster without the palmrun routine being aborted. However, this does not protect the task from all kinds of aborts sent its way. To be totally safe, start a case with nohup palmrun ...... -v > logfile &. This really ensures that even an unexpected logout does not kill the process.
  • When monitoring the simulation progress over a VPN connection with tail -f <logfile>, the simulation may abort with the message -> palmrun finished if the VPN session is interrupted, e.g. because the computer enters energy-saver mode. It has proven more robust to monitor the simulation progress percentage with spot checks and to await the completion of the job via the Ganglia load monitor. This might only apply to cases started without nohup in the palmrun call, but this was not tested.

NCO

# overwrite attribute grid_mapping in variable SOILTYP (if global, write global) by a c(haracter) entry
ncatted -O -a grid_mapping,SOILTYP,o,c,'rotated_pole' soil.nc #to change grid mapping attribute in var SOILTYP to rotated_pole
ncatted -O -a origin_time,global,o,c,'2019-06-25 12:00:00 +02' example_static #to change origin_time attribute manually


#cut a file according to those dimensions given
ncea -d y_1,-0.5,0.5 -d x_1,-1.0,0.0 filein fileout

# add a value to a variable. -O is overwrite, -s is for an arithmetic expression.
ncap2 -O -s 'time=time+39600' <in> <out>

# copy variables from one into another file
ncks -A -v <var> <static-file> <3d_output_file>

# add an attribute to a variable
ncatted -O -a <attname>,<var>,<mode mostly o>,<dtype (c for char)>,"<value>" <file>

# concatenate output files into one large file (only works without errors if the same variables are present in all files)
ncrcat -v var1,var2,var3 inputfiles outputfile
# as an example with multiple input files and all variables:
ncrcat yv-jor-2_av_3d_N02.00{0..5}.nc yv-jor-2_av_3d_N02.nc

# concatenate all files in a folder with increasing cycle number
# !!! Caution: Files with many timestamps (e.g. high-res _ts_ files or probe output) may take a long time! Also make sure to temporarily move files that don't need concatenation to another directory and move them back afterwards.

for f in *.nc; do echo ${f//\.*}; done | sort | uniq | 
    while read prefix; do ncrcat "$prefix".*nc "$prefix".nc; done
    
# select only certain variables and concat them (assuming 10min output, first output at 600s, but you want only every hour)
ncks -d time,5,107,6 -v wspeed,wdir,ta in.000.nc in_subset.000.nc
ncks -d time,5,71,6 -v wspeed,wdir,ta in.001.nc in_subset.001.nc
ncrcat in_subset.00* in_subset.nc
rm in.00*

CDO

Regridding

Make a gridfile (target.grid) of the form:

gridtype   =     lonlat
xsize      =     500
ysize      =     200
xfirst     =     5.7
xinc       =     0.01
yfirst     =     45.70
yinc       =     0.01

and run cdo remapbil,target.grid input output.

The industry standard seems to be remapcon (conservative remapping); remapbil is bilinear remapping.

Subset some variables into a new file

To take a couple of variables and put them into a new file (e.g. if remapping fails due to errors related to single coordinates), you can run

cdo select,name=ASWDIR_S,ATHD_S laf2019060702.nc out.nc

Example: get SW and LW radiation data from lafXX.nc files to put as boundary conditions into the dynamic file. A possible procedure goes as follows:

  • For interesting timesteps: subset ASWDIR_S and ATHD_S to a separate file.

  • Concatenate them into one file with a time series: ncrcat 060{0702..0800}.nc jor1.nc.

  • remap the new file to a grid that equals the parent static driver extents with cdo remapbil,grid.grid jor1.nc jor1rmp.nc.

  • Use xarray to create an average over the spatial dimensions (ds.ASWDIR_S.mean(dim=['lon','lat']))

    >>> import xarray as xr
    >>> ds = xr.open_dataset('jor1rmp.nc')
    >>> ds
    <xarray.Dataset>
    Dimensions:    (lat: 60, lon: 80, time: 25)
    Coordinates:
      * time       (time) datetime64[ns] 2019-06-07T02:00:00 ... 2019-06-08T02:00:00
      * lon        (lon) float64 6.563 6.564 6.565 6.566 ... 6.639 6.64 6.641 6.642
      * lat        (lat) float64 46.73 46.73 46.73 46.73 ... 46.78 46.78 46.78 46.79
    Data variables:
        ASWDIFD_S  (time, lat, lon) float32 ...
        ATHD_S     (time, lat, lon) float32 ...
        ASWDIR_S   (time, lat, lon) float32 ...
    Attributes:
        CDI:             Climate Data Interface version 1.9.5 (http://mpimet.mpg....
        Conventions:     CF-1.6
        history:         Thu May 14 13:30:56 2020: cdo remapbil,grid.grid jor1b.n...
        source:          model: cosmo, production_status: unknown, version: 106, ...
        institution:     MeteoSwiss
        ConventionsURL:  http://cfconventions.org/cf-conventions/v1.6.0/cf-conven...
        NCO:             netCDF Operators version 4.8.1 (Homepage = http://nco.sf...
        CDO:             Climate Data Operators version 1.9.5 (http://mpimet.mpg....
    >>> dsa = ds.mean(dim=['lon', 'lat'])
    >>> dsa
    <xarray.Dataset>
    Dimensions:    (time: 25)
    Coordinates:
      * time       (time) datetime64[ns] 2019-06-07T02:00:00 ... 2019-06-08T02:00:00
    Data variables:
        ASWDIFD_S  (time) float32 0.0 0.0 1.1035653 19.519888 ... 0.0 0.0 0.0 0.0
        ATHD_S     (time) float32 346.2963 339.63437 322.497 ... 352.97894 327.9394
        ASWDIR_S   (time) float32 0.0 0.0 0.0034005991 5.844116 ... 0.0 0.0 0.0 0.0
    >>> dsa.to_netcdf('joravg.nc')
  • Add this time series to the dynamic driver with

    import xarray as xr
    import numpy as np
    import os
    import matplotlib.pyplot as plt
    os.chdir('C:\\Users\\Stefan Fluck\\Desktop\\')
    
    dyn = xr.open_dataset('radtest_dynamic', decode_times = False)
    raddata = xr.open_dataset('joravg.nc')
    
    time_rad = dyn.time.values
    lwdata = raddata.ATHD_S.values
    swdirdata = raddata.ASWDIR_S.values
    swdifdata = raddata.ASWDIFD_S.values
    
    rad_lw_in = xr.DataArray(lwdata, dims = 'time_rad', coords={'time_rad':time_rad} )
    rad_sw_in = xr.DataArray(swdirdata, dims = 'time_rad', coords={'time_rad':time_rad} )
    rad_sw_in_dif = xr.DataArray(swdifdata, dims = 'time_rad', coords={'time_rad':time_rad} )
    
    rad_lw_in.attrs['long_name'] = 'longwave downwelling radiation'
    rad_lw_in.attrs['lod'] = 1
    rad_lw_in.attrs['units'] = 'Wm-2'
    rad_sw_in.attrs['long_name'] = 'shortwave downwelling radiation'
    rad_sw_in.attrs['lod'] = 1
    rad_sw_in.attrs['units'] = 'Wm-2'
    rad_sw_in_dif.attrs['long_name'] = 'shortwave diffuse downwelling radiation'
    rad_sw_in_dif.attrs['lod'] = 1
    rad_sw_in_dif.attrs['units'] = 'Wm-2'
    
    dyn['rad_lw_in'] = rad_lw_in
    dyn['rad_sw_in'] = rad_sw_in
    dyn['rad_sw_in_dif'] = rad_sw_in_dif
    
    dyn.to_netcdf('radtest_withrad_dynamic')

GDAL

Fill empty values in a raster with values from another raster

Example: swissALTI3D data close to the Swiss border contains only empty values outside of Swiss territory. Idea: fill those areas with data from the coarser NASA SRTM data. Problem: inhomogeneous resolutions and CRS (swissALTI3D is LV95 at 2x2 m, NASA SRTM is WGS84 at roughly 30x20 m around Switzerland).

How to:

  • reproject NASA SRTM Dataset to LV95 (QGIS: Raster -> Projections -> Transform)
  • cut NASA SRTM Dataset to a bit larger than the swissAlti3D subset (QGIS: Raster -> Extract -> Cut Raster to Extents)
  • resample NASA SRTM Dataset to swissALTI3D-Resolution (QGIS: right click on layer -> Save as -> set Resolution in the correct boxes)
  • With installed QGIS: Open osgeo4w shell. Without: Install GDAL Command Line Tools under Windows, or do it in Linux. If gdal_merge is not available, run py3_env first, which loads another environment where gdal_merge should be available.
  • gdal_merge -o <outfilename> <infile> <infile2> does the job. The input files are stacked on top of each other; files listed later are placed on top of earlier ones. The resolution information comes from the first one. The output data type is guessed from the file ending of the outfilename (.tif is your best bet).

For every QGIS operation there is a GDAL equivalent; multiple steps can be done with e.g. gdal_translate. The command line is powerful: why point at things in a GUI when you can simply say what you want to do?

Merge multiple geotiff raster files using gdal_translate

Example: terrain data can be downloaded via the swisstopo website, but only as separate tiles. You can specify a rectangle in their map viewer and create a csv containing all the download links required to get the tiles that intersect with your drawn rectangle. These then have to be merged into one large GeoTIFF raster file.

  1. download all data using the created csv file, with the command wget -i <filename>
  2. Open the Osgeo4W Shell that comes with QGIS
  3. enter py3_env, which loads more functions of gdal
  4. navigate to your downloaded raster files. If they are on another drive, you'll have to enter e.g. e: first in the Windows shell; then you can cd around on that drive.
  5. to merge them, we'll go via a vrt file. Run the command gdalbuildvrt out.vrt <*inputfiles>. The input files start with ch.*, in the swisstopo case.
  6. To get a TIFF out of the vrt file, run gdal_translate out.vrt out.tif; it should take a few moments to complete. If you wish to change the resolution as well, you can add the option -tr 1 1 to the above command, which changes the target resolution to 1 meter in the N and E directions.

QGIS Preprocessing Operations

Subsetting a Shape File

In order to select certain features of a shape file based on field values, the "Select by Expression" operation in QGIS can be used. There are multiple ways to achieve the same thing, but one straightforward way is described below.

Open the attribute table and click on the "Select by Expression" button (Ctrl+F3). In the mask that opens, enter your expression in the form on the left. If you want to select by multiple values, separate the conditions with OR, e.g. "OBJEKTART" = 6 OR "OBJEKTART" = 12 (with AND the selection will fail, as OBJEKTART cannot equal two different values at once in this example).

Once the desired features are selected, right click on the layer and export the selected objects by selecting the following buttons.

image-20200520210604404

In the new mask, choose your export settings as desired and export your subset of the data.

Another way would be as follows (see section Resolved Forests):

  1. Export the file as it is under a new name.
  2. Select Objects by expression
  3. Create query for OBJEKTART != 6 AND OBJEKTART != 12 AND OBJEKTART !=13. This selects all objects that are not vegetation.
  4. Delete the selected features and save the file again. Your new file now only contains your desired features.

Merge Raster Layers

Some raster data, such as the SRTM DHM or the DTM/DOM data from the canton of Zurich, is available as multiple tiles. It is often necessary to merge those tiles into one cohesive dataset in order to process it further.

This can be done with the following action:

image-20200520212714236

Then, the following mask pops up:

Select the layers that are to be merged into a single file. Also select a location where the new file will be saved. Start the process with the Start button.

HINT: If the raster merge operation results in a raster with only fill values or zeros (possible in QGIS 3.4), there is a problem with the underlying numpy installation in your QGIS. This link provides more insight into this problem. In essence, you'll need to execute the OSGeo4W.bat file as administrator. In the appearing console, run python -m pip show numpy, which will probably return Version: 1.12.1+mkl. If it does, run

python -m pip uninstall numpy
>> Successfully uninstalled 
python -m pip install numpy
>> Succesfully installed numpy-1.15.4

This should fix your numpy installation within QGIS/OSGeo4W-Project.

Merge Vector Layers

Vector layers can be merged into one file. Before performing this operation, make sure that they share the same feature types (lines, points or polygons).

Vector layers can be merged by using the following operation in QGIS:

image-20200520151248984

This opens the following mask (example from QGIS 3.8):

Procedure:

  1. Select the layers that are to be merged.
  2. Select a location and file name where the merged dataset is to be saved.
  3. Start the operation.

Note: This operation can also be performed in batch mode (button on the bottom left).

Field Calculator

With the field calculator, values of a field (e.g. HEIGHT_TOP) can be updated based on the values of other fields. This can for example be used when setting the values of the field RADIUS for specific street types, among other operations.

In order to use this feature, click on the Field Calculator button in an open attribute table.

This opens the following mask (example from QGIS 3.4):

Here, it can be decided whether a new field is to be generated or an existing field is to be updated. In the text box on the left, a statement can be written that will be executed. In the example, a switch-case structure is implemented: executing this code snippet sets HEIGHT_TOP to 10 if OBJEKTART equals 6, and so on. Features for which no condition is fulfilled are filled with NULL values.

Some Guidance / Examples from Practice:

  • A possible mapping of swissTLM3D street OBJEKTART types to RADIUS might look as follows. The mapping is already written in the style that is required by the field calculator.

    IMPORTANT: This mapping must serve as a starting point and must be checked before applying!

    CASE
        WHEN "OBJEKTART" = 8 THEN 5
        WHEN "OBJEKTART" = 2 THEN 12.5
        WHEN "OBJEKTART" = 6 THEN 5
        WHEN "OBJEKTART" = 10 THEN 3
        WHEN "OBJEKTART" = 9 THEN 4
        WHEN "OBJEKTART" = 11 THEN 2
        WHEN "OBJEKTART" = 15 THEN 1.5
        WHEN "OBJEKTART" = 16 THEN 1
        WHEN "OBJEKTART" = 20 THEN 5
        WHEN "OBJEKTART" = 21 THEN 7.5
        WHEN "OBJEKTART" = 0 THEN 10
        WHEN "OBJEKTART" = 1 THEN 10
    END
  • For railway lines, a RADIUS of 1.5 meters may be appropriate.
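The same mapping can also be applied outside of QGIS, for example with geopandas. A minimal sketch (the file name is a placeholder, and as stated above the mapping itself must be checked before use):

import geopandas as gpd

# OBJEKTART -> RADIUS mapping, mirroring the CASE expression above (check before use!)
radius_map = {8: 5, 2: 12.5, 6: 5, 10: 3, 9: 4, 11: 2,
              15: 1.5, 16: 1, 20: 5, 21: 7.5, 0: 10, 1: 10}

streets = gpd.read_file('tlm_strassen.shp')               # placeholder file name
streets['RADIUS'] = streets['OBJEKTART'].map(radius_map)  # unmapped types become NaN (like NULL)
streets.to_file('tlm_strassen_with_radius.shp')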

Buffer

With the buffer function in QGIS, point and line features can be blown up into polygonal structures. QGIS has an implemented algorithm that allows us to prescribe a radius for each feature as an attribute of the feature itself. There are a couple of alternatives to this operation: there is a GDAL variant available, and the same operation can also be done in geopandas, which is most probably preferable when dealing with large data, as QGIS seems to struggle when operations need to be performed on huge datasets.

The buffer function can be found here:

image-20200520202640402

This opens the following mask (QGIS 3.4):

image-20200520202603670

First, select the dataset that is to be buffered. The RADIUS attribute can be chosen by clicking on the highlighted buttons. Select a location where the file shall be saved and start the algorithm by clicking on "Starte" or "Run".

This results in the following state:

image-20200520203224604

It can be seen that the radius is applied to both sides of the line. Hence, a tree line that should have a width of 10 meters requires a radius of 5 meters. The cap style of the resulting polygon can be modified with the corresponding parameter in the mask.

Hint: The GDAL implementation may be able to perform this buffering based on another field as well, but the options of the GDAL tools are rather hard to understand from the documentation. Another simple way to buffer a line feature is through geopandas with st.geometry = st.geometry.buffer(st.radius), where st is a previously imported geopandas GeoDataFrame, as sketched below.
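A minimal geopandas sketch of that buffering step (file and field names are placeholders; a RADIUS field is assumed to exist, e.g. from the field calculator step above):

import geopandas as gpd

st = gpd.read_file('tree_lines.shp')    # placeholder: line features with a RADIUS field
# Buffer each feature by its own radius; cap_style=2 gives flat caps, 1 gives round caps.
st['geometry'] = st.geometry.buffer(st['RADIUS'], cap_style=2)
st.to_file('tree_lines_buffered.shp')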

Zonal Statistics

Zonal statistics is a powerful tool that can be leveraged to calculate statistics of the raster pixel values that lie within a designated vector polygon. This is especially useful if you know the positions of building footprints but do not have their heights. The same applies to trees, where you may know their position and width, but not their size. For this approach, a preprocessed difference raster between a LIDAR DOM and a LIDAR DTM is required. The following examples are illustrated using the example of Yverdon airfield.

For the cities of Yverdon and Zurich, digital terrain and surface models are available with resolutions of 0.5 or even 0.25 meters. As a reminder, a digital surface model (DOM) contains the surface heights including buildings and trees, whereas a digital terrain model (DTM) only contains the height of the underlying terrain; the latter is often deduced from the former. Here, we leverage the availability of the two files and subtract the DTM data from the DOM data. The result is the heights of the surface-mounted objects.

image-20200520211555925

The DTM subtracted from the DOM. The brighter a pixel, the higher it is. The extract shows the airfield of Yverdon with its tree rows.

In the above image, it is clearly visible that the trees around the airfield of Yverdon are very high. The buildings, in contrast, are quite low and only appear in dark grey. Black represents a value of 0.

For Yverdon, there is a dataset available that contains the positions of building footprints. An overlay with the above image looks as follows:

image-20200520212051670

The DOM-DTM difference raster with a building footprint overlay (red squares).

Now it is possible to calculate statistics for the pixel values that are enclosed by the overlaid polygons. This can be done with the "zonal statistics" algorithm, which is a raster analysis tool. However, in some QGIS versions (e.g. 3.4), this tool cannot be found in the usual spot. It needs to be searched for in the "Toolbox", which can be made visible with Ctrl-Alt-T if it is not already. In this toolbox, search for "zonal" or "Zonenstatistik".

This results in the following window:

On the left side, select the raster that shall be analyzed and the polygon layer that defines the zones. Select the statistics that shall be calculated within each zone based on the raster values. The resulting statistics will appear as new fields for each feature, prefixed with the defined prefix (here _). It makes sense to compute the following statistics:

  • median
  • mean
  • min
  • max

Start the algorithm with the corresponding button. Once it is finished, this will have added four new fields in the attribute table.

It is visible that the selected feature has a median height of 4.7 meters. One could now go ahead and set this value as HEIGHT_TOP with the following method:

  1. Activate layer editing mode
  2. In the quick edit bar, select the field to be changed and to which value (or field) it shall be changed.
  3. Select whether you would like to overwrite HEIGHT_TOP with the _median value for every feature or only for the ones you have selected.

Now you have successfully populated your building footprint dataset with (estimated) building height information.

Practical information: It may not be practical to use the raw median or mean value burnt into the polygons as the HEIGHT_TOP value, because the LIDAR DOM-DTM dataset does not contain relevant values at every pixel within the zones designated by the shape file. A lot of zero values may therefore bias the statistics towards lower values than in reality. So taking the median may not work well for trees, tree lines or forests, but may work for buildings, as their position is rather exact in the datasets and does not change much over time. Trees, on the other hand, grow, and the buffered shape files may not reflect the true positions accurately.

This issue can be worked around by taking the _max value and subtracting a fraction of it to account for the tree crown shape. A possible approach to calculate the HEIGHT_TOP value from the _max value may be the following:

HEIGHT_TOP = _max - ( 0.15 * _max )

A thorough test of the heights of the buildings, trees or forests in the area of interest is still required and explicitly recommended.
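The same zonal-statistics workflow can also be scripted outside of QGIS, for example with the rasterstats and geopandas packages. A minimal sketch (file names are placeholders, and the 0.15 reduction is the rule of thumb from above):

import geopandas as gpd
from rasterstats import zonal_stats

footprints = gpd.read_file('building_footprints.shp')        # placeholder vector layer
stats = zonal_stats('building_footprints.shp',                # zones (same placeholder layer)
                    'dom_minus_dtm.tif',                      # placeholder DOM - DTM difference raster
                    stats=['median', 'mean', 'min', 'max'])

# Attach the statistics as new fields, as the QGIS algorithm would do.
for key in ('median', 'mean', 'min', 'max'):
    footprints['_' + key] = [s[key] for s in stats]

# Derive HEIGHT_TOP from _max as suggested above and save the result.
footprints['HEIGHT_TOP'] = footprints['_max'] - 0.15 * footprints['_max']
footprints.to_file('building_footprints_with_heights.shp')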

PALM Output data post-processing

In this section, some general hints and code snippets are laid out, which can be used for postprocessing of PALM results.

ncview on Linux and Windows

Initially, results of a simulation can easily be visualized with ncview, which runs on Linux. If you have data on a Windows machine, there are multiple ways to inspect netCDF results, some more and some less tedious. Personally, I have WSL2 installed on my Windows machines. WSL2 is the Windows Subsystem for Linux, which is natively supported by Windows and lets you run a Linux virtual machine on your Windows PC. Find installation instructions here. When setting up WSL, you can select your individual Linux distro, e.g. Ubuntu. There, you can simply install ncview as you normally would. I use it through the terminal program MobaXTerm, which takes care of X-server window forwarding for you. WSL in combination with MobaXTerm is a great way to have a Linux environment available on a Windows machine.

Postprocessing in Python

You can use Python to create beautiful representations of your PALM results. Here are a few hints that will help you produce some nice colorful fluid dynamic plots.

There are many ways to plot your netCDF results in Python. One simple way is with the package xarray, which can open netCDF datasets, keeps all attributes, allows slicing, and already provides some plotting tools.

You can e.g. open a dataset with:

import xarray as xr
ds = xr.open_dataset(r'C:\Path\to\file.nc').squeeze()

Before I do anything with the data, I perform some modifications on the dataset coordinates. The x and y coordinates reference the domain origin, i.e. they start at 0. However, when I want to plot the area around certain coordinates, I want to specify those coordinates directly and explicitly. The same goes for the time: ds.time starts at 0 in the output. The origin information is all included in your static driver, from which you can take it:

import numpy as np

static = xr.open_dataset(r'C:\Path\to\file.nc')
origin = (static.origin_x, static.origin_y)
time_origin = np.datetime64(static.origin_time[:-4]) # turn the string (excluding the timezone info) into a datetime object
time_origin = time_origin + np.timedelta64(2, 'h')   # take care of the offset to UTC - here 2 for UTC+2; you could parse that from the attributes as well
ds['time'] = ds.time + time_origin

Now, the time reflects real timestamps. If you now want to select a certain time stamp, or even slice it, you can do that with:

selected_time = '2019-06-25 22:00'
ds = ds.sel(time=selected_time)


#or do a slice: 

starttime = '2019-06-25 22:00'
endtime = '2019-06-26 04:00'
ds = ds.sel(time=slice(starttime, endtime))

For plotting, you can then e.g. just create an ugly plot straight out of xarray:

ds['bio_pet*_xy'].sel(time=selected_time).squeeze().plot() #squeeze gets rid of unnecessary dimensions

or you could do something nicer and make a contourf plot with matplotlib, e.g. with:

# data
import matplotlib.pyplot as plt

selected_time = '2019-06-27 04:00'
pet = ds['bio_pet*_xy'].sel(time=selected_time).squeeze()

# make the plot
fig = plt.figure(figsize=(21/2.54,14.8/2.54))    #define the plot size in cm
ax = fig.add_axes([0.01, 0.175, 0.98, 0.75])     #add axes for the plot itself
ax.set_aspect('equal')                           #make x and y the same "size", so it's geometrically accurate

petscale = np.array([18,23,29,35,41])

datestring = selected_time[:-6]
timestring = selected_time[-5:]

petplot = pet.interpolate_na(dim='x').plot.contourf(ax = ax, levels=petscale, cmap='jet', add_colorbar=False, extend='both')  
ax.set_xticks([]) # don't set ticks for beauty reasons
ax.set_yticks([])
ax.set_ylabel('') #also no labels, as in a city you can navigate anyway mostly
ax.set_xlabel('')
ax.set_title("PET on the "+datestring[8:10]+'.'+datestring[5:7]+'.'+datestring[0:4]+' at '+timestring, loc='center')

# make a colorbar
ax2 = fig.add_axes([0.2,0.12,0.6,0.02])
cob = fig.colorbar(petplot, cax=ax2, label='PET [°C]', ticks=petscale, orientation='horizontal')
for label in cob.ax.xaxis.get_ticklabels()[1::2]:    #disable every second label for space reasons
    label.set_visible(False)


# plot geodata over it
import geopandas as gpd
trees = gpd.read_file('path to file', bbox=(4 coordinates here of bounding box, see gpd docs))
trees.plot(ax=ax, edgecolor='Green', facecolor="none", linewidth=0.5, alpha = 1, zorder=98)

#finish plot
ax.set_ylim([origin[1], origin[1]+(ds.y.max() - ds.y.min())])
ax.set_xlim([origin[0], origin[0]+(ds.x.max() - ds.x.min())])
plt.tight_layout()

The same structure also works for wind plots etc. If you have u and v data in your results, you can plot quivers on top of your image. Just add a few lines after plotting your ds['wind magnitude variable'], e.g. with:

quiver_density = 10 #plots a quiver only every 10th grid point - you need to thin it out, otherwise you only have arrows. the scale factor below also influences the arrow size.
ax.quiver(u.x.values[1::quiver_density], #x coords
          u.y.values[1::quiver_density], #y coords
          u.values[1::quiver_density,1::quiver_density], #u component on x-y coords
          v.values[1::quiver_density,1::quiver_density], #v component on x-y coords
          scale=20)

Double check that the quiver makes sense (as always though, check what you plot).

Create a gif

You can create a gif of your results. For that, save your plots (iterating over the time variable, for instance) with an increasing number in the file name (use str(iterating_variable).zfill(2), for example, so that index 3 becomes 03 in the file name and the frames end up in the correct order in the gif). On Linux, you can use the imagemagick suite, specifically the convert command. Assuming your files are called image_01.png, image_02.png etc., you can create a gif with the following command:

convert -resize 50% -delay 100 -loop 0 image_*.png testgif.gif

Use the -resize option to avoid creating huge files; the -delay option specifies the speed of the gif (100 x 10 ms = 1 s per frame here), and -loop 0 makes the gif loop forever.
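For producing the individual frames in the first place, a minimal sketch of such a save loop (assuming a dataset ds and the plotting approach from the sections above; the PET variable is the example used earlier) could look like this:

import matplotlib.pyplot as plt

# Save one frame per output time with zero-padded file names, so that
# image_*.png sorts correctly for the convert command above.
for i, t in enumerate(ds.time.values):
    fig, ax = plt.subplots()
    ds['bio_pet*_xy'].sel(time=t).squeeze().plot(ax=ax)
    fig.savefig(f'image_{str(i).zfill(2)}.png', dpi=150)
    plt.close(fig)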