Overview

General comments

The purpose of this code is to run pseudo-spectral DNS of turbulence, and integrate particle trajectories in the resulting fields. An important aim of the code is to simplify the launching of compute jobs and postprocessing, up to and including the generation of publication-ready figures.

In research, people routinely write code from scratch because research goals change to the point where modifying the previous code is too expensive. With TurTLE, the aim is to identify core functionality that should be implemented in a library. The core library can then be used by many problem-specific codes.

The core functionality is implemented in C++, while a Python3 wrapper is typically used for generating initial conditions and light-weight post-processing. The core library uses a hybrid MPI/OpenMP approach for parallelization.

Python3 “wrapper”

In principle, users of the code should only need to use Python3 for launching jobs and postprocessing data.

Classes defined in the Python3 package can be used to generate executable codes, to compile and launch them, and then to access and postprocess data within a full Python3 environment. Depending on machine-specific settings, the code is either launched directly, or job scripts appropriate for queueing systems are generated and submitted.

C++ core library

As already stated, TurTLE’s main goal is to facilitate novel research avenues. To this end, there are two types of simple objects. Firstly, children of an abstract class encapsulate three elements of generic functionality: initialization, “do work” and finalization. Secondly, essential data structures (e.g. fields, sets of particles) and associated functionality (e.g. I/O) are provided by “building block”-classes. Each TurTLE “solver” then consists of a specific arrangement of the building blocks.
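
The pattern can be illustrated with a minimal Python sketch (the actual classes are implemented in C++, and all names below are purely illustrative, not part of the TurTLE API):

class AbstractSolver:
    """Generic skeleton: initialization, "do work", finalization."""
    def initialize(self):
        raise NotImplementedError
    def step(self):
        raise NotImplementedError
    def finalize(self):
        raise NotImplementedError
    def run(self, niterations):
        # generic driver shared by all solvers
        self.initialize()
        for iteration in range(niterations):
            self.step()
        self.finalize()

class VorticityField:
    """Stand-in for a building block: a data structure plus its I/O."""
    def read(self, fname):
        pass
    def write(self, fname):
        pass

class ExampleSolver(AbstractSolver):
    """A solver is a specific arrangement of building blocks."""
    def initialize(self):
        self.vorticity = VorticityField()
        self.vorticity.read('initial_condition.h5')
    def step(self):
        pass  # advance the field by one time step
    def finalize(self):
        self.vorticity.write('final_state.h5')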

The heterogeneous TurTLE development team benefits from the separation of generic functionality from building blocks: TurTLE is naturally well-suited to the distribution of conceptually distinct work, in particular fully isolating projects such as, e.g., “implement new numerical method” from “optimize the computation of field statistics with OpenMP”.

Equations

The code uses a fairly standard pseudo-spectral algorithm to solve the fluid equations. The incompressible Navier-Stokes equations in the velocity formulation (see the NSE C++ class) are as follows:

\[\partial_t \mathbf{u} + \mathbf{u} \cdot \nabla \mathbf{u} = - \nabla p + \nu \Delta \mathbf{u} + \mathbf{f}\]

TurTLE also solves the vorticity formulation of these equations (see the NSVE C++ class):

\[\partial_t \mathbf{\omega} + \mathbf{u} \cdot \nabla \mathbf{\omega} = \mathbf{\omega} \cdot \nabla \mathbf{u} + \nu \Delta \mathbf{\omega} + \nabla \times \mathbf{f}\]
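
In both formulations the velocity field is divergence-free, and the two sets of equations are related through the definition of the vorticity:

\[\nabla \cdot \mathbf{u} = 0, \hskip .5cm \mathbf{\omega} = \nabla \times \mathbf{u}\]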

Statistics

Basic quantities that can be computed in a pseudo-spectral code are the energy, the energy dissipation rate, generic spectral moments, and the energy injection rate:

\[E = \frac{1}{2} \sum_{\mathbf{k}} \hat{\mathbf{u}} \cdot \hat{\mathbf{u}}^*, \hskip .5cm \varepsilon = \nu \sum_{\mathbf{k}} k^2 \hat{\mathbf{u}} \cdot \hat{\mathbf{u}}^*, \hskip .5cm \textrm{in general } \sum_{\mathbf{k}} k^p \hat{u_i} \cdot \hat{u_j}^*, \hskip .5cm \varepsilon_{\textrm{inj}} = \sum_{\mathbf{k}} \hat{\mathbf{u}} \cdot \hat{\mathbf{f}}^*\]

In fact we store proxies for the energy spectrum tensor and the analogous vorticity spectrum tensor:

\[\sum_{k \leq \|\mathbf{k}\| \leq k+dk}\hat{u_i} \cdot \hat{u_j}^*, \hskip .5cm \sum_{k \leq \|\mathbf{k}\| \leq k+dk}\hat{\omega_i} \cdot \hat{\omega_j}^*\]

The word “proxy” is used because technically there are normalization factors (specifically a Dirac delta) that need to be accounted for.
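
As an illustration (a minimal sketch with a hypothetical array name and layout, not TurTLE’s actual dataset format), once the velocity spectrum tensor proxy is loaded into a NumPy array of shape (n_shells, 3, 3), the total energy is recovered, up to the normalization factors mentioned above, by summing half the trace over the shells:

import numpy as np

# hypothetical array: shell-summed velocity spectrum tensor proxy,
# shape (n_shells, 3, 3); the actual dataset layout depends on the TurTLE version
uu_proxy = np.zeros((32, 3, 3))

# the energy contribution of each shell is half the trace of the tensor proxy
energy_per_shell = 0.5 * np.trace(uu_proxy, axis1 = 1, axis2 = 2)
total_energy = energy_per_shell.sum()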

Conventions

The C++ backend is based on FFTW, and the Fourier representations are transposed (the two slowest axes are swapped relative to the real space layout, following FFTW’s transposed-order MPI convention). In brief, this is the way the fields are represented on disk and in memory (both in the C++ backend and in Python postprocessing):

  • real space representations of 3D vector fields consist of contiguous arrays, with the shape (nz, ny, nx, 3): \(n_z \times n_y \times n_x\) triplets, where \(z\) is the slowest coordinate, \(x\) the fastest; each triplet is then the sequence of \(x\) component, \(y\) component and \(z\) component.

  • Fourier space representations of 3D vector fields consist of contiguous arrays, with the shape (ny, nz, nx/2+1, 3): \(k_y\) is the slowest coordinate, \(k_x\) the fastest; each triplet of 3 complex numbers then contains the \((x, y, z)\) components, as FFTW requires for the correspondence with the real space representations.

We recommend the following procedure to read the corresponding wavenumbers:

import numpy as np
from TurTLE.DNS import DNS

c = DNS(
        work_dir = '/location/of/simulation/data',
        simname = 'simulation_name_goes_here')

df = c.get_data_file()
kx = df['kspace/kx'][()]
ky = df['kspace/ky'][()]
kz = df['kspace/kz'][()]
df.close()
# optional: build the full 3D field of the k vector,
# in the transposed layout (ky slowest, then kz, then kx)
kval = np.zeros(
        ky.shape + kz.shape + kx.shape + (3,),
        dtype = kx.dtype)
kval[..., 0] = kx[None, None, :]
kval[..., 1] = ky[:, None, None]
kval[..., 2] = kz[None, :, None]
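
Once kval is available, derived quantities follow directly; for instance, the wavenumber magnitude on the full (transposed) Fourier grid:

# magnitude of the wave vector at each Fourier grid point,
# with shape (ny, nz, nx//2 + 1)
k_magnitude = np.sqrt(np.sum(kval**2, axis = -1))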

Tutorial: basics

First DNS

As a first step, we will run a small \(32^3\) simulation starting from random initial conditions, and have a look at the results.

# change directory to a clean "scratch" folder of your choice:
cd $SCRATCH
# depending on how curious you are, you may have a look at the
# options first:
turtle --help
turtle DNS --help
turtle DNS NSVE --help
# or you may just run it:
turtle DNS NSVE -n 32

Note:

  • The Python environment where the TurTLE package was installed must be activated prior to calling the commands.

  • Please use an empty “SCRATCH” folder; in particular, running turtle from within the build folder may lead to problems.

This launches a simulation with the simulation name “test”. The simulation itself should not take more than a few seconds, since this is just a \(32^3\) simulation run for 8 iterations; the prior compilation step may take a bit longer. The first thing you can do afterwards is open a Python console and type the following:

import numpy as np
from TurTLE.DNS import DNS

c = DNS(
        work_dir = './',
        simname = 'test')
c.compute_statistics()
print ('Rlambda = {0:.0f}, kMeta = {1:.4f}, CFL = {2:.4f}'.format(
        c.statistics['Rlambda'],
        c.statistics['kMeta'],
        (c.parameters['dt']*c.statistics['vel_max'] /
         (2*np.pi/c.parameters['nx']))))
print ('Tint = {0:.4e}, tauK = {1:.4e}'.format(c.statistics['Tint'],
                                               c.statistics['tauK']))
data_file = c.get_data_file()
print ('total time simulated is = {0:.4e} Tint, {1:.4e} tauK'.format(
        data_file['iteration'][()]*c.parameters['dt'] / c.statistics['Tint'],
        data_file['iteration'][()]*c.parameters['dt'] / c.statistics['tauK']))

compute_statistics will read the data file and compute a number of basic statistics, for example the Taylor-scale Reynolds number \(R_\lambda\) that we print in the example code.

What happens is that the DNS will have generated an HDF5 file containing specific datasets (spectra, moments of real space representations, etc.). The function compute_statistics performs simple but potentially expensive postprocessing; it therefore caches some data in a <simname>_cache.h5 file, and it also performs some time averages, yielding the statistics dictionary that is used in the above code.

Behind the scenes

In brief the following takes place:

  1. An instance c of DNS is created. It is used to generate an argparse.ArgumentParser, and it processes command line arguments given to the turtle DNS NSVE command.

  2. Reasonable DNS parameters are constructed from the command line arguments.

  3. c generates a parameter file <simname>.h5, into which the various parameters are written. c also generates the various datasets that the backend code will write into (statistics).

  4. c writes a simple C++ file that is compiled and linked against libTurTLE (to be specific, it calls the main_code function defined in cpp/full_code/main_code.hpp).

  5. c executes the C++ code using the appropriate launcher (e.g. mpiexec).

  6. The C++ code actually performs the DNS, and outputs various results into the <simname>.h5 file. It also outputs the final state of the vorticity field into an appropriate checkpoint file (also in HDF5 format).

After the simulation is done, things are simpler: any HDF5-capable software can be used to read the data file, and the dataset names should be reasonably easy to interpret, so custom postprocessing codes can easily be written.
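
For instance, the full layout of the data file can be inspected with a few lines of h5py (here for the simulation named “test” from the tutorial above):

import h5py

# print the name of every group and dataset in the simulation data file
with h5py.File('test.h5', 'r') as data_file:
    data_file.visit(print)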

Scaling tests

Our own scaling tests are reported in [lalescu2021arXiv].

Initial scaling data is available at (url coming soon). For now, you need to run a preliminary DNS of 8192 iterations using a grid of \(128^3\) points. Please copy the resulting files to the location given by the TURTLE_FIELD_DATABASE variable.

Separately, please recompile TurTLE with the TIMING_OUTPUT cmake option switched to ON.

Afterwards, please run variations of the following command:

python ${TURTLE_REPOSITORY}/tests/DNS/test_scaling.py D \
    -n 128 \
    --nprocesses 4 \
    --ncores 1 \
    --src-wd ${TURTLE_FIELD_DATABASE} \
    --src-iteration 8192