The Vanadis model is a high‑performance finite‑element framework for three‑dimensional atmospheric transport and dispersion. The spatial discretization is based on the Directional Residual (DR) method, which provides low‑diffusion advection and strict mass conservation, making it suitable as a reference solution for evaluating stabilized schemes. Element matrices [H] and [C] are assembled on the CPU, while the solution stage employs a fully parallel element‑by‑element strategy without forming a global matrix. This structure maps efficiently onto GPU architectures: the The CUDA implementation accelerates the solution of the linear system using a parallel element‑by‑element strategy inside the solver, taking advantage of the diagonal structure preserved by the DR stabilization.
Boundary conditions at the ground surface include either a Dirichlet constraint or a flux‑type deposition condition, with the latter controlled by the parameter α, which governs pollutant penetration into the surface.
Time integration is performed using a second‑order BDF2–Gear scheme, derived from the dynamic DR formulation. The method provides strong numerical stability and accurate transient behavior for stiff advection–diffusion systems. The combination of DR spatial discretization, GPU‑accelerated element‑wise solving, and the BDF2–Gear time scheme results in a robust and efficient tool for large‑scale atmospheric dispersion simulations.
This solution is currently under patent evaluation. Patent application P.455475 (“pending”) has been filed with the Polish Patent Office. The full documentation will become publicly available once the Office completes the examination and publishes the application in its online database.



Geometry: The model uses linear 8‑node hexahedral elements with standard shape functions Ni, which provide trilinear interpolation of geometry and field variables within each element.
Weighting functions: The weighting functions in the weak formulation are the Directional Residual (DR) functions, ensuring consistency between the residual minimization and the spatial discretization. Unlike Galerkin weighting, DR preserves the diagonal structure required for the element‑by‑element solver.
Shape functions (interpolation): The scalar field S(x,y,z) is interpolated using the standard trilinear basis
Linear equations:
![]()
Steady state solution:

Dynamic solution:
Time integration is performed using a second‑order backward differentiation scheme (BDF2–Gear), which provides robust stability for stiff transport problems and ensures accurate temporal evolution even under rapidly changing flow conditions. This choice of integrator enables the solver to maintain high accuracy without requiring excessively small time steps.


Vanadis runs on Linux or Windows systems and requires a Fortran compiler for building the CPU version of the solver. GPU acceleration is optional and uses the NVIDIA CUDA Toolkit.
For a representative test case consisting of 2.8 million nodes and a 100‑second time step, the following performance was observed:
CPU (Intel Core i5‑4590 @ 3.30 GHz, 4 GB RAM): approximately 20 seconds to solve the linear system for a single time step.
GPU (NVIDIA GTX 1060, 3 GB): approximately 10 seconds to solve the linear system for a single time step.
These results illustrate the computational efficiency of the GPU‑accelerated element‑by‑element solver compared to the single‑core CPU implementation.
For the large grid (≈ 2.8 M nodes), the DR formulation completes the full 2 000‑second transient simulation in 20 minutes, compared to 33 minutes for the internal legacy prototype Adaptive Streamline Penalty (ASP) stabilization. Although both approaches produce nearly identical solution fields, the matrix/vector assembly in ASP remains approximately three times more expensive, while the DR solver is 15–20 % faster on CPU during the initial steps.
In the ASP prototype, the streamline‑penalty coefficient was adaptively scaled with the local Peclet number. This internal method served only as a preliminary stabilization mechanism and has since been fully replaced by the Directional Residual (DR) formulation.


The results are visualized on two orthogonal surfaces: the ground surface and a vertical cross‑section. At time t=0 the first chimney is activated. After 1000 seconds the second chimney is switched on and the wind direction is changed. The influence of the tall building on the flow field and pollutant dispersion is clearly visible.

3D dynamic convection–advection model
Time step: 100 s
Pollution source: one emitter located at (x = 0 m, y = 0 m, h = 60 m)
Domain size: 8000 m × 8000 m × 420 m (height)
Example output: ground‑level concentration after 2000 s (shown on the left)
Data availability: full output for every time step on all visualization surfaces
Adjustable parameters:
Qv [kg/s] — emission rate
Kx, Ky, Kz [m²/s] — turbulent diffusion coefficients
P [1/s] — decay constant
vh [m/s] — vertical wind component
vx, vy [m/s] — horizontal wind components
α [m/s] — deposition / settling parameter
author: Marek Chodorski
e-mail: marek_ac@wp.pl