In the last decade, high-order methods have gained increased attention. They combine the convergence properties of spectral methods with the geometrical flexibility of low-order methods. However, their time step restriction is severe, necessitating the implicit treatment of the diffusion terms in addition to the pressure. The efficient solution of elliptic equations is therefore of central importance for fast flow solvers. As the operators scale with O(p · N), where N is the number of degrees of freedom and p the polynomial degree, the runtime of the best available multigrid algorithms scales with O(p · N) as well. This super-linear scaling limits the applicability of high-order methods to mid-range polynomial orders and constitutes a major roadblock on the way to faster flow solvers.
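To make the O(p · N) operator cost concrete, here is a minimal NumPy sketch of the standard matrix-free tensor-product (sum-factorization) evaluation on a single hexahedral element. It is an illustration of the baseline scaling only, not the linearly scaling factorization developed in the thesis. The function names and the 1D matrices `A`, `B`, `C` are assumptions made for this example. With n = p + 1 points per direction, the three contractions cost O(n^4) operations for n^3 unknowns, i.e. O(p) work per degree of freedom.

```python
import numpy as np


def apply_tensor_product(A, B, C, u):
    """Apply (A ⊗ B ⊗ C) to u without forming the Kronecker product.

    A, B, C are hypothetical 1D operators of shape (n, n);
    u holds the element's unknowns with shape (n, n, n).
    """
    # Contract one direction at a time: each step costs O(n^4).
    v = np.einsum("ai,ijk->ajk", A, u)
    v = np.einsum("bj,ajk->abk", B, v)
    v = np.einsum("ck,abk->abc", C, v)
    return v


def apply_assembled(A, B, C, u):
    """Reference: assemble the full n^3 x n^3 matrix, O(n^6) per apply."""
    K = np.kron(A, np.kron(B, C))
    return (K @ u.reshape(-1)).reshape(u.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5  # n = p + 1, here p = 4
    A, B, C = (rng.standard_normal((n, n)) for _ in range(3))
    u = rng.standard_normal((n, n, n))
    diff = apply_tensor_product(A, B, C, u) - apply_assembled(A, B, C, u)
    print("max deviation from assembled operator:", np.abs(diff).max())
```

Both routines compute the same result; only the matrix-free variant avoids storing and multiplying the assembled element matrix.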
This work reduces the super-linear scaling of elliptic solvers to a linear one. First, the static condensation method improves the conditioning of the system; then the associated operator is cast into matrix-free tensor-product form and factorized to linear complexity. The slow growth of the condition number and the linear runtime of the operator lead to solvers that scale linearly when the polynomial degree is increased, albeit with low robustness against the number of elements. A p-multigrid method with overlapping Schwarz smoothers regains the robustness, but requires inverse operators on the subdomains, and in the condensed case these are neither linearly scaling nor matrix-free. Embedding the condensed system into the full one leads to a matrix-free operator, and factorizing it yields a linearly scaling inverse. Combined with the linearly scaling operator, this results in a multigrid method with a constant runtime per degree of freedom, regardless of whether the polynomial degree or the number of elements is increased.
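As a rough illustration of the static condensation step, the following NumPy sketch eliminates the interior unknowns of a small dense system through a Schur complement, solves for the remaining boundary unknowns, and recovers the interior by back-substitution. It uses dense matrices for clarity and is not the matrix-free, tensor-product formulation of the thesis. The function name and the index sets `interior` and `boundary` are assumptions made for this example.

```python
import numpy as np


def solve_condensed(K, f, interior, boundary):
    """Solve K u = f by static condensation (hypothetical helper).

    interior / boundary are index lists that together cover all unknowns.
    """
    K_ii = K[np.ix_(interior, interior)]
    K_ib = K[np.ix_(interior, boundary)]
    K_bi = K[np.ix_(boundary, interior)]
    K_bb = K[np.ix_(boundary, boundary)]

    # Schur complement on the boundary unknowns and condensed right-hand side.
    S = K_bb - K_bi @ np.linalg.solve(K_ii, K_ib)
    g = f[boundary] - K_bi @ np.linalg.solve(K_ii, f[interior])

    u = np.zeros_like(f, dtype=float)
    # Solve the smaller condensed system first ...
    u[boundary] = np.linalg.solve(S, g)
    # ... then recover the interior unknowns by back-substitution.
    u[interior] = np.linalg.solve(K_ii, f[interior] - K_ib @ u[boundary])
    return u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((6, 6))
    K = M @ M.T + 6.0 * np.eye(6)  # symmetric positive definite test matrix
    f = rng.standard_normal(6)
    u = solve_condensed(K, f, interior=[1, 2, 3, 4], boundary=[0, 5])
    print("residual norm:", np.linalg.norm(K @ u - f))
```

In an element-wise setting the interior blocks of different elements decouple, so the elimination can be performed element by element and only the condensed system couples neighboring elements.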
Computing on heterogeneous hardware is investigated as a means to attain higher performance and to future-proof the algorithms. A two-level parallelization extends the traditional hybrid programming model by using a coarse-grain layer that implements domain decomposition and a hardware-specific fine-grain layer. Thereafter, load balancing is investigated for a preconditioned conjugate gradient solver, and functional performance models are adapted to account for the communication barriers in the algorithm. With the new model, runtime prediction and measurement agree closely, with an error margin near 5 %.
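The idea behind model-based load balancing can be sketched with a deliberately simplified stand-in: each device is assumed to need a fixed overhead plus a time proportional to its number of elements, and elements are distributed so that all devices finish at the same time. This linear model is an assumption made for illustration; it is not the functional performance model adapted in the thesis, and the function name, speeds, and overheads are hypothetical.

```python
def balance_elements(speeds, overheads, total):
    """Split `total` elements so that all devices finish simultaneously.

    Assumed model (hypothetical): t_i(n_i) = overheads[i] + n_i / speeds[i].
    Setting all t_i equal to a common time T and summing n_i = total gives
    T = (total + sum(s_i * a_i)) / sum(s_i).
    """
    s_sum = sum(speeds)
    T = (total + sum(s * a for s, a in zip(speeds, overheads))) / s_sum
    shares = [s * (T - a) for s, a in zip(speeds, overheads)]
    if min(shares) < 0:
        raise ValueError("a device is too slow to receive any work")

    # Round down, then hand out the leftover elements by largest remainder.
    counts = [int(x) for x in shares]
    leftover = total - sum(counts)
    order = sorted(range(len(shares)),
                   key=lambda i: shares[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    # One fast device and two slower ones, the second with a startup cost.
    print(balance_elements([2.0, 1.0, 1.0], [0.0, 1.0, 0.0], 11))
```

A more faithful model would replace the constant speeds with measured, size-dependent speed functions and add the synchronization cost of each solver iteration.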
The devised methods are combined into a flow solver which attains the same throughput when computing with p = 16 as with p = 8, preserving the linear scaling. Furthermore, the multigrid method reduces the cost of the implicit treatment of the pressure to that of the explicit treatment of the convection terms. Lastly, benchmarks confirm that the solver outperforms established high-order codes.
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:74002 |
Date | 23 February 2021 |
Creators | Huismann, Immo |
Contributors | Fröhlich, Jochen; Sherwin, Spencer J.; Technische Universität Dresden |
Publisher | TUDPress |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/publishedVersion, doc-type:doctoralThesis, info:eu-repo/semantics/doctoralThesis, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Relation | 10.1016/j.jcp.2017.06.012, 10.1016/j.compfluid.2019.104386, 10.1007/978-3-319-78024-5_30, 10.1016/j.jcp.2019.108868, 10.1016/j.compfluid.2015.05.012, 10.1007/978-3-319-32152-3_35, 10.1002/pamm.201410465, 10.1002/pamm.201710037, urn:nbn:de:bsz:14-qucosa2-387059, qucosa:38705, 3011512-7 |