Type of Thesis
This thesis discusses how to optimize computational physics software for speed through maximizing the use of novel architectural features of current CPUs. Specifically, the optimization of the Particle-In-Cell (PIC) algorithm is considered. The PIC algorithm is widely use in the study of plasmas, rarified gases, fluid dynamics, and gravitational dynamics. For this algorithm, the main performance bottleneck is the deposition of charge onto the grid. The goals of the optimizations described in this paper were to maximize cache reuse to overcome memory bandwidth limitations and improve data processing speeds by taking advantage of multithreading and vector instructions. The particular techniques, discussed in detail in the paper, were sorting particles into tiles, operating on tiles in a manner that prevents race conditions, separating out the vectorizable operations (also known as strip mining), and sorting particles based on cells. Performance analysis results are given at each step of the optimization. An overall improvement of a factor of 20 in computational speed was obtained.
Rimel, David, "CPU Optimization of Particle Deposition in PIC Simulation Code" (2016). Undergraduate Honors Theses. 1237.