Free download of the article: Parallel Fourier transform with load balancing of the plane waves

Persian title
Parallel three-dimensional fast Fourier transform with load balancing of the plane waves
English title
Parallel 3-dim fast Fourier transforms with load balancing of the plane waves
Persian article pages
0
English article pages
7
Year of publication
2016
Publisher
Elsevier
English article format
PDF
Product code
E998
Fields related to this article
Physics
Specializations related to this article
Applied physics
Journal
Computer Physics Communications
University
Laboratory of Computational Physics, China
Keywords
First-principles calculation, Kohn–Sham equation, plane wave, FFT, load balancing
Abstract


The plane wave method is the most widely used approach for solving the Kohn–Sham equations in first-principles materials science computations. In this procedure, the fast Fourier transform (FFT) of the three-dimensional (3-dim) trial wave functions is a routine operation and one of the most demanding algorithms in terms of scalability on a parallel machine. We propose a new partitioning algorithm for the 3-dim FFT grid that balances the trade-off between communication overhead and load balancing of the plane waves. Qualitative analysis and numerical results show that our approach can scale plane wave first-principles calculations to more nodes.
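The load-balancing side of this trade-off can be illustrated with a small sketch. It is not the paper's algorithm. It assumes a cubic FFT grid of size n, a plane-wave set made of the integer grid points inside a sphere of a given radius, and a simple contiguous slab partition along x. All function names are hypothetical.

```python
def plane_waves_per_x_plane(n, radius):
    """Count grid points with gx^2 + gy^2 + gz^2 <= radius^2 on each x-plane."""
    half = n // 2
    r2 = radius * radius
    counts = []
    for gx in range(-half, n - half):
        count = 0
        for gy in range(-half, n - half):
            for gz in range(-half, n - half):
                if gx * gx + gy * gy + gz * gz <= r2:
                    count += 1
        counts.append(count)
    return counts


def slab_loads(counts, nprocs):
    """Sum the counts over contiguous slabs of x-planes, one slab per process."""
    n = len(counts)
    loads = []
    for p in range(nprocs):
        lo = p * n // nprocs
        hi = (p + 1) * n // nprocs
        loads.append(sum(counts[lo:hi]))
    return loads


def imbalance(loads):
    """Ratio of the heaviest load to the mean load; 1.0 means perfect balance."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


if __name__ == "__main__":
    # Assumed toy sizes: a 16^3 grid and a cutoff sphere of radius 4.
    counts = plane_waves_per_x_plane(16, 4)
    loads = slab_loads(counts, 4)
    print("plane waves per process:", loads)
    print("imbalance (max / mean):", imbalance(loads))
```

Under these assumptions, slabs that lie outside the sphere hold few or no plane waves, while slabs through its center hold most of them. A slab partition keeps communication simple but balances the plane-wave work poorly.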

6. Conclusion


We propose a new partitioning algorithm for the 3-dim FFT grid used in plane wave first-principles calculations. Compared with the greedy algorithm, which is biased toward load balancing of the plane wave computations, our approach first suppresses the growth in communication overhead as the number of processors increases by realizing local all-to-all communications during data transposes. We then adjust the data distribution to improve load balancing while preserving the communication pattern. Numerical results show that a much lower communication overhead is achieved on a relatively large number of processors at a moderate loss of load balancing. With the new algorithm, plane wave codes can be scaled to more nodes than with the greedy algorithm. For better performance, our approach could be combined with other techniques, such as a hybrid OpenMP/MPI implementation or performing a large number of FFTs simultaneously.
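A greedy assignment biased toward load balancing can be sketched as follows. This is a generic longest-first heuristic, given here only as an illustration. The exact greedy variant used in the paper is not described in this excerpt, so the column-based layout, the spherical cutoff, and all function names below are assumptions.

```python
import heapq


def column_lengths(n, radius):
    """Map each (gx, gy) column to the number of plane waves it holds in the sphere."""
    half = n // 2
    r2 = radius * radius
    cols = {}
    for gx in range(-half, n - half):
        for gy in range(-half, n - half):
            length = sum(
                1
                for gz in range(-half, n - half)
                if gx * gx + gy * gy + gz * gz <= r2
            )
            if length:
                cols[(gx, gy)] = length
    return cols


def greedy_assign(cols, nprocs):
    """Assign columns, longest first, to the currently least-loaded process."""
    heap = [(0, p) for p in range(nprocs)]  # entries are (current load, process id)
    heapq.heapify(heap)
    owner = {}
    # Sort by decreasing length; ties are broken by column index for determinism.
    for col, length in sorted(cols.items(), key=lambda kv: (-kv[1], kv[0])):
        load, p = heapq.heappop(heap)
        owner[col] = p
        heapq.heappush(heap, (load + length, p))
    loads = [0] * nprocs
    for col, p in owner.items():
        loads[p] += cols[col]
    return owner, loads


if __name__ == "__main__":
    # Assumed toy sizes: a 16^3 grid, a cutoff sphere of radius 4, 4 processes.
    cols = column_lengths(16, 4)
    owner, loads = greedy_assign(cols, 4)
    print("plane waves per process:", loads)
```

In this sketch the loads differ by at most the length of one column, but neighboring columns end up on different processes. Data transposes then involve every process, which is the communication cost that the conclusion contrasts with local all-to-all communications.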

