Free paper download: Kernel optimization for short-range molecular dynamics

Persian title
بهینه سازی هسته برای دینامیک مولکولی کوتاه برد
English title
Kernel optimization for short-range molecular dynamics
Pages (Persian paper)
0
Pages (English paper)
10
Year of publication
2016
Publisher
Elsevier
English paper format
PDF
Product code
E997
Related fields
Physics
Related specializations
Applied physics
Journal
Computer Physics Communications
University
University of Science and Technology Beijing, Beijing, China
Keywords
Molecular dynamics, IMD, MIC
Abstract


To optimize short-range force computations in molecular dynamics (MD) simulations, this paper presents multi-threading and SIMD optimizations. For multi-threading, a Partition-and-Separate-Calculation (PSC) method is designed to avoid the write conflicts that arise when Newton's third law is used to update both atoms of a pair. Serial bottlenecks are eliminated with no additional memory usage. The method is implemented with OpenMP and is also deployed on Intel Xeon Phi coprocessors in both native and offload models. We also evaluate the performance of the PSC method under different thread affinities on the MIC architecture. For SIMD execution, we explain how the "if-clause" of the cutoff radius check affects performance in the PSC method. The experimental results show that the PSC method is more efficient than some traditional methods. In double precision, our 256-bit SIMD implementation is about 3 times faster than the scalar version.
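To make the write-conflict problem concrete, the following is a minimal Python sketch of one generic way to keep Newton's-third-law updates conflict-free: partition a 1D cell list into blocks and process alternating blocks in two separate phases. It is an illustration under stated assumptions, not the paper's PSC implementation, whose details are not given in this excerpt. The toy force `pair_force`, the constant `CUTOFF`, and all function names are hypothetical, and Python threads stand in for OpenMP threads.

```python
from concurrent.futures import ThreadPoolExecutor

CUTOFF = 1.0  # hypothetical cutoff radius (arbitrary units); also the cell width


def pair_force(dx):
    """Toy 1D force on atom i from atom j, with dx = x_i - x_j (assumed form)."""
    r = abs(dx)
    if r >= CUTOFF or r == 0.0:
        return 0.0  # the cutoff "if clause": no interaction beyond the radius
    return (CUTOFF - r) * (1.0 if dx > 0 else -1.0)  # soft repulsion


def build_cells(x, n_cells):
    """Bin atom indices into 1D cells of width CUTOFF."""
    cells = [[] for _ in range(n_cells)]
    for i, xi in enumerate(x):
        cells[min(int(xi / CUTOFF), n_cells - 1)].append(i)
    return cells


def process_block(x, f, cells, first, last):
    """Half-shell loop over cells [first, last): each pair is visited once and
    Newton's third law updates both atoms of the pair."""
    for c in range(first, last):
        own = cells[c]
        right = cells[c + 1] if c + 1 < len(cells) else []
        for a, i in enumerate(own):
            for j in own[a + 1:] + right:
                fij = pair_force(x[i] - x[j])
                f[i] += fij
                f[j] -= fij  # this write can land in cell c or in cell c + 1


def forces_two_phase(x, n_cells, cells_per_block=2, workers=4):
    """Run even-numbered blocks concurrently, then odd-numbered blocks.
    A block writes to its own cells plus the first cell of the next block,
    so two blocks of the same phase never touch the same atom."""
    cells = build_cells(x, n_cells)
    f = [0.0] * len(x)
    blocks = [(s, min(s + cells_per_block, n_cells))
              for s in range(0, n_cells, cells_per_block)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for phase in (0, 1):
            # list() waits for every block of this phase before the next one starts
            list(pool.map(lambda b: process_block(x, f, cells, *b), blocks[phase::2]))
    return f


def forces_reference(x):
    """O(N^2) reference without Newton's third law: each atom writes only its own force."""
    return [sum(pair_force(xi - xj) for j, xj in enumerate(x) if j != i)
            for i, xi in enumerate(x)]
```

Calling `forces_two_phase(x, n_cells)` on positions inside `[0, n_cells * CUTOFF)` should agree with `forces_reference(x)` up to floating-point rounding. The design choice here is to trade one extra synchronization point per step for the absence of atomics, locks, and per-thread force copies.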

5. Conclusions


We provide multi-threading and vectorization optimizations of the MD force calculation kernel. These optimizations accelerate the original MD version, so that on the same platform and within the same time budget, a longer physical process can be simulated. We put forward a PSC method to avoid write conflicts when short-range forces are calculated on shared-memory multi-core platforms. The PSC method incurs no extra memory usage, no redundant computation, and no severe serial bottlenecks as the number of threads increases. We apply the PSC method on the MIC coprocessor using both native and offload models. In the offload version, we offload the force calculation part to the coprocessor and use nocopy together with appropriate alloc_if and free_if settings to reduce transfer time. Our experimental results demonstrate that the PSC method is scalable and efficient with up to 240 threads. The cutoff radius "if clause" in the short-range force calculation has a great influence on the performance of the MD package. We modify the lattice neighbor list in Crystal MD by adding a pre-searching procedure. With this modification, about 70% of atoms pass the cutoff check, which eliminates many redundant calculations. The optimized vectorized version is about 3 times faster than the scalar one. Future research directions include auto-tuning and data partitioning. Although the EAM potential was used as an example, our optimization strategies are widely applicable to other short-range potentials. Both our multi-threading and vectorization methods are effective and straightforward to implement.
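The effect of the cutoff "if clause" on vectorized code can be illustrated with a toy lane model in Python. A 256-bit register holds four 64-bit doubles, so a vectorized loop evaluates four candidate neighbors at once and uses a mask to discard the lanes that fail the cutoff check; lanes that fail still cost compute. The sketch below is an assumption-laden illustration, not the paper's kernel and not real SIMD intrinsics: `toy_force`, `scalar_sum`, and `masked_sum` are hypothetical names, and the force expression is a placeholder.

```python
LANES = 4  # a 256-bit register holds four 64-bit doubles


def toy_force(r2):
    """Placeholder pair term as a function of squared distance (assumed form)."""
    return 1.0 / r2


def scalar_sum(r2_list, rc2):
    """Scalar loop: the cutoff check is an ordinary branch per neighbor."""
    total = 0.0
    for r2 in r2_list:
        if r2 < rc2:
            total += toy_force(r2)
    return total


def masked_sum(r2_list, rc2):
    """Lane model of a vectorized loop: every lane computes, a mask selects results.
    Returns the sum and the fraction of issued lanes that did useful work."""
    total = 0.0
    useful = 0
    issued = 0
    for start in range(0, len(r2_list), LANES):
        chunk = r2_list[start:start + LANES]
        mask = [r2 < rc2 for r2 in chunk]
        if not any(mask):
            continue  # the whole vector fails the check, so it can be skipped
        values = [toy_force(r2) for r2 in chunk]  # all lanes compute, pass or fail
        total += sum(v if m else 0.0 for v, m in zip(values, mask))
        useful += sum(mask)
        issued += len(chunk)
    utilization = useful / issued if issued else 0.0
    return total, utilization
```

For example, `masked_sum([0.5, 2.0, 0.25, 4.0, 0.5, 0.5, 3.0, 0.125], 1.0)` returns the same sum as `scalar_sum` on the same input, with 5 of 8 lanes doing useful work. In this model, raising the fraction of neighbors that pass the cutoff check, as the pre-searching step is reported to do, directly raises lane utilization.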

