Free download of the English article Matminer: An open source toolkit for materials data mining - Elsevier 2018

Title
Matminer: An open source toolkit for materials data mining
Persian article pages
0
English article pages
10
Year of publication
2018
Publisher
Elsevier
English article format
PDF
Article type
ISI
Writing type
Research articles
References
Included
Database
Scopus
Product code
E9988
Fields related to this article
Industrial Engineering, Computer Engineering
Specializations related to this article
Data Mining, Artificial Intelligence
Journal
Computational Materials Science
University
Computation Institute - University of Chicago - Chicago - United States
Keywords
Data mining, Open source software, Machine learning, Materials informatics
DOI or digital identifier
https://doi.org/10.1016/j.commatsci.2018.05.018
ABSTRACT


As materials data sets grow in size and scope, the role of data mining and statistical learning methods to analyze these materials data sets and build predictive models is becoming more important. This manuscript introduces matminer, an open-source, Python-based software platform to facilitate data-driven methods of analyzing and predicting materials properties. Matminer provides modules for retrieving large data sets from external databases such as the Materials Project, Citrination, Materials Data Facility, and Materials Platform for Data Science. It also provides implementations for an extensive library of feature extraction routines developed by the materials community, with 47 featurization classes that can generate thousands of individual descriptors and combine them into mathematical functions. Finally, matminer provides a visualization module for producing interactive, shareable plots. These functions are designed in a way that integrates closely with machine learning and data analysis packages already developed and in use by the Python data science community. We explain the structure and logic of matminer, provide a description of its various modules, and showcase several examples of how matminer can be used to collect data, reproduce data mining studies reported in the literature, and test new methodologies.
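The featurization idea in the abstract, turning a material into numeric descriptors that a machine learning package can consume, can be illustrated with a minimal sketch. This is not matminer's API: the class `MeanElementProperty`, the helpers `parse_formula` and `featurize_rows`, and the small `ELEMENT_DATA` table are hypothetical names introduced here for illustration, and the sketch assumes simple formulas without parentheses. It uses only the Python standard library.

```python
import re

# Small illustrative lookup table (an assumption of this sketch, not a matminer data source):
# Pauling electronegativity and standard atomic mass (u) for a few elements.
ELEMENT_DATA = {
    "Na": {"electronegativity": 0.93, "atomic_mass": 22.990},
    "Cl": {"electronegativity": 3.16, "atomic_mass": 35.45},
    "Fe": {"electronegativity": 1.83, "atomic_mass": 55.845},
    "O": {"electronegativity": 3.44, "atomic_mass": 15.999},
}


def parse_formula(formula):
    """Parse a simple formula such as 'Fe2O3' into {element: amount}.

    Hypothetical helper: handles no parentheses, charges, or hydrates.
    """
    amounts = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*(?:\.\d+)?)", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


class MeanElementProperty:
    """Hypothetical featurizer: composition-weighted mean of elemental properties."""

    def __init__(self, properties):
        self.properties = list(properties)

    def feature_labels(self):
        # One descriptor name per requested elemental property.
        return [f"mean_{prop}" for prop in self.properties]

    def featurize(self, formula):
        # Weight each element's property value by its share of the formula unit.
        amounts = parse_formula(formula)
        total = sum(amounts.values())
        return [
            sum(n * ELEMENT_DATA[el][prop] for el, n in amounts.items()) / total
            for prop in self.properties
        ]


def featurize_rows(rows, featurizer, column="formula"):
    """Return copies of the input rows with the descriptor columns appended."""
    labels = featurizer.feature_labels()
    return [
        {**row, **dict(zip(labels, featurizer.featurize(row[column])))}
        for row in rows
    ]


if __name__ == "__main__":
    featurizer = MeanElementProperty(["electronegativity", "atomic_mass"])
    for row in featurize_rows([{"formula": "NaCl"}, {"formula": "Fe2O3"}], featurizer):
        print(row)
```

Keeping the descriptor names (`feature_labels`) separate from the values (`featurize`) lets the same object label the columns of a table, which is the kind of arrangement that makes such descriptors easy to hand to general-purpose data analysis tools.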

Conclusion


Performing materials informatics requires developing a data pipeline that encompasses data retrieval, feature extraction, and visualization prior to the actual machine learning step. The matminer software described in this manuscript is designed to facilitate the development, reuse, and reproducibility of data pipelines for materials informatics applications. We have designed matminer to connect the domain-specific aspects of materials informatics (i.e., materials data extraction, feature extraction of materials science concepts, common plotting routines) with the professional level machine learning and data processing software already developed and in use by the Python community. It is our hope that matminer can serve as a community repository for new materials data analytics techniques as they become available such that researchers can rapidly develop and test new methods against standard techniques, accelerating the use of data mining in the materials community at large.

