Free download of the English article Geometrical and topological approaches to Big Data - Elsevier 2017

Persian title
رویکردهای هندسی و توپولوژیکی تا کلان داده
English title
Geometrical and topological approaches to Big Data
Pages in Persian article
0
Pages in English article
11
Publication year
2017
Publisher
Elsevier
English article format
PDF
Article type
ISI
Writing type
Research articles
References
Included
Indexing database
Scopus
Product code
E10675
Fields related to this article
Information technology engineering
Specializations related to this article
Information systems management
Journal
Future Generation Computer Systems
University
Department of Computer Science - Faculty of Electrical Engineering and Computer Science - Technical University of Ostrava - Czech Republic
Keywords
Big Data, Industry 4.0, topological data analysis, persistent homology, dimensionality reduction, Big Data visualization
DOI or digital identifier
https://doi.org/10.1016/j.future.2016.06.005
Abstract

Modern data science uses topological methods to find the structural features of data sets before further supervised or unsupervised analysis. Geometry and topology are very natural tools for analysing massive amounts of data, since geometry can be regarded as the study of distance functions. The mathematical formalism developed for incorporating geometric and topological techniques deals with point cloud data sets, i.e. finite sets of points, and adapts tools from the various branches of geometry and topology to study them. The point clouds are finite samples taken from a geometric object, perhaps with noise. Topology provides a formal language for qualitative mathematics, whereas geometry is mainly quantitative. Thus, in topology, we study the relationships of proximity or nearness without using distances. A map between topological spaces is called continuous if it preserves the nearness structures. Geometrical and topological methods are tools that allow us to analyse highly complex data. These methods create a summary or compressed representation of all of the data features to help rapidly uncover particular patterns and relationships in data. The idea of constructing summaries of entire domains of attributes involves understanding the relationship between topological and geometric objects constructed from data using various features. A common thread in various approaches for noise removal, model reduction, feasibility reconstruction, and blind source separation is to replace the original data with a lower dimensional approximate representation obtained via a matrix or multi-directional array factorization or decomposition. Besides those transformations, a significant challenge of feature summarization or subset selection methods for Big Data will be considered by focusing on scalable feature selection. The lower dimensional approximate representation is also used for Big Data visualization.
The cross-field between topology and Big Data will bring huge opportunities, as well as challenges, to Big Data communities. This survey aims at bringing together state-of-the-art research results on geometrical and topological methods for Big Data.
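
As an illustration of the lower dimensional approximate representation mentioned in the abstract, the following is a minimal sketch using a truncated singular value decomposition (SVD) on a synthetic noisy point cloud. It is not the authors' method: the function names, the synthetic data, and all parameter values (200 points, 10 ambient dimensions, 2 intrinsic dimensions, noise level 0.01) are assumptions chosen only for demonstration.

```python
import numpy as np


def low_rank_approximation(X, rank):
    """Best rank-`rank` approximation of X in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep only the `rank` largest singular values and their singular vectors.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


def make_noisy_point_cloud(n_points=200, ambient_dim=10, intrinsic_dim=2,
                           noise=0.01, seed=0):
    """Hypothetical test data: points on a 2-D linear subspace of R^10, plus noise."""
    rng = np.random.default_rng(seed)
    coords = rng.normal(size=(n_points, intrinsic_dim))
    basis = rng.normal(size=(intrinsic_dim, ambient_dim))
    clean = coords @ basis
    noisy = clean + noise * rng.normal(size=clean.shape)
    return clean, noisy


clean, noisy = make_noisy_point_cloud()
denoised = low_rank_approximation(noisy, rank=2)

# Distance to the noise-free cloud before and after the rank-2 replacement.
err_before = np.linalg.norm(noisy - clean)
err_after = np.linalg.norm(denoised - clean)
print(f"error before: {err_before:.4f}, after: {err_after:.4f}")
```

In this sketch the rank is assumed to be known in advance. On real data it would have to be estimated, for example from the decay of the singular values.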

Conclusion

The last few years have seen a great increase in the amount of data available to scientists, engineers, and researchers from many disciplines. Modern data science uses topological methods to find the structural features of data sets before further supervised or unsupervised analysis. Data volumes are already huge and continue to grow every day. Data sets with millions of objects and hundreds, if not thousands, of measurements are now commonplace in areas such as image analysis, computational finance, bioinformatics, and astrophysics. The variety of data being generated is also expanding, and the velocity of data generation is increasing because of the proliferation of the Internet of Things (IoT), that is, sensors connected to the Internet. This data allows businesses across all industries to gain real-time business insights. We presented motivational examples to show that large amounts of data call for a new model of data processing, one based on feature summarization rather than on classical feature selection. Big Data also confronts us with uncertainty, and geometrical and topological methods help us to address this problem. In this study, we reviewed the rise of geometrical and topological methods that can be used for Big Data processing, and we proposed a geometrical and topological view of the Big Data model.

