Free download of the English paper "A Tool for Statistical Analysis on Network Big Data" - IEEE 2017

Persian title

ابزاری برای تحلیل های آماری در شبکه کلان داده ها

English title

A Tool for Statistical Analysis on Network Big Data

Page count of the Persian paper

0

Page count of the English paper

5

Publication year

2017

Publisher

IEEE

Format of the English paper

PDF

Product code

E10391

Related fields

Information Technology Engineering

Related specializations

Information Systems Management

Venue

International Workshop on Database and Expert Systems Applications

Affiliation

AT&T Labs - Research, USA. Research work conducted while visiting AT&T Labs, USA. C. Ordonez's current affiliation: University of Houston, USA.

DOI

https://doi.org/10.1109/DEXA.2017.23

Abstract

Due to advances in parallel file systems for big data (e.g., HDFS) and larger-capacity hardware (multicore CPUs, large RAM), it is now feasible to manage and query network data in a parallel DBMS supporting SQL, but performing statistical analysis remains a challenge. On the statistics side, the R language is popular, but it has important limitations: R is limited by main memory, R works in a different address space from query processing, R cannot analyze large disk-resident data sets efficiently, and R has no data management capabilities. Moreover, some R libraries allow R to work in parallel, but without data management capabilities. Considering the challenges and limitations described above, we present a system that allows combining SQL queries and R functions in a seamless manner. We argue that a parallel DBMS and the R runtime are two different systems that benefit from a low-level integration. Our parallel DBMS is built on top of HDFS, programmed in Java and C++, with a flexible scale-out architecture, whereas R is programmed purely in C. The user or developer can make calls in both directions: (1) R calling SQL, to evaluate analytic queries or retrieve data from materialized views (transferring result tables in RAM in a streaming fashion and analyzing them in R), and, vice versa, (2) SQL calling R, allowing SQL to convert relational tables to matrices or vectors and perform complex computations on them. We give a summary of network monitoring tasks at AT&T and present specific programming examples showing language calls in both directions (i.e., R calls SQL and SQL calls R).
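The two call directions named in the abstract can be illustrated with a minimal sketch. This is not the authors' system: Python stands in for R, an in-memory SQLite database stands in for the parallel DBMS, and the table `link_stats`, the helper `stream_column`, and the aggregate `sample_stdev` are hypothetical names introduced only for illustration.

```python
import sqlite3
import statistics

# Assumption: an in-memory SQLite database stands in for the parallel DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE link_stats (link_id INTEGER, latency_ms REAL)")
conn.executemany(
    "INSERT INTO link_stats VALUES (?, ?)",
    [(1, 10.0), (1, 14.0), (2, 30.0), (2, 34.0), (2, 38.0)],
)


# Direction 1: the analysis language calls SQL and consumes the result
# set as a stream, one row at a time.
def stream_column(connection, query):
    for (value,) in connection.execute(query):
        yield value


count, total = 0, 0.0
for latency in stream_column(conn, "SELECT latency_ms FROM link_stats"):
    count += 1
    total += latency
overall_mean = total / count


# Direction 2: SQL calls a function defined in the analysis language.
# Here the function is registered as a user-defined aggregate.
class SampleStdev:
    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per input row of the current group.
        self.values.append(value)

    def finalize(self):
        # Called once per group; the sample stdev needs two or more values.
        if len(self.values) < 2:
            return None
        return statistics.stdev(self.values)


conn.create_aggregate("sample_stdev", 1, SampleStdev)
per_link = conn.execute(
    "SELECT link_id, sample_stdev(latency_ms) "
    "FROM link_stats GROUP BY link_id ORDER BY link_id"
).fetchall()
```

In this sketch the first direction never holds the full result set in memory, while the second lets a single SQL query delegate a statistical computation to the host language in an intermediate step.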

CONCLUSIONS

We presented a system that enables fast bi-directional data transfer between a parallel DBMS and the R runtime. In one direction, our system converts SQL relational tables into R data frames or matrices. In the opposite direction, an R data frame or matrix is converted into a relational table, with a transformed data frame being the most common case. Our system is built on top of a careful mapping between atomic data types. The system efficiently constructs data structures (i.e., non-atomic data types) in RAM in one pass over a data set. The net gain is that an R script can call an SQL query or materialized view to analyze the result set. Conversely, an SQL query (not a script or longer embedded SQL program) can call an R function to perform some mathematical computation in an intermediate step. Our initial prototype opens several research directions. We want to define functional constructs in the R programming language to transform relational tables into data frames. In a similar manner, we want to study alternatives for transforming a matrix into an SQL object (flat table, subscript/value triples, or binary object). Propagating insertions to materialized views and then to a mathematical model computed by R is a challenging problem. Finally, we need to conduct a detailed performance study on the AT&T network data warehouse.
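The one-pass table-to-columns conversion and the subscript/value triple representation mentioned above can be sketched as follows. This is an illustrative Python approximation under stated assumptions, not the paper's implementation: SQLite stands in for the DBMS, and the type mapping `SQL_TO_HOST`, the helpers `table_to_columns` and `matrix_to_triples`, and the tables `router_stats` and `matrix_triples` are all hypothetical.

```python
import sqlite3

# Assumed mapping between SQL atomic types and host-language atomic types.
SQL_TO_HOST = {"INTEGER": int, "REAL": float, "TEXT": str}


def table_to_columns(connection, table, schema):
    """Build one typed column vector per attribute in a single pass.

    `table` and the names in `schema` are trusted identifiers in this
    sketch; they are interpolated directly into the query text.
    """
    names = [name for name, _ in schema]
    casts = [SQL_TO_HOST[sql_type] for _, sql_type in schema]
    columns = {name: [] for name in names}
    cursor = connection.execute(f"SELECT {', '.join(names)} FROM {table}")
    for row in cursor:
        for name, cast, value in zip(names, casts, row):
            # SQL NULL is kept as None rather than cast.
            columns[name].append(None if value is None else cast(value))
    return columns


def matrix_to_triples(matrix):
    """Flatten a dense matrix into 1-based (row, column, value) triples."""
    return [
        (i, j, value)
        for i, row in enumerate(matrix, start=1)
        for j, value in enumerate(row, start=1)
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE router_stats (router TEXT, packets INTEGER, loss_rate REAL)"
)
conn.executemany(
    "INSERT INTO router_stats VALUES (?, ?, ?)",
    [("r1", 100, 0.01), ("r2", 250, None)],
)

schema = [("router", "TEXT"), ("packets", "INTEGER"), ("loss_rate", "REAL")]
columns = table_to_columns(conn, "router_stats", schema)

# Opposite direction: store a matrix as a relational table of triples.
triples = matrix_to_triples([[1.0, 0.5], [0.5, 1.0]])
conn.execute("CREATE TABLE matrix_triples (i INTEGER, j INTEGER, v REAL)")
conn.executemany("INSERT INTO matrix_triples VALUES (?, ?, ?)", triples)
stored = conn.execute("SELECT COUNT(*) FROM matrix_triples").fetchone()[0]
```

The triple layout is only one of the three alternatives the conclusions list. A flat table with one column per matrix column, or a single binary object, would trade query flexibility against storage and transfer cost differently.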
