Free download of the English article "Mining and visualising contradictory data" - Springer 2017

Price

Free


Persian title

کاوش و تجسم اطلاعات ضد و نقیض

English title

Mining and visualising contradictory data

Pages (Persian article)

Pages (English article)

Year of publication

2017

Publisher

Springer

English article format

PDF

Type of article

ISI

Type of writing

METHODOLOGY

References

Yes

Database

Scopus

Product code

E10502

Fields related to this article

Industrial Engineering

Specialisations related to this article

Data Mining

Journal

Journal of Big Data

University

Computer Science Department - University of Nigeria - Abuja Building - Nigeria

Keywords

ConTra, comma-separated values, dataset, contradictions, contradictory data, mutually exclusive values

DOI

https://doi.org/10.1186/s40537-017-0100-9


Abstract

Big datasets are often stored in flat files and can contain contradictory data. Contradictory data undermines the soundness of the information drawn from a noisy dataset. Traditional tools such as pie charts and bar charts are overwhelmed when used to visually identify contradictory data in the multidimensional attribute values of a big dataset. This work explains the importance of identifying contradictions in a noisy dataset. It also examines how contradictory data in a large and noisy dataset can be mined and visually analysed. The authors developed ‘ConTra’, an open source application which applies a mutual exclusion rule to identify contradictory data in a comma-separated values (CSV) dataset. ConTra’s capability to identify contradictory data in datasets of different sizes is examined. The results show that ConTra can process large datasets when hosted on servers with fast processors. It is also shown that ConTra is 100% accurate in identifying contradictory data of objects whose attribute values do not conform to the mutual exclusion rule of a dataset in CSV format. Different approaches through which ConTra can mine and identify contradictory data are also presented.
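The mutual exclusion idea described in the abstract can be illustrated with a minimal Python sketch. This is not ConTra's code, and the listing does not describe ConTra's rule format or interface. The function name `find_contradictions`, the `TRUTHY` value set, the `id` column, and the sample attributes are all hypothetical, chosen only to show how a row could be flagged when more than one attribute of a mutually exclusive group is set.

```python
import csv
import io

# Hypothetical values treated as "attribute is set"; not taken from the paper.
TRUTHY = {"1", "true", "yes", "y"}


def find_contradictions(csv_text, exclusive_groups):
    """Return (row_number, object_id, conflicting_attributes) for every row in
    which more than one attribute of a mutually exclusive group is set."""
    contradictions = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for group in exclusive_groups:
            # Collect the attributes of this group that the row marks as set.
            set_attrs = [
                attr for attr in group
                if (row.get(attr) or "").strip().lower() in TRUTHY
            ]
            if len(set_attrs) > 1:
                contradictions.append((row_number, row.get("id"), set_attrs))
    return contradictions


# Illustrative data only: object a2 is both alive and dead,
# object a3 is both employed and unemployed.
SAMPLE = """id,alive,dead,employed,unemployed
a1,yes,no,yes,no
a2,yes,yes,no,yes
a3,no,yes,yes,yes
"""

# Each tuple names attributes assumed to be mutually exclusive.
GROUPS = [("alive", "dead"), ("employed", "unemployed")]

if __name__ == "__main__":
    for row_number, object_id, attrs in find_contradictions(SAMPLE, GROUPS):
        print(f"row {row_number} ({object_id}): conflicting attributes {attrs}")
```

A check of this kind only catches conflicts between attributes that have been declared mutually exclusive, which matches the limitation the authors state in their conclusion.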

Conclusion and the way forward

Contradictory data can lead to an unsound analysis, and eliminating its instances does not enable sound analysis when dealing with a noisy set of data. This work has identified novel approaches for mining and visualising contradictory data which exists in a noisy CSV dataset. It is hoped that future work will examine how objects, attributes and values can be mined from other dataset formats such as text, Resource Description Framework in Attributes (RDFa) and XML. This will enable the use of ConTra in visualising contradictory data in such data formats. Also, there is a need to combine the mutual exclusion technique (as presented in this work) with other contradiction detection techniques. This is because the use of the mutual exclusion technique is limited to contradictions which result from allocating conflicting values to mutually exclusive attributes. Arbitrary errors such as human errors in tabulating data, or numeric mismatches, are examples of contradictory data which ConTra is not designed for. The authors hope to introduce a newer version of ConTra with improved performance such that it can process tens of gigabytes (GB) of data in a short interval of time. They also hope to take advantage of parallel processor programming to enhance ConTra’s processing speed in its future versions.