Free download of the English article: Query based approach for referrer field analysis of log data using web mining techniques - Springer 2018

Title

Query based approach for referrer field analysis of log data using web mining techniques for ontology improvement

Publication year

2018

Publisher

Springer

English article format

PDF

Article type

Research article

References

Included

Product code

E9312

Related disciplines

Computer Engineering

Related specializations

Algorithms and Computation

Journal

International Journal of Information Technology

Affiliation

Department of Computer Engineering - Punjabi University - Patiala - India

Keywords

Web mining, web usage mining, ontology, log file, user sessions, clustering, knowledge discovery

DOI

https://doi.org/10.1007/s41870-017-0063-2

Abstract

This work presents a new framework showing how web mining supports information retrieval, using an ontology and web log files. Ontologies play a major role in the retrieval of semantic data. The researcher has already constructed a string instrument ontology using Protégé 5.0, which helps refine web searches in the music domain. The researcher proposes a novel approach to ontology management in which the ontology is continuously updated with knowledge discovered from the analysis of the log file (specifically the data in the referrer field), in the form of new concepts and new relationships between new and/or existing concepts. The goal of this study is to use data mining algorithms to analyse the visitors and visited pages of the website and to characterise or distinguish them. To this end, the researcher collected a 'guitar' web access log covering 363 days of 2016 from a guitar-selling website. After pre-processing this log file, two new feature sets were extracted from it and stored in two files, 'File 1' and 'File 2'; File 2 is also known as the query log. Clustering (EM), association rule mining (Apriori) and sequential pattern (n-gram) algorithms were then applied to suggest new concepts with which to continuously update and improve the existing ontology.
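The abstract describes building a query log from the referrer field of the access log. As a rough illustration only, the sketch below pulls a search query out of one log line. It assumes the log uses the Apache Combined Log Format and that the search term sits in a `q` URL parameter; the sample line, the host `www.example.com`, and the function name `extract_query` are hypothetical and are not taken from the paper.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def extract_query(log_line, param="q"):
    """Return the lower-cased search query found in the referrer, or None."""
    match = LOG_PATTERN.match(log_line)
    if match is None:
        return None  # line does not follow the assumed format
    referrer = match.group("referrer")
    if referrer in ("", "-"):
        return None  # direct visit, no referrer recorded
    values = parse_qs(urlparse(referrer).query).get(param)
    return values[0].lower() if values else None


# Hypothetical log line, invented for illustration.
sample = (
    '203.0.113.7 - - [12/Mar/2016:10:15:32 +0100] '
    '"GET /courses/index.html HTTP/1.1" 200 5120 '
    '"http://www.example.com/search?q=acoustic+guitar+chords" "Mozilla/5.0"'
)
print(extract_query(sample))
```

Collecting the non-empty results over all cleaned log lines would give something like the query log ('File 2') mentioned above, although the paper's actual feature extraction may differ.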

9 Conclusion and future work

Constructing an ontology and continuously improving it requires integrating and updating knowledge from varied sources and, in the case of the Semantic Web, specifically from web content belonging to a particular domain. In this study, the researcher has attempted to show the potential impact and use of web usage mining for updating an ontology. The researcher illustrated this impact on the string instrument ontology in the music domain by considering an online guitar-selling website maintained by Amar Grifu from France. The researcher had already constructed a new string instrument ontology from scratch using the Protégé 5.0 ontology editor, and showed how the knowledge discovered from the analysis of a specific type of log file data (the referrer field) in this domain can be very useful for updating this ontology from time to time. To demonstrate this, clustering (EM), association rule (Apriori) and sequential pattern (n-gram) mining algorithms were applied to the 'guitar' log file of the website. The original 'guitar' log file contains 24,965 transactions; after cleaning, 23,626 transactions remain, with 12,334 unique users and 12,760 sessions. On this cleaned log file, the researcher applied clustering, grouping pages and visits into 7 and 6 classes, respectively, and obtained several noteworthy findings. (1) The percentage of clicks on English-language pages is 94.79%. (2) The largest share of visits comes from Europe (30.83%, mostly from France). (3) The largest share of downloads comes from European visitors (35.54%). (4) The largest share of clicks on software and course pages comes from Asia (47.48%, mostly from India). The reasons for these results are discussed earlier, in the clustering analysis phase.
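To make the sequential pattern (n-gram) step more concrete, the following is a minimal sketch of counting word n-grams over search queries. It is an illustration under stated assumptions, not the paper's implementation: the three queries are invented, and the helper names `ngrams` and `count_ngrams` are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token windows of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(queries, n=2):
    """Count word n-grams across a collection of query strings."""
    counts = Counter()
    for query in queries:
        counts.update(ngrams(query.lower().split(), n))
    return counts


# Invented queries, for illustration only.
queries = [
    "acoustic guitar chords",
    "bass guitar chords",
    "acoustic guitar tuner",
]
counts = count_ngrams(queries, n=2)
for gram, freq in counts.most_common():
    print(" ".join(gram), freq)
```

Frequent n-grams such as word pairs that recur across many queries could then be reviewed as candidate concepts or relationships for the ontology; which thresholds and tools the author actually used is not stated in this excerpt.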