Free download of the English article: Query based approach for referrer field analysis of log data using web mining techniques - Springer 2018

Title (translated from Persian)
Query-based approach for referrer field analysis of log data using web mining techniques for ontology improvement
English title
Query based approach for referrer field analysis of log data using web mining techniques for ontology improvement
Pages in Persian article
0
Pages in English article
12
Year of publication
2018
Publisher
Springer
Format of English article
PDF
Article type
Research article
References
Included
Product code
E9312
Fields related to this article
Computer Engineering
Specializations related to this article
Algorithms and Computation
Journal
International Journal of Information Technology
University
Department of Computer Engineering - Punjabi University - Patiala - India
Keywords
web mining, web usage mining, ontology, log file, user sessions, clustering, knowledge discovery
DOI
https://doi.org/10.1007/s41870-017-0063-2
Abstract


This work presents a new framework showing how web mining supports information retrieval using an ontology and web log files. Ontologies play a major role in the retrieval of semantic data. The researcher has already constructed a string instrument ontology using Protégé 5.0, which helps refine web search in the music domain. The researcher proposes a novel approach to ontology management in which the ontology is continuously updated with knowledge discovered from analysis of the log file (specifically the data in the referrer field), in the form of new concepts and new relationships between new and/or existing concepts. The goal of this study is to use data mining algorithms to analyse and characterise the visitors and the visited pages of the website. To this end, the researcher collected the 'guitar' web access log of a guitar-selling website, covering 363 days of 2016. After pre-processing this log file, two new feature sets were extracted from it and stored in two files, 'File 1' and 'File 2'; File 2 is also referred to as the query log. Clustering (EM), association rule mining (Apriori) and sequential pattern (n-gram) algorithms were then applied to suggest new concepts with which to continuously update and improve the existing ontology.
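The referrer-field step described in the abstract can be illustrated with a minimal sketch. It assumes the log is in Apache Combined Log Format and that the referring search engine carries the user's query in a URL parameter named `q`. Neither detail is stated in the abstract, and the sample line, host names and function name are hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referrer_query_terms(line, param="q"):
    """Return the lowercase search terms found in the referrer URL, or []."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return []  # line does not follow the assumed format
    referrer = match.group("referrer")
    if referrer in ("", "-"):
        return []  # direct visit, no referrer recorded
    query = parse_qs(urlsplit(referrer).query)
    terms = []
    for value in query.get(param, []):
        terms.extend(value.lower().split())
    return terms


# Hypothetical log line, not taken from the 'guitar' log.
SAMPLE_LINE = (
    '203.0.113.7 - - [12/Mar/2016:10:15:32 +0100] '
    '"GET /courses/chords.html HTTP/1.1" 200 5120 '
    '"http://www.example.com/search?q=acoustic+guitar+chords" '
    '"Mozilla/5.0"'
)

if __name__ == "__main__":
    print(referrer_query_terms(SAMPLE_LINE))
```

Collecting these terms for every cleaned log entry would give something comparable to the query log ('File 2'), although the paper's actual feature set may differ.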

9 Conclusion and future work


Constructing an ontology and continuously improving it requires integrating and updating knowledge from varied sources and, in the case of the Semantic Web, specifically from web content belonging to a particular domain. In this study, the researcher has attempted to show the potential impact and use of web usage mining in updating an ontology. The researcher illustrated this impact on the string instrument ontology in the music domain, using the online guitar-selling website maintained by Amar Grifu from France. The researcher had already constructed a new string instrument ontology from scratch using the Protégé 5.0 ontology editor, and showed how the knowledge discovered by analysing a specific type of log file data (the referrer field) from this domain can be immensely useful for updating the ontology from time to time. To demonstrate this, clustering (EM), association rule (Apriori) and sequential pattern (n-gram) mining algorithms were applied to the 'guitar' log file of the online guitar-selling website. The original 'guitar' log file contained 24,965 transactions; after cleaning, 23,626 transactions remained, with 12,334 unique users and 12,760 unique sessions. On this cleaned log file the researcher applied clustering, grouping pages into 7 classes and visits into 6 classes, and found some golden nuggets. (1) Pages in English account for 94.79% of clicks. (2) The largest share of visits comes from Europe (30.83%, mostly from France). (3) The largest share of downloads comes from European visitors (35.54%). (4) The largest share of clicks on software and course pages comes from Asia (47.48%, mostly from India). The reasons for these results are discussed earlier, in the clustering analysis phase.
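As a rough illustration of the sequential pattern (n-gram) step mentioned above, the following sketch counts adjacent word pairs across a list of search queries. The queries are invented for illustration and the function name is hypothetical; the paper's own n-gram procedure and thresholds are not given in this excerpt, so this should be read as one plausible reading, not as the author's method.

```python
from collections import Counter


def ngram_counts(queries, n=2):
    """Count contiguous n-word sequences across all queries."""
    counts = Counter()
    for query in queries:
        terms = query.lower().split()
        # Slide a window of n terms over the query.
        for i in range(len(terms) - n + 1):
            counts[tuple(terms[i:i + n])] += 1
    return counts


# Invented queries, standing in for entries of a query log.
SAMPLE_QUERIES = [
    "acoustic guitar chords",
    "electric guitar chords",
    "acoustic guitar strings",
]

counts = ngram_counts(SAMPLE_QUERIES)

if __name__ == "__main__":
    for bigram, count in counts.most_common():
        print(" ".join(bigram), count)
```

Frequent pairs such as "acoustic guitar" could then be reviewed as candidate concepts or relationships for the ontology; whether a pair is added would remain a manual or rule-based decision.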

