Free download of the paper: Incremental semi-supervised classification of data streams

Persian title
طبقه بندی نیمه نظارتی افزایشی جریان های داده ها از طریق خود عامل انتخابی
English title
Incremental Semi-Supervised classification of data streams via self-representative selection
Pages in the Persian paper
0
Pages in the English paper
6
Year of publication
2016
Publisher
Elsevier
Format of the English paper
PDF
Product code
E307
Fields related to this paper
Computer engineering
Specializations related to this paper
Software engineering and artificial intelligence
Journal
Applied Soft Computing
University
Research Center for Intelligent Perception and Computation, Xi'an, China
Keywords
Incremental learning, semi-supervised classification, self-representative selection, data streams, big data
Abstract

Incremental learning has been developed for supervised classification, where knowledge is accumulated incrementally and represented in the learning process. However, labeling sufficient samples in each data chunk is costly, and incremental techniques are seldom discussed in the semi-supervised paradigm. In this paper, we propose an Incremental Semi-Supervised classification approach via Self-Representative Selection (IS3RS) for data stream classification that exploits both labeled and unlabeled dynamic samples. An incremental self-representative data selection strategy is proposed to find the most representative exemplars in each sequential data chunk. These exemplars are incrementally labeled to expand the training set, accumulating knowledge over time to benefit future prediction. Extensive experimental evaluations on several benchmarks demonstrate the effectiveness of the proposed framework.
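
The exemplar selection step described in the abstract can be illustrated with a minimal sketch. It is not the IS3RS formulation from the paper, which is not reproduced on this page. As an assumption, it swaps in a ridge-regularised self-representation problem, min_Z ||X - Z X||_F^2 + lam ||Z||_F^2, and ranks each sample by how much it contributes to reconstructing the rest of the chunk. The function name `select_exemplars` and the parameters `k` and `lam` are hypothetical.

```python
import numpy as np


def select_exemplars(chunk, k, lam=1.0):
    """Return indices of the k highest-scoring rows of `chunk`.

    Assumption: a ridge-regularised stand-in for self-representative
    selection. It minimises ||X - Z X||_F^2 + lam * ||Z||_F^2, whose
    closed form is Z = K (K + lam I)^-1 with K = X X^T.
    """
    X = np.asarray(chunk, dtype=float)
    K = X @ X.T                      # n x n Gram matrix of the chunk
    n = K.shape[0]
    # K commutes with (K + lam I)^-1, so solving (K + lam I) Z = K
    # yields the same symmetric matrix as the closed form above.
    Z = np.linalg.solve(K + lam * np.eye(n), K)
    # Column j of Z holds the weights with which sample j is used
    # to reconstruct every sample in the chunk.
    scores = np.linalg.norm(Z, axis=0)
    return np.argsort(scores)[::-1][:k]
```

A sample that plays no part in reconstructing the others receives a score of zero and is ranked last. A row-sparse penalty on Z would give a sharper selection than the ridge penalty used here, at the cost of an iterative solver.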

4. Conclusion

In this paper, we proposed a new incremental semi-supervised learning framework via representation learning for stream data classification. The key idea of the algorithm is to improve classification performance using information learned incrementally from the testing data. Representative learning is used to obtain informative exemplars of the stream data, and a co-training technique is used to label the exemplars. We investigated the effectiveness of the proposed algorithm on several benchmark datasets and compared it with state-of-the-art results on incremental learning. The results show that our method can find informative exemplars to enlarge the training set and gradually discover new classes. Moreover, our method achieves better classification results than its counterparts. The proposed algorithm has potential business applications in stock forecasting and other data mining tasks, so it could be embedded into business forecasting software to deal with large-scale data streams or "big" datasets. Future work will extend our method to a distributed version and implement it on a parallel computing platform.
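
The labeling step mentioned in the conclusion can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses a simplified agreement-based stand-in for co-training, in which two nearest-centroid classifiers are trained on two halves of the feature vector and an exemplar is added to the training set only when both views predict the same label. The names `centroid_predict` and `expand_training_set` are hypothetical, and the feature split assumes at least two features.

```python
import numpy as np


def centroid_predict(train_X, train_y, X):
    """Predict with a nearest-centroid rule (a hypothetical base learner)."""
    classes = np.unique(train_y)
    centroids = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every row of X to every class centroid.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


def expand_training_set(train_X, train_y, exemplars):
    """Append exemplars whose pseudo-labels agree across two feature views."""
    half = train_X.shape[1] // 2     # assumed split into two views
    view_a = centroid_predict(train_X[:, :half], train_y, exemplars[:, :half])
    view_b = centroid_predict(train_X[:, half:], train_y, exemplars[:, half:])
    agree = view_a == view_b         # keep only labels both views support
    new_X = np.vstack([train_X, exemplars[agree]])
    new_y = np.concatenate([train_y, view_a[agree]])
    return new_X, new_y
```

Calling `expand_training_set` once per incoming chunk, with that chunk's exemplars, grows the training set over time. Exemplars on which the views disagree are left unlabeled rather than added with a guessed label.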

