Free article download: Predicting the effectiveness of pattern-based entity extractor inference

Persian title
پیش بینی اثربخشی الگو مبتنی بر استنتاج استخراج کننده مولفه
English title
Predicting the effectiveness of pattern-based entity extractor inference
Pages (Persian article)
0
Pages (English article)
9
Year of publication
2016
Publisher
Elsevier
English article format
PDF
Product code
E2187
Related fields
Computer engineering
Related specializations
Computer programming
Journal
Applied Soft Computing
University
University of Trieste, Italy
Keywords
String similarity metrics, information extraction, genetic programming, hardness estimation

ABSTRACT


An essential component of any workflow that leverages digital data is the identification and extraction of relevant patterns from a data stream. We consider a scenario in which an extraction inference engine generates an entity extractor automatically from examples of the desired behavior, which take the form of user-provided annotations of the entities to be extracted from a dataset. We propose a methodology for predicting the accuracy of the extractor that may be inferred from the available examples. We describe several prediction techniques and analyze them experimentally in great depth, with reference to extractors consisting of regular expressions. The results suggest that reliable predictions for tasks of practical complexity may indeed be obtained quickly and without actually generating the entity extractor.
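
To make the quantity being predicted concrete, the following minimal Python sketch scores a regular-expression extractor against user-provided annotations. It is an illustration only: the sample text, the annotated spans, the candidate pattern, the helper names `extract_spans` and `f_measure`, and the choice of exact-span F-measure as the accuracy measure are all assumptions that this page does not confirm. The paper's contribution is estimating such a score without inferring the extractor; the sketch shows only the score itself.

```python
import re


def extract_spans(pattern, text):
    """Return the set of (start, end) spans matched by a regular expression."""
    return {m.span() for m in re.finditer(pattern, text)}


def f_measure(predicted, annotated):
    """Precision, recall and F1 of predicted spans, using exact span matching."""
    true_positives = len(predicted & annotated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical dataset: the user annotated the two phone-like entities.
text = "Call 555-1234 or 555-9876 before 2016-05-01."
annotated = {(5, 13), (17, 25)}

# Hypothetical candidate extractor; it is too loose and also matches
# the beginning of the date.
candidate = r"\d+-\d+"

predicted = extract_spans(candidate, text)
precision, recall, f1 = f_measure(predicted, annotated)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the candidate pattern finds both annotated entities but also one spurious match, so recall is perfect while precision is not.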


6. Concluding remarks


We have considered a scenario in which an extraction inference engine generates an extractor automatically from user-provided examples of the entities to be extracted from a dataset. We have addressed the problem of predicting the accuracy of the extractor that may be inferred from the available examples, under the requirement that the prediction be obtained in much less time than is needed to actually infer the extractor. This problem is highly challenging and we are not aware of any earlier proposal that addresses it. With reference to extractors consisting of regular expressions, we have proposed several techniques and analyzed them experimentally in depth. The results suggest that reliable predictions for tasks of practical complexity may indeed be obtained quickly and without actually generating the extractor.

