Patent claim parsing can contribute to many patent-related applications, such as patent search, information extraction, machine translation, and summarization. However, patent claim parsing is difficult because of the special structure of patent claims. To overcome this difficulty, the challenges facing patent claim parsing were first investigated, and the peculiarities of claim syntax that obstruct dependency parsing were highlighted. To handle these peculiarities, this study proposes a new two-level parser in which a conventional parser is embedded. A patent claim is pre-processed to remove these peculiarities before it is passed to the conventional parser. The process is based on a new dependency-based syntax called Independent Claim Segment Dependency Syntax (ICSDS). The two-level parser has demonstrated promising improvements over the conventional parser in both effectiveness and efficiency for patent claim parsing.
In this study, six claim syntax peculiarities that increase the difficulty of parsing are highlighted: (1) claim template, (2) post attribute past participle, (3) parenthetical sentence, (4) complex noun phrase as sentence, (5) recursion, and (6) coordination. These peculiarities make claims long; in particular, the last four lead to long-distance dependencies. A new two-level parser is proposed for patent claim parsing. It is designed to improve the adaptability of a conventional parser, e.g. the Stanford parser. The conventional parser (the first level) is invoked by a higher-level parser (the second level), which handles the peculiarities of claim syntax. For peculiarity (1), a trimming process is adopted to filter out non-informative content. For peculiarity (2), a POS correction process is adopted to change the past form into the past participle. For the last four peculiarities, a new dependency syntax called Independent Claim Segment Dependency Syntax (ICSDS) is proposed. To guarantee the efficiency of the proposed parser, a segmentation strategy is adopted. The segmentation and the subsequent assembly are executed by the ICSDS-based parser, and the conventional parser is invoked only when processing an individual claim segment. Theoretically, the distance between two segments, counted in segments, is much smaller than the distance between two words located in those two segments, counted in words. Thus, the dependency distance becomes shorter and the dependency is easier to capture.
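The two-level idea and the distance argument can be illustrated with a minimal Python sketch. It is not the ICSDS-based parser itself: the punctuation-based segmentation rule, the `chain_parser` stand-in for a conventional parser, the placeholder assembly that attaches each segment to the preceding one, and all function names are assumptions made purely for illustration.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical interface: a "conventional parser" maps a token list to
# (head_index, dependent_index) pairs. A real system would wrap an actual parser.
Parser = Callable[[List[str]], List[Tuple[int, int]]]


def segment_claim(claim: str) -> List[List[str]]:
    """Split a claim into token lists at commas, semicolons and colons.

    This splitting rule is an assumption, not the segmentation used in the study.
    """
    parts = re.split(r"[,;:]", claim)
    return [part.split() for part in parts if part.strip()]


def chain_parser(tokens: List[str]) -> List[Tuple[int, int]]:
    """Stand-in for a conventional parser: attach each word to the previous word."""
    return [(i - 1, i) for i in range(1, len(tokens))]


def two_level_parse(claim: str, inner: Parser = chain_parser):
    """Call the inner parser once per segment, then link segments together."""
    segments = segment_claim(claim)
    # First level: the conventional parser only ever sees one short segment.
    intra = [inner(segment) for segment in segments]
    # Second level: placeholder assembly, each segment attached to the previous one.
    inter = [(i - 1, i) for i in range(1, len(segments))]
    return segments, intra, inter


def word_distance(segments: List[List[str]], seg_a: int, seg_b: int) -> int:
    """Distance in words between the first words of two segments."""
    def start(k: int) -> int:
        return sum(len(segment) for segment in segments[:k])
    return abs(start(seg_b) - start(seg_a))


claim = ("A device comprising: a housing having a first end, "
         "a sensor mounted in the housing, "
         "and a controller coupled to the sensor")
segments, intra, inter = two_level_parse(claim)
print(len(segments))                    # number of segments
print(word_distance(segments, 0, 3))    # distance counted in words
print(abs(3 - 0))                       # same span counted in segments
```

For this example claim, the first and last segments are 3 segments apart but 15 words apart, which is the kind of shortening the segmentation strategy relies on.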