Extracting illegal-fact elements from lawsuit cases depends on specialized professional knowledge. To address this, an automatic BERT-based method for extracting illegal-fact elements from lawsuit cases is proposed. First, domain knowledge is constructed and Google's pre-trained BERT language model is further trained on it, yielding model parameters fitted to lawsuit-case data and pre-trained Chinese word embedding vectors. These serve as the model input, and the resulting contextual representations improve the semantic quality of the word embeddings. The text is then encoded by a recurrent convolutional neural network, which captures the information that plays a key role in the text classification task. Finally, focal loss is adopted as the loss function so that training focuses on hard-to-classify samples. Illegal-fact elements are extracted by classifying text labels. Experiments show that the method achieves an F1 score of 86.41%, outperforming the other methods compared. Injecting domain knowledge into the model can also improve extraction accuracy.
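To illustrate why a focal loss shifts attention toward hard samples, the sketch below compares it with ordinary cross-entropy on one easy and one hard example. It is a minimal illustration only: the abstract does not state the exact formulation used, so the single-label softmax form, the values `gamma=2.0` and `alpha=1.0`, and the function names `softmax`, `cross_entropy`, and `focal_loss` are all assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for a single example."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss: -alpha * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed values, not taken from the paper.
    With gamma=0 and alpha=1 this reduces to cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical logits for a 3-label problem; the true label is index 0.
easy = [4.0, 0.0, 0.0]   # model is confident and correct
hard = [0.2, 0.0, 0.0]   # model is barely above chance

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.4f}")
```

The ratio FL/CE equals (1 - p_t)^gamma, so a well-classified example is scaled down far more than a poorly classified one, and the hard examples dominate the total loss.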
Cite this article
Cui Bin, Zou Lei, Xu Mingyue. Automatic illegal fact extraction of lawsuit case based on BERT[J]. Science Technology and Engineering, 2021, 21(9): 3669-3675.