1. (WO2018218705) METHOD FOR RECOGNIZING NETWORK TEXT NAMED ENTITY BASED ON NEURAL NETWORK PROBABILITY DISAMBIGUATION

Pub. No.: WO/2018/218705
International Application No.: PCT/CN2017/089135
Publication Date: Fri Dec 07 00:59:59 CET 2018
International Filing Date: Wed Jun 21 01:59:59 CEST 2017
IPC: G06F 17/27
Applicants: CHINA UNIVERSITY OF MINING AND TECHNOLOGY
中国矿业大学
Inventors: ZHOU, Yong
周勇
LIU, Bing
刘兵
HAN, Zhaoyu
韩兆宇
WANG, Zhongqiu
王重秋
Title: METHOD FOR RECOGNIZING NETWORK TEXT NAMED ENTITY BASED ON NEURAL NETWORK PROBABILITY DISAMBIGUATION
Abstract:
A method for recognizing named entities in network text based on neural network probability disambiguation. The method comprises: performing word segmentation on an unlabeled corpus and using Word2Vec to extract word vectors; converting a sample corpus into a word feature matrix and windowing it; building and training a deep neural network whose output layer applies a softmax function for normalization, so as to obtain a probability matrix giving, for each word, the probability of each named entity category; and re-windowing the probability matrix and using a conditional random field model to carry out disambiguation, so as to obtain the final named entity annotation. Because network text contains network vocabulary and newly coined words, the method provides incremental learning of word vectors that does not change the structure of the neural network. Because network text also has nonstandard grammatical structure and many wrongly written characters, the method applies probability disambiguation. As a result, higher recognition accuracy can be achieved.
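The windowing, softmax, and disambiguation steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The label set (O, B, I), the raw scores, and the hand-set transition table are all hypothetical, and a plain Viterbi decode over fixed transition scores stands in for the trained conditional random field model named in the abstract.

```python
import math

def window(rows, size, pad):
    """For each position, concatenate the `size` rows centered on it."""
    half = size // 2
    padded = [pad] * half + list(rows) + [pad] * half
    return [
        [x for row in padded[i:i + size] for x in row]
        for i in range(len(rows))
    ]

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def viterbi(prob_matrix, transition):
    """Most likely label sequence given per-word probabilities
    and label-to-label transition scores (log domain)."""
    n_labels = len(prob_matrix[0])
    score = [math.log(p) for p in prob_matrix[0]]
    back = []
    for probs in prob_matrix[1:]:
        prev_best, new_score = [], []
        for j in range(n_labels):
            best_i = max(range(n_labels),
                         key=lambda i: score[i] + transition[i][j])
            prev_best.append(best_i)
            new_score.append(score[best_i] + transition[best_i][j]
                             + math.log(probs[j]))
        back.append(prev_best)
        score = new_score
    label = max(range(n_labels), key=lambda i: score[i])
    path = [label]
    for prev_best in reversed(back):
        label = prev_best[label]
        path.append(label)
    return path[::-1]

# Hypothetical labels: 0 = O (outside), 1 = B (entity start), 2 = I (inside).
# Hypothetical raw network outputs for a three-word sentence.
raw_scores = [[2.0, 0.1, 0.1], [0.2, 1.0, 1.2], [0.1, 0.3, 2.0]]
prob_matrix = [softmax(row) for row in raw_scores]

# Re-windowing: each word sees its neighbors' probabilities as context.
context = window(prob_matrix, 3, [0.0, 0.0, 0.0])

# Assumed transition table: forbid O -> I, since an entity must start with B.
transition = [[0.0, 0.0, -1e9], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

greedy = [max(range(3), key=lambda k: row[k]) for row in prob_matrix]
decoded = viterbi(prob_matrix, transition)
```

With these made-up scores, picking the most probable label per word yields the sequence O, I, I, which starts an entity without a B label. The sequence-level decode resolves this to O, B, I. The sketch does not use the windowed `context` rows in decoding; in the method described above they would serve as input features to the conditional random field model.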