|1. (WO2018218705) METHOD FOR RECOGNIZING NETWORK TEXT NAMED ENTITY BASED ON NEURAL NETWORK PROBABILITY DISAMBIGUATION|
|Applicants:||CHINA UNIVERSITY OF MINING AND TECHNOLOGY|
|Title:||METHOD FOR RECOGNIZING NETWORK TEXT NAMED ENTITY BASED ON NEURAL NETWORK PROBABILITY DISAMBIGUATION|
A method for recognizing named entities in network text based on neural network probability disambiguation. The method comprises: performing word segmentation on an unlabeled corpus and using Word2Vec to extract word vectors; converting a sample corpus into a word feature matrix and windowing it; building and training a deep neural network whose output layer applies a softmax function for normalization, so as to obtain a probability matrix giving, for each word, the probability of each named entity category; and re-windowing the probability matrix and using a conditional random field model to carry out disambiguation, so as to obtain the final named entity annotation. Because network text contains network vocabulary and new vocabulary, the method provides incremental learning of word vectors that does not change the structure of the neural network; to deal with the nonstandard grammatical structure and frequent wrongly written characters of network text, it uses probability disambiguation. The method is thereby intended to achieve higher recognition accuracy.
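The windowing, softmax normalization, and re-windowing steps named in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the toy word vectors, the fixed `TOY_WEIGHTS` standing in for a trained network, the zero padding, the window half-width of 1, and the helper names `window`, `softmax`, and `toy_scores` are all assumptions made for illustration. The Word2Vec training and the conditional random field step are omitted; the final `crf_inputs` rows only show the shape of what such a model would receive.

```python
import math

def window(rows, half_width, pad):
    """Concatenate each row with its half_width neighbours on both sides."""
    padded = [pad] * half_width + list(rows) + [pad] * half_width
    size = 2 * half_width + 1
    return [
        [value for row in padded[i:i + size] for value in row]
        for i in range(len(rows))
    ]

def softmax(scores):
    """Normalize raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical fixed weights standing in for a trained deep network:
# 3 entity categories x 6 windowed features.
TOY_WEIGHTS = [
    [0.5, -0.2, 0.1, 0.4, -0.3, 0.2],
    [-0.1, 0.3, 0.2, -0.4, 0.6, 0.1],
    [0.2, 0.1, -0.5, 0.3, 0.0, -0.2],
]

def toy_scores(feature_row):
    """Linear stand-in for the network's pre-softmax output layer."""
    return [sum(w * x for w, x in zip(row, feature_row)) for row in TOY_WEIGHTS]

# Toy 2-dimensional word vectors for a 3-word sentence (assumed values).
word_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Step 1: window the word feature matrix (each word plus one neighbour per side).
features = window(word_vectors, half_width=1, pad=[0.0, 0.0])

# Step 2: softmax output gives one probability row per word.
prob_matrix = [softmax(toy_scores(row)) for row in features]

# Step 3: re-window the probability matrix to form per-word inputs
# for a sequence model such as a conditional random field.
crf_inputs = window(prob_matrix, half_width=1, pad=[0.0, 0.0, 0.0])
```

In this sketch each word ends up with a 9-value row in `crf_inputs`: its own category probabilities plus those of its left and right neighbours, which is the context a disambiguation step could use to resolve uncertain labels.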