1. US20190121849 - WORD REPLACEABILITY THROUGH WORD VECTORS

Office United States of America
Application Number 16165785
Application Date 19.10.2018
Publication Number 20190121849
Publication Date 25.04.2019
Publication Kind A1
IPC
G06F 17/27 Automatic analysis, e.g. parsing, orthographic correction (Electric digital data processing; Handling natural language data)
G06N 99/00 Subject matter not provided for in other groups of this subclass (Computer systems based on specific computational models)
G06F 17/30 Information retrieval; Database structures therefor (Electric digital data processing)
CPC
G06F 17/277
G06F 17/30979
G06N 99/005
Applicants MachineVantage, Inc.
Inventors Peter Taraba
Ratnakar Dev
Anantha K. Pradeep
Title
(EN) WORD REPLACEABILITY THROUGH WORD VECTORS
Abstract
(EN)

Provided are systems, methods, and devices for providing word replaceability information through word vectors. Within a database system, a text document is received and then processed into a number of sub-sentences or sub-segments. The processing involves delimiting one or more sentences within the text document by one or more punctuation marks. Next, a number of n-gram combinations are generated according to co-appearances of n-grams within the sub-sentences. Distance metrics are determined between the n-gram co-appearances for each n-gram combination. Finally, word replaceability information is provided for one or more words or n-grams within the text document, based on the distance metrics.
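
The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not specify the n-gram size, how the co-appearance vectors are built, or which distance metric is used. The sketch assumes unigrams (n = 1), raw co-appearance counts as vector components, and cosine distance. All function names (split_sub_sentences, co_appearance_vectors, cosine_distance, replaceability) are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def split_sub_sentences(text):
    """Delimit the text at punctuation marks and tokenize each sub-sentence."""
    parts = re.split(r"[.,;:!?]+", text.lower())
    return [part.split() for part in parts if part.strip()]


def co_appearance_vectors(sub_sentences):
    """Count, for every word, how often each other word shares a sub-sentence with it.

    Assumption: unigrams only, and each pair is counted once per sub-sentence.
    """
    vectors = defaultdict(Counter)
    for tokens in sub_sentences:
        for a, b in combinations(sorted(set(tokens)), 2):
            vectors[a][b] += 1
            vectors[b][a] += 1
    return vectors


def cosine_distance(u, v):
    """Cosine distance between two sparse count vectors (an assumed metric)."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return 1.0 - dot / norm if norm else 1.0


def replaceability(word, vectors):
    """Rank all other words by distance to `word`; a smaller distance
    suggests the two words appear in more similar contexts."""
    target = vectors.get(word, Counter())
    return sorted(
        (cosine_distance(target, vector), other)
        for other, vector in vectors.items()
        if other != word
    )


text = "I drink coffee, I drink tea. The cat sleeps."
vectors = co_appearance_vectors(split_sub_sentences(text))
ranking = replaceability("coffee", vectors)
```

In this toy input, "coffee" and "tea" co-appear with the same words ("i" and "drink"), so "tea" ranks as the closest candidate for "coffee", while "cat" shares no context with it and sits at the maximum distance. A real system would need a much larger corpus and a considered choice of n and of the distance metric.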