1. (WO2018126325) LEARNING DOCUMENT EMBEDDINGS WITH CONVOLUTIONAL NEURAL NETWORK ARCHITECTURES

Pub. No.: WO/2018/126325
International Application No.: PCT/CA2018/050012
Publication Date: Fri Jul 13 01:59:59 CEST 2018
International Filing Date: Sat Jan 06 00:59:59 CET 2018
IPC: G06N 3/02
G06F 17/20
G06N 3/04
G06N 3/08
Applicants: THE TORONTO-DOMINION BANK
Inventors: VOLKOVS, Maksims
POUTANEN, Tomi Johan
Title: LEARNING DOCUMENT EMBEDDINGS WITH CONVOLUTIONAL NEURAL NETWORK ARCHITECTURES
Abstract:
A document analysis system trains a document embedding model configured to receive a set of word embeddings for an ordered set of words in a document and generate a document embedding for the document. The document embedding is a representation of the document in a latent space that characterizes the document with respect to properties such as structure, content, and sentiment. The document embedding may represent a prediction of a set of words that follow the last word in the ordered set of words of the document. The document embedding model may be associated with a convolutional neural network (CNN) architecture that includes one or more convolutional layers. The CNN architecture of the document embedding model allows the document analysis system to overcome various difficulties of existing document embedding models, and allows the document analysis system to easily process variable-length documents that include a variable number of words.
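
Illustrative sketch (not part of the published record): the abstract does not say how the CNN turns a variable number of word embeddings into a single document embedding. One common way to do this, assumed here for illustration only, is a convolutional layer followed by global max pooling over word positions. The function names, the zero-padding of short documents, the ReLU activation, and all dimensions below are hypothetical choices, and training the model to predict the words that follow is left out.

```python
import random


def conv1d(word_vecs, filters, width):
    """Apply each filter to every window of `width` consecutive word vectors.

    word_vecs: list of n word embeddings, each a list of d floats.
    filters:   list of k filters, each a flat list of width * d weights.
    Returns one list of k ReLU activations per window position.
    """
    out = []
    for start in range(len(word_vecs) - width + 1):
        # Flatten the window of word vectors into a single list of width * d values.
        window = [x for vec in word_vecs[start:start + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out


def global_max_pool(feature_maps):
    """Keep the largest activation of each filter across all positions."""
    return [max(column) for column in zip(*feature_maps)]


def embed_document(word_vecs, filters, width):
    """Map a document of any length to a vector with one entry per filter."""
    dim = len(filters[0]) // width
    if len(word_vecs) < width:
        # Assumption: pad very short documents with zero vectors.
        word_vecs = word_vecs + [[0.0] * dim for _ in range(width - len(word_vecs))]
    return global_max_pool(conv1d(word_vecs, filters, width))


if __name__ == "__main__":
    random.seed(0)
    DIM, WIDTH, NUM_FILTERS = 8, 3, 4  # hypothetical sizes
    filters = [[random.uniform(-1, 1) for _ in range(WIDTH * DIM)]
               for _ in range(NUM_FILTERS)]
    short_doc = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
    long_doc = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(40)]
    # Both embeddings have NUM_FILTERS entries despite the different lengths.
    print(len(embed_document(short_doc, filters, WIDTH)))
    print(len(embed_document(long_doc, filters, WIDTH)))
```

Because pooling collapses the position axis, the output size depends only on the number of filters, which is one way a convolutional model can accept documents with a variable number of words.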