1. WO2020139982 - DETERMINING PHONETIC SIMILARITY USING MACHINE LEARNING

Publication Number WO/2020/139982
Publication Date 02.07.2020
International Application No. PCT/US2019/068625
International Filing Date 26.12.2019
IPC
G06F 40/44 2020.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
40 Processing or translation of natural language
42 Data-driven translation
44 Statistical methods, e.g. probability models
G06F 40/45 2020.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
40 Processing or translation of natural language
42 Data-driven translation
45 Example-based machine translation; Alignment
G06F 40/47 2020.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
40 Processing or translation of natural language
42 Data-driven translation
47 Machine-assisted translation, e.g. using translation memory
G06F 40/51 2020.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
40 Processing or translation of natural language
51 Translation evaluation
G06F 40/58 2020.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
40 Processing or translation of natural language
58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
CPC
G06F 40/129
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
10 Text processing
12 Use of codes for handling textual entities
126 Character encoding
129 Handling non-Latin characters, e.g. kana-to-kanji conversion
G06F 40/263
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
20 Natural language analysis
263 Language identification
G06F 40/53
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
40 Handling natural language data
40 Processing or translation of natural language
53 Processing of non-Latin text
G06K 9/6215
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
62 Methods or arrangements for recognition using electronic means
6201 Matching; Proximity measures
6215 Proximity measures, i.e. similarity or distance measures
G06N 20/00
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
20 Machine learning
G09B 19/06
G PHYSICS
09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
19 Teaching not covered by other main groups of this subclass
06 Foreign languages
Applicants
  • PAYPAL, INC. [US]/[US]
Inventors
  • UPADHYAY, Rushik
  • LAKSHMIPATHY, Dhamodharan
  • RAMESH, Nandhini
  • KAULAGI, Aditya
Agents
  • CHEN, Tom
Priority Data
16/232,619 26.12.2018 US
Publication Language English (EN)
Filing Language English (EN)
Title
(EN) DETERMINING PHONETIC SIMILARITY USING MACHINE LEARNING
(FR) DÉTERMINATION DE LA SIMILARITÉ PHONÉTIQUE EN UTILISANT L'APPRENTISSAGE AUTOMATIQUE
Abstract
(EN)
Techniques are disclosed relating to determining phonetic similarity using machine learning. The techniques include accessing training data that includes a first set of words of a native language and a second set of words corresponding to verified transliterations of the first set of words from the native language to a target language. They further include generating a set of new transliterations of the first set of words from the native language to the target language, and storing comparison information based on a comparison between words from the second set of words and words from the set of new transliterations. Finally, a similarity score between a first word of the target language and a second word of the target language is determined based on the comparison information.
(FR)
L'invention concerne des techniques relatives à la détermination d'une similarité phonétique en utilisant l'apprentissage automatique. Les techniques comprennent l'accès à des données d'entraînement qui comprennent un premier ensemble de mots d'une langue maternelle et un deuxième ensemble de mots correspondant à des translittérations vérifiées du premier ensemble de mots de la langue maternelle vers une langue cible. Elles comprennent en outre la génération d'un ensemble de nouvelles translittérations du premier ensemble de mots à partir de la langue maternelle vers la langue cible et le stockage d'informations de comparaison sur la base d'une comparaison entre des mots du deuxième ensemble de mots et le mot issu de l'ensemble de nouvelles translittérations du premier ensemble de mots. Enfin, un score de similarité est déterminé entre un premier mot de la langue cible et un deuxième mot de la langue cible sur la base des informations de comparaison.
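The flow described in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: it assumes that the "comparison information" is a table of character substitutions observed between verified and newly generated transliterations, and that the similarity score is a weighted edit distance in which observed substitutions cost less. The function names, the `discount` parameter, and the sample words are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def collect_comparison_info(verified, generated):
    """Count character substitutions between each verified transliteration
    and the newly generated transliteration of the same source word.

    Assumption: this substitution table stands in for the abstract's
    "comparison information"; the application may store something else.
    """
    counts = Counter()
    for v, g in zip(verified, generated):
        matcher = SequenceMatcher(None, v, g, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Only equal-length replaced spans are aligned character by character.
            if tag == "replace" and (i2 - i1) == (j2 - j1):
                for a, b in zip(v[i1:i2], g[j1:j2]):
                    if a != b:
                        counts[frozenset((a, b))] += 1
    return counts


def similarity(word1, word2, counts, discount=0.5):
    """Return a score in [0, 1] from a weighted edit distance.

    Substitutions present in `counts` cost `discount` instead of 1.0
    (an illustrative weighting, not taken from the source).
    """
    def sub_cost(a, b):
        if a == b:
            return 0.0
        return discount if counts[frozenset((a, b))] > 0 else 1.0

    prev = [float(j) for j in range(len(word2) + 1)]
    for i, a in enumerate(word1, 1):
        cur = [float(i)]
        for j, b in enumerate(word2, 1):
            cur.append(min(prev[j] + 1.0,               # delete a
                           cur[j - 1] + 1.0,            # insert b
                           prev[j - 1] + sub_cost(a, b)))  # substitute
        prev = cur
    longest = max(len(word1), len(word2))
    return 1.0 if longest == 0 else 1.0 - prev[-1] / longest


# Hypothetical training data: verified vs. newly generated transliterations.
verified = ["mohammed", "sita"]
generated = ["mohammad", "seta"]
info = collect_comparison_info(verified, generated)

# "e"/"a" was observed as a variation, so this pair scores higher
# than a pair that differs by an unobserved substitution.
print(similarity("ahmed", "ahmad", info))
print(similarity("ahmed", "ahmod", info))
```

Under these assumptions, two target-language words that differ only by a variation already seen in the training comparison are treated as closer than words that differ by an unseen one.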