(WO2018102438) LANGUAGE IDENTIFICATION FOR TEXT STRINGS

Pub. No.: WO/2018/102438
International Application No.: PCT/US2017/063753
Publication Date: Fri Jun 08 01:59:59 CEST 2018
International Filing Date: Thu Nov 30 00:59:59 CET 2017
IPC: G06F 17/20
Applicants: EBAY INC.
Inventors: GUPTA, Akshay
JOSHI, Hrishikesh
KOHLI, Saiyam
AGGARWAL, Vidit
Title: LANGUAGE IDENTIFICATION FOR TEXT STRINGS
Abstract:
Aspects of the present disclosure include a system comprising a machine-readable storage medium storing at least one program and computer-implemented methods for detecting a language of a text string. Consistent with some embodiments, the method may include applying multiple language identification models to a text string. Each language identification model provides a predicted language of the text string and a confidence score associated with the predicted language. The method may further include weighting each associated confidence score based on historical performance of the corresponding language identification model in predicting languages of other text strings. The method may further include selecting a predicted language of the text string from among the multiple predicted languages provided by the multiple language identification models based on a result of the weighting of the confidence score associated with the particular predicted language.
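
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the applicants' implementation. It assumes that "historical performance" is a per-model accuracy fraction, that weighting means multiplying each confidence score by that fraction, and that the prediction with the highest weighted score is selected. The names `Prediction`, `historical_weight`, `select_language`, and the model labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    model: str         # hypothetical model label
    language: str      # predicted language code
    confidence: float  # model's own confidence, assumed to lie in [0, 1]


def historical_weight(correct: int, total: int) -> float:
    """Assumed weight: fraction of past text strings the model labeled correctly."""
    return correct / total if total else 0.0


def select_language(predictions, weights):
    """Return (language, weighted score) for the highest weighted confidence.

    Assumption: weighting multiplies each confidence by the model's
    historical weight; models with no recorded history get weight 0.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    best = max(predictions, key=lambda p: p.confidence * weights.get(p.model, 0.0))
    return best.language, best.confidence * weights.get(best.model, 0.0)


# Illustrative inputs only; none of these figures come from the patent.
predictions = [
    Prediction("model_a", "en", 0.90),
    Prediction("model_b", "de", 0.80),
    Prediction("model_c", "en", 0.60),
]
weights = {
    "model_a": historical_weight(60, 100),
    "model_b": historical_weight(90, 100),
    "model_c": historical_weight(75, 100),
}

language, score = select_language(predictions, weights)
print(language, round(score, 2))
```

In this made-up example the most confident raw prediction ("en" from `model_a`) loses to "de" from `model_b`, because `model_b` has the better assumed track record. Other combination rules, such as summing weighted scores per language, would also fit the abstract's wording.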