1. (WO2014022441) CLASSIFICATION OF NUCLEOTIDE SEQUENCES BY LATENT SEMANTIC ANALYSIS

Pub. No.: WO/2014/022441
International Application No.: PCT/US2013/052797
Publication Date: Fri Feb 07 00:59:59 CET 2014
International Filing Date: Wed Jul 31 01:59:59 CEST 2013
IPC: G06F 19/00
Applicants: SAYOOD, Khalid
WAY, Sam
NALBANTOGLU, Ozkan Ufuk
GARRITY, George
Inventors: SAYOOD, Khalid
WAY, Sam
NALBANTOGLU, Ozkan Ufuk
GARRITY, George
Title: CLASSIFICATION OF NUCLEOTIDE SEQUENCES BY LATENT SEMANTIC ANALYSIS
Abstract:
DNA sequences are analyzed using latent semantic analysis. A set of nucleotide sequences is received in which the set has a first number of sequences. A set of basis vectors is determined, in which the set has a second number of basis vectors, the second number being smaller than the first number. Each basis vector represents a specific combination of predetermined nucleotide segments. For each of the nucleotide sequences, an approximate representation of the nucleotide sequence is determined based on a combination of the basis vectors. For each pair of nucleotide sequences, a distance between the pair of nucleotide sequences is determined according to the distance between the approximate representations of the pair of nucleotide sequences. The set of nucleotide sequences is classified based on the distances between the pairs of nucleotide sequences.
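
The steps in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application, and several choices are assumptions made only for illustration: the "predetermined nucleotide segments" are taken to be all k-mers over A, C, G, T; the basis vectors are obtained from a truncated singular value decomposition (the usual construction in latent semantic analysis); the distance is Euclidean; and the classification step is a simple threshold grouping. The function names (kmer_counts, lsa_distances, group_by_threshold) and the example sequences are hypothetical.

```python
from itertools import product

import numpy as np


def kmer_counts(seq, k):
    """Count every length-k segment over the alphabet ACGT in seq."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # segments with other symbols are skipped
        if j is not None:
            counts[j] += 1
    return counts


def lsa_distances(seqs, k=2, rank=2):
    """Pairwise distances between rank-reduced representations of seqs."""
    if rank >= len(seqs):
        raise ValueError("rank must be smaller than the number of sequences")
    # Rows are sequences, columns are k-mer frequencies.
    X = np.array([kmer_counts(s, k) for s in seqs])
    X = X / X.sum(axis=1, keepdims=True)
    # Assumption: basis vectors are the leading right singular vectors.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[:rank]          # each row is a weighted combination of k-mers
    coords = X @ basis.T       # coefficients of each sequence in the basis
    # The basis is orthonormal, so distances between coefficient vectors equal
    # distances between the reconstructed approximate representations.
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def group_by_threshold(D, threshold):
    """Label sequences so that any pair closer than threshold shares a group."""
    n = len(D)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and D[i, j] <= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Hypothetical toy input: two AT-rich and two GC-rich sequences.
seqs = ["ATATATATATAT", "ATATTATATATA", "GCGCGCGCGCGC", "GCGGCGCGCGCC"]
D = lsa_distances(seqs, k=2, rank=2)
labels = group_by_threshold(D, threshold=0.5)
print(labels)
```

Here four sequences (the first number) are represented with two basis vectors (the second, smaller number). The threshold of 0.5 is arbitrary and suits only this toy input; any clustering or nearest-neighbor rule that operates on the distance matrix could be substituted for the grouping step.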