1. WO2020183428 - METHOD AND SYSTEM FOR MAPPING READ SEQUENCES USING A PANGENOME REFERENCE

Publication Number WO/2020/183428
Publication Date 17.09.2020
International Application No. PCT/IB2020/052294
International Filing Date 13.03.2020
IPC
G06F 17/18 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
17 Digital computing or data processing equipment or methods, specially adapted for specific functions
10 Complex mathematical operations
18 for evaluating statistical data
G06F 17/10 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
17 Digital computing or data processing equipment or methods, specially adapted for specific functions
10 Complex mathematical operations
G01N 33/48 2006.01
G PHYSICS
01 MEASURING; TESTING
N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
33 Investigating or analysing materials by specific methods not covered by groups G01N1/-G01N31/131
48 Biological material, e.g. blood, urine; Haemocytometers
G01N 33/50 2006.01
G PHYSICS
01 MEASURING; TESTING
N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
33 Investigating or analysing materials by specific methods not covered by groups G01N1/-G01N31/131
48 Biological material, e.g. blood, urine; Haemocytometers
50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
G16B 45/00 2019.01
G PHYSICS
16 INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
45 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
CPC
G16B 30/10
G PHYSICS
16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
30 ICT specially adapted for sequence analysis involving nucleotides or amino acids
10 Sequence alignment; Homology search
G16B 40/30
G PHYSICS
16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
40 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
30 Unsupervised data analysis
Applicants
  • TATA CONSULTANCY SERVICES LIMITED [IN]/[IN]
Inventors
  • VADDADI, Kavya Naga Sai
  • SIVADASAN, Naveen
  • SRINIVASAN, Rajgopal
Agents
  • KHAITAN & CO
Priority Data
201921010012 14.03.2019 IN
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) METHOD AND SYSTEM FOR MAPPING READ SEQUENCES USING A PANGENOME REFERENCE
(FR) PROCÉDÉ ET SYSTÈME DE MAPPAGE DE SÉQUENCES DE LECTURE À L'AIDE D'UNE RÉFÉRENCE DE PANGÉNOME
Abstract
(EN)
Genomic studies need a low-cost, efficient, and robust method for mapping read sequences to a genome variation graph. This disclosure relates to a method and system for mapping read sequences to a genome variation graph by constructing a subgraph using a novel combination of graph embedding and graph winnowing techniques. The system obtains a plurality of read sequences and a genome variation graph, and computes an embedding for the genome variation graph using a graph embedding technique. A graph index is then generated from the embedding and the genome variation graph using the graph winnowing technique. For each read sequence (r), the system constructs a corresponding subgraph and computes a gapped alignment score between the read and that subgraph. The result is a reliable, accurate, memory-efficient, and scalable system for mapping read sequences to a genome variation graph.
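The abstract does not spell out how graph winnowing works. As a rough illustration only, the sketch below shows classic winnowing (minimizer selection) on a plain linear sequence, which is the general idea that winnowing-based indexes build on. It is not the graph-based method of this application: the function name `winnow`, the parameters `k` and `w`, the use of lexicographic order in place of a hash, and the leftmost tie-break are all assumptions made for the example.

```python
from typing import List, Tuple


def winnow(seq: str, k: int, w: int) -> List[Tuple[int, str]]:
    """Select minimizers from a linear sequence (illustrative only).

    Every window of w consecutive k-mers contributes its smallest k-mer.
    Assumptions for this sketch: k-mers are compared lexicographically
    rather than by hash, and ties go to the leftmost k-mer in the window.
    Returns (position, k-mer) pairs in increasing position order.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected: List[Tuple[int, str]] = []
    # If there are fewer than w k-mers, no window exists and nothing is selected.
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # min() returns the first index among equal keys, i.e. the leftmost tie.
        offset = min(range(w), key=lambda j: window[j])
        pos = start + offset
        # Adjacent windows often share a minimizer; record each position once.
        if not selected or selected[-1][0] != pos:
            selected.append((pos, kmers[pos]))
    return selected


if __name__ == "__main__":
    for pos, kmer in winnow("ACGTACGT", k=3, w=2):
        print(pos, kmer)
```

Indexing only the selected k-mers, rather than every k-mer, is what makes winnowing-style indexes compact. How the application extends this to paths in a variation graph, and how the embedding is used, is described in the full specification rather than in this abstract.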
(FR)
Il existe une demande pour un procédé robuste, efficace et économique pour mapper des séquences de lecture avec un graphique de variations de génomes dans une étude génomique. La présente invention concerne un procédé et un système de mappage de séquences de lecture avec un graphique de variations de génomes par construction d'un sous-graphique à l'aide d'une nouvelle combinaison de techniques d'intégration de graphiques et de sélection de type winnowing de graphiques. Le système traite la pluralité de séquences de lecture obtenue et un graphique de variations de génomes pour construire le sous-graphique par calcul d'une intégration pour le graphique de variations de génomes à l'aide d'une technique d'intégration de graphique. En outre, un indice de graphique est généré pour le graphique de variations de génomes sur la base de l'intégration et du graphique de variations de génomes à l'aide de la technique de sélection de type winnowing de graphiques. Puis il calcule un score d'alignement espacé pour la séquence de lecture (r) avec son sous-graphique correspondant. Ainsi, il est possible d'obtenir un procédé fiable pour une séquence de lecture avec un système précis, efficace en mémoire et évolutif pour mapper des séquences de lecture avec un graphique de variations de génomes.
Latest bibliographic data on file with the International Bureau