1. WO2020112421 - SCALABLE IMPLEMENTATIONS OF EXACT DISTINCT COUNTS AND MULTIPLE EXACT DISTINCT COUNTS IN DISTRIBUTED QUERY PROCESSING SYSTEMS

Publication Number WO/2020/112421
Publication Date 04.06.2020
International Application No. PCT/US2019/062085
International Filing Date 19.11.2019
IPC
G06F 16/2455 2019.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
24 Querying
245 Query processing
2455 Query execution
G06F 16/27 2019.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
G06F 16/2453 2019.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
24 Querying
245 Query processing
2453 Query optimisation
CPC
G06F 16/24532
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
24 Querying
245 Query processing
2453 Query optimisation
24532 of parallel queries
G06F 16/24545
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
24 Querying
245 Query processing
2453 Query optimisation
24534 Query rewriting; Transformation
24542 Plan optimisation
24545 Selectivity estimation or determination
G06F 16/24554
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
24 Querying
245 Query processing
2455 Query execution
24553 of query operations
24554 Unary operations; Data partitioning operations
G06F 16/24556
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
24 Querying
245 Query processing
2455 Query execution
24553 of query operations
24554 Unary operations; Data partitioning operations
24556 Aggregation; Duplicate elimination
G06F 16/2471
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
24 Querying
245 Query processing
2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
2471 Distributed queries
G06F 16/278
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
F ELECTRIC DIGITAL DATA PROCESSING
16 Information retrieval; Database structures therefor; File system structures therefor
20 of structured data, e.g. relational data
27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
278 Data partitioning, e.g. horizontal or vertical partitioning
Applicants
  • MICROSOFT TECHNOLOGY LICENSING, LLC [US]/[US]
Inventors
  • VISWANADHA, Sreenivasa
Agents
  • MINHAS, Sandip S.
Priority Data
16/205,984 30.11.2018 US
Publication Language English (EN)
Filing Language English (EN)
Title
(EN) SCALABLE IMPLEMENTATIONS OF EXACT DISTINCT COUNTS AND MULTIPLE EXACT DISTINCT COUNTS IN DISTRIBUTED QUERY PROCESSING SYSTEMS
(FR) MISES EN ŒUVRE ÉVOLUTIVES DE COMPTAGES DISTINCTS EXACTS ET DE MULTIPLES COMPTAGES DISTINCTS EXACTS DANS DES SYSTÈMES DE TRAITEMENT DE REQUÊTES DISTRIBUÉES
Abstract
(EN)
Scalable implementations of exact distinct counts and multiple exact distinct counts in distributed query processing systems are implemented via systems and devices. Distinct counts and multiple exact distinct counts for identifiers/values are performed based on keys. For distinct counts, datasets including data fields are sorted by values of fields and divided into balanced partitions in distributed servers. Subsets of fields with the same value are partitioned together. Key presence is determined for subsets on each partition, and the number of instances for the key is aggregated for exact distinct counts of values. For multiple distinct counts, fields of a dataset are combined by un-pivoting field columns. Compound keys are generated for combined fields from field identifiers of the combined fields and values of another field. Totals of unique values of the combined fields are determined for values in the counted field based on the compound keys.
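
Illustrative sketch (not part of the published record): the English abstract above outlines two procedures, and the single-process Python below is one possible reading of them, written only to make the data flow concrete. It assumes rows are (group, value) pairs, that partitions can be modelled as in-memory lists rather than distributed servers, and that a "compound key" is the pair (field identifier, value of the grouping field). All function names (partition_sorted, local_distinct, exact_distinct_count, multi_distinct_counts) are hypothetical and do not come from the application.

```python
from collections import Counter
from itertools import groupby


def partition_sorted(rows, num_partitions, key):
    """Sort rows by key and cut them into roughly equal contiguous partitions,
    never splitting a run of rows that share the same key."""
    ordered = sorted(rows, key=key)
    target = max(1, -(-len(ordered) // num_partitions))  # ceiling division
    partitions, current = [], []
    for _, run in groupby(ordered, key=key):
        run = list(run)
        too_big = len(current) + len(run) > target
        if current and too_big and len(partitions) < num_partitions - 1:
            partitions.append(current)
            current = []
        current.extend(run)
    if current:
        partitions.append(current)
    return partitions


def local_distinct(partition, key):
    """Within one partition, record each (group, value) key once."""
    counts = Counter()
    for (group, _value), _run in groupby(partition, key=key):
        counts[group] += 1  # the key is present, however many rows repeat it
    return counts


def exact_distinct_count(rows, num_partitions=3):
    """Exact number of distinct values per group for (group, value) rows."""
    def key(row):
        return (row[0], row[1])

    total = Counter()
    for part in partition_sorted(rows, num_partitions, key):
        # Equal keys never span partitions, so partial counts simply add up.
        total.update(local_distinct(part, key))
    return dict(total)


def multi_distinct_counts(records, group_field, counted_fields, num_partitions=3):
    """Un-pivot the counted columns, then count under compound keys
    (field identifier, value of the grouping field)."""
    unpivoted = [
        ((field, rec[group_field]), rec[field])
        for rec in records
        for field in counted_fields
        if rec.get(field) is not None
    ]
    return exact_distinct_count(unpivoted, num_partitions)
```

Under these assumptions, calling exact_distinct_count on the rows ("US","a"), ("US","b"), ("US","a"), ("FR","a") yields 2 distinct values for "US" and 1 for "FR". Because every run of equal keys stays in one partition, the per-partition counts can be summed without a second de-duplication pass, which is the property the abstract relies on.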
(FR)
Des mises en œuvre évolutives de comptages distincts exacts et de multiples comptages distincts exacts dans des systèmes de traitement de requêtes distribuées sont réalisées au moyen de systèmes et de dispositifs. Des comptages distincts et des multiples comptages distincts exacts pour des identifiants/valeurs sont effectués d'après des clés. Pour les comptages distincts, des ensembles de données comprenant des champs de données sont triés en fonction des valeurs de champs et divisés en partitions équilibrées dans des serveurs distribués. Des sous-ensembles de champs ayant la même valeur sont divisés ensemble. La présence d'une clé est déterminée pour les sous-ensembles sur chaque partition, et le nombre d'instances pour la clé est agrégé pour les comptages distincts exacts de valeurs. Pour les multiples comptages distincts, les champs d'un ensemble de données sont combinés en dépivotant les colonnes de champs. Des clés composées sont générées pour des champs combinés à partir des identifiants des champs combinés et des valeurs d'un autre champ. Les totaux des valeurs uniques des champs combinés sont déterminés pour les valeurs dans le champ compté d’après les clés composées.