1. WO2020092276 - VIDEO RECOGNITION USING MULTIPLE MODALITIES

Publication Number WO/2020/092276
Publication Date 07.05.2020
International Application No. PCT/US2019/058413
International Filing Date 29.10.2019
IPC
G06K 9/00 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
G06K 9/46 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
46 Extraction of features or characteristics of the image
G06K 9/62 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
62 Methods or arrangements for recognition using electronic means
CPC
G06K 9/00335
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
00335 Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
G06K 9/00718
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
00624 Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
00711 Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
00718 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
G06K 9/00744
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
00624 Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
00711 Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
00744 Extracting features from the video content, e.g. video "fingerprints", or characteristics, e.g. by automatic extraction of representative shots or key frames
G06K 9/4628
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
46 Extraction of features or characteristics of the image
4604 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes, intersections
4609 by matching or filtering
4619 Biologically-inspired filters, e.g. receptive fields
4623 with interaction between the responses of different filters
4628 Integrating the filters into a hierarchical structure
G06K 9/6271
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
9 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
62 Methods or arrangements for recognition using electronic means
6267 Classification techniques
6268 relating to the classification paradigm, e.g. parametric or non-parametric approaches
627 based on distances between the pattern to be recognised and training or reference patterns
6271 based on distances to prototypes
G06N 3/08
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
Applicants
  • MICROSOFT TECHNOLOGY LICENSING, LLC [US]/[US]
Inventors
  • VAEZI JOZE, Hamidreza
  • ABAVISANI, Mahdi
Agents
  • MINHAS, Sandip S.
  • ADJEMIAN, Monica
  • BARKER, Doug
  • CHATTERJEE, Aaron C.
  • CHEN, Wei-Chen Nicholas
  • CHOI, Daniel
  • CHURNA, Timothy
  • DINH, Phong
  • EVANS, Patrick
  • GABRYJELSKI, Henry
  • GOLDSMITH, Micah P.
  • GUPTA, Anand
  • HINOJOSA-SMITH, Brianna L.
  • HWANG, William C.
  • JARDINE, John S.
  • LEE, Sunah
  • LEMMON, Marcus
  • MARQUIS, Thomas
  • MEYERS, Jessica
  • ROPER, Brandon
  • SPELLMAN, Steven
  • SULLIVAN, Kevin
  • SWAIN, Cassandra T.
  • TABOR, Ben
  • WALKER, Matt
  • WIGHT, Stephen A.
  • WISDOM, Gregg
  • WONG, Ellen
  • WONG, Thomas S.
  • ZHANG, Hannah
  • TRAN, Kimberly
Priority Data
16/287,113 27.02.2019 US
62/754,360 01.11.2018 US
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) VIDEO RECOGNITION USING MULTIPLE MODALITIES
(FR) RECONNAISSANCE VIDÉO À L'AIDE DE MODALITÉS MULTIPLES
Abstract
(EN)
Implementations described herein disclose a multi-modality video recognition system. Specifically, the multi-modality video recognition system is configured to train a plurality of classifier networks, each of the classifier networks being trained with a different one of a plurality of video streams, wherein each of the plurality of different classifier networks includes multiple intermediate layers; determine correlation matrices of related intermediate layers of each of the plurality of different classifier networks; and align the correlation matrices of the related intermediate layers of each of the plurality of different classifier networks.
(FR)
Des modes de réalisation de l'invention concernent un système de reconnaissance vidéo à modalités multiples. En particulier, le système de reconnaissance vidéo à modalités multiples est configuré pour : apprendre une pluralité de réseaux de classificateurs, chaque réseau de classificateurs étant appris avec un flux différent de la pluralité de flux vidéo, chaque réseau de la pluralité de différents réseaux de classifieurs comprenant de multiples couches intermédiaires ; déterminer les matrices de corrélation des couches intermédiaires associées de chaque réseau de la pluralité des différents réseaux de classificateurs ; et aligner les matrices de corrélation des couches intermédiaires associées de chaque classificateur de la pluralité des différents réseaux de classificateurs.
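The abstract names two operations on intermediate layers: determining correlation matrices and aligning them across networks. The record does not say how either is computed, so the following is only a minimal NumPy sketch under stated assumptions: each layer's activations are flattened to a (positions, channels) array, the "correlation matrix" is taken to be the cosine similarity between positions, and "alignment" is measured as the mean squared difference between two such matrices. The function names, shapes, and the `rgb_features`/`depth_features` arrays are hypothetical and stand in for two modality streams.

```python
import numpy as np


def correlation_matrix(features: np.ndarray) -> np.ndarray:
    """Cosine-similarity matrix between positions of one intermediate layer.

    features: array of shape (positions, channels), an assumed flattening of
    a layer's activations. Returns an array of shape (positions, positions).
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero rows
    return unit @ unit.T


def alignment_loss(feat_a: np.ndarray, feat_b: np.ndarray) -> float:
    """Mean squared difference between two layers' correlation matrices.

    This is an assumed alignment measure, not one taken from the record.
    The two layers may have different channel counts but must share the
    same number of positions so the matrices have equal shape.
    """
    corr_a = correlation_matrix(feat_a)
    corr_b = correlation_matrix(feat_b)
    return float(np.mean((corr_a - corr_b) ** 2))


# Hypothetical activations from related layers of two modality networks.
rng = np.random.default_rng(0)
rgb_features = rng.normal(size=(16, 64))    # 16 positions, 64 channels
depth_features = rng.normal(size=(16, 32))  # 16 positions, 32 channels
loss = alignment_loss(rgb_features, depth_features)
```

Because the matrices are indexed by position rather than by channel, this formulation lets networks with different layer widths be compared; in a training setup, a term like `loss` would presumably be added to each network's classification objective.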