1. US20200243077 - UNSUPERVISED KEYWORD SPOTTING AND WORD DISCOVERY FOR FRAUD ANALYTICS

Note: Text based on automatic optical character recognition processes. Only the PDF version has legal value.


Claims

1. A computer-implemented method comprising:
generating, by a computer, a plurality of audio frames from a plurality of audio signals;
clustering, by the computer, one or more features of each audio frame according to a modeling algorithm, thereby generating one or more models for each frame;
extracting, by the computer, posterior probabilities for each of the one or more features extracted from the audio frames according to the one or more models;
receiving, by the computer, from a client computer a keyword indicator for a keyword to query in the audio signals, the keyword comprising one or more words;
receiving, by the computer, from the client computer a named entity indicator for a named entity to be redacted from the query, wherein the computer nullifies the posterior probability of each frame containing the named entity;
calculating, by the computer, for each audio frame containing the keyword, a first similarity score and a second similarity score, the first similarity score and the second similarity score of an audio frame calculated using a model selected for the respective frame based on the posterior probability of the audio frame;
storing, by the computer, into a queue, a subset of audio frames having a second similarity score comparatively higher than a corresponding first similarity score, the subset containing a review-threshold amount of audio frames; and
generating, by the computer, a list of audio segments of the audio signals matching the keyword, the list of audio segments containing at least one of the audio frames in the subset.
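The framing, clustering, and named-entity nullification steps of claim 1 can be sketched as follows. This is a minimal illustration, not the patent's actual modeling algorithm: the frame length, the one-dimensional mean-amplitude feature, the two fixed centroids, and the softmax-over-distance posteriors are all assumptions made for the sake of a runnable example.

```python
import math

def make_frames(signal, frame_len=4):
    # Split one audio signal (a list of samples) into fixed-length frames.
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def posteriors(frame, centroids):
    # Soft cluster assignment: a softmax over negative distances to each
    # centroid stands in for the per-frame posterior probabilities that
    # the claim extracts "according to the one or more models".
    feat = sum(frame) / len(frame)              # toy one-dimensional feature
    scores = [math.exp(-abs(feat - c)) for c in centroids]
    total = sum(scores)
    return [s / total for s in scores]

def frame_posteriors(signal, centroids, redacted_frames=()):
    # Frames whose indexes appear in redacted_frames contain the named
    # entity; their posteriors are nullified, as the claim requires.
    result = []
    for idx, frame in enumerate(make_frames(signal)):
        if idx in redacted_frames:
            result.append([0.0] * len(centroids))
        else:
            result.append(posteriors(frame, centroids))
    return result

probs = frame_posteriors([1, 2, 1, 2, 9, 8, 9, 8], centroids=[1.5, 8.5],
                         redacted_frames={1})
```

A redacted frame contributes an all-zero posterior vector, so downstream similarity scoring (which selects a model per frame based on these posteriors) naturally skips it.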
2. The method of claim 1, wherein the first similarity score is a lower-bound dynamic time warping score calculated by the computer using a lower-bound dynamic time-warping algorithm.
3. The method of claim 1, wherein the second similarity score is a segmental dynamic time warping score calculated by the computer using a segmental dynamic time-warping algorithm.
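Claims 2 and 3 name the two scores: a cheap lower-bound dynamic time warping score and a more expensive (segmental) dynamic time warping score. The sketch below shows why pairing them is useful: the lower bound never exceeds the true DTW distance under a band constraint, so it can prune candidates before the full computation runs. LB_Keogh is one common choice of lower bound; the patent does not specify which lower-bound algorithm is used, and the band radius here is an assumption.

```python
def dtw(a, b):
    # Classic dynamic time warping distance between two sequences,
    # using absolute difference as the local cost.
    inf = float("inf")
    d = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[len(a)][len(b)]

def lb_keogh(query, candidate, r=1):
    # LB_Keogh-style lower bound: distance of the candidate to the
    # query's upper/lower envelope within a band of radius r. Candidate
    # points inside the envelope contribute zero cost.
    total = 0.0
    for i, c in enumerate(candidate):
        window = query[max(0, i - r):i + r + 1]
        lo, hi = min(window), max(window)
        if c > hi:
            total += c - hi
        elif c < lo:
            total += lo - c
    return total

q = [1.0, 2.0, 3.0, 2.0, 1.0]
c = [1.0, 2.5, 3.5, 2.0, 1.0]
lb, full = lb_keogh(q, c), dtw(q, c)   # lb is the cheaper filter score
```

In the two-pass scheme of claim 1, a frame whose second (full) score is comparatively higher than its first (lower-bound) score is queued for review.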
4. The method of claim 1, wherein the keyword indicator received from the client computer includes one or more timestamps indicating to the computer when instances of the keyword occur in at least one audio signal.
5. The method of claim 1, wherein the named entity indicator received from the client computer includes one or more timestamps indicating to the computer when instances of the named entity occur in at least one audio signal.
6. The method of claim 1, further comprising receiving, by the computer, from the client computer a review-threshold indicator indicating the review-threshold amount of audio frames in the subset.
7. The method of claim 1, further comprising transmitting, by the computer, to the client computer the list of audio segments matching the keyword, the list of audio segments containing the review-threshold amount of audio segments.
8. The method of claim 1, further comprising identifying, by the computer, for each segment in the list, one or more timestamps indicating when instances of the keyword occur in the segment, wherein the list transmitted to the client computer includes each timestamp associated with the one or more segments of the list.
9. The method of claim 1, further comprising generating, by the computer, one or more segments for each of the audio signals, wherein each segment comprises at least one frame.
10. The method of claim 9, wherein the one or more segments of each audio signal are generated according to a voice-activated detection module configured to detect a segment.
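Claim 10's voice-activated detection module can be approximated by a minimal energy-based detector: consecutive frames whose energy exceeds a threshold are grouped into one segment. The frame length, the mean-absolute-amplitude energy measure, and the threshold value are illustrative assumptions; real detectors use richer features.

```python
def vad_segments(signal, frame_len=4, energy_threshold=2.0):
    # Minimal energy-based voice activity detection: return segments as
    # (start_frame, end_frame) pairs covering runs of "active" frames.
    segments, start = [], None
    n_frames = len(signal) // frame_len
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        energy = sum(abs(s) for s in frame) / frame_len
        if energy >= energy_threshold and start is None:
            start = i                      # segment opens on first active frame
        elif energy < energy_threshold and start is not None:
            segments.append((start, i - 1))  # segment closes on silence
            start = None
    if start is not None:
        segments.append((start, n_frames - 1))
    return segments

# Silence, two active frames, silence -> one segment spanning frames 1-2.
sig = [0, 0, 0, 0, 5, 6, 5, 6, 4, 5, 4, 5, 0, 0, 0, 0]
segs = vad_segments(sig)
```

Each resulting segment comprises at least one frame, matching the structure of claim 9.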
11. A computer-implemented method comprising:
segmenting, by a computer, a first audio signal into a first set of one or more audio segments, and a second audio signal into a second set of one or more audio segments;
generating, by the computer, sets of one or more paths for each audio segment in the first set of audio segments, and sets of one or more paths for each audio segment in the second set of audio segments;
calculating, by the computer, based on lower-bound dynamic time-warping algorithm, a similarity score for each path of each audio segment of the first set of audio segments, and for each path of each audio segment of the second set of audio segments; and
identifying, by the computer, at least one similar acoustic region between the first set of segments and the second set of audio segments, based upon comparing the similarity scores of each path of each segment of the first set of audio segments against the similarity scores of each path of each segment of the second set of audio segments.
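The segment-and-compare structure of claim 11, with the fixed-length paths of claim 12, can be sketched as below. The path length, the negative-L1 similarity score, the shared reference pattern, and the tolerance are all stand-in assumptions; the claim itself computes path scores with a lower-bound dynamic time-warping algorithm.

```python
def paths(segment, path_len=3):
    # Claim 12: each path is a fixed-length portion of an audio segment.
    return [segment[i:i + path_len]
            for i in range(0, len(segment) - path_len + 1, path_len)]

def path_score(path, reference):
    # Toy similarity score: negative L1 distance to a reference pattern
    # (stands in for the lower-bound DTW score of the claim).
    return -sum(abs(a - b) for a, b in zip(path, reference))

def similar_regions(first_seg, second_seg, reference, tol=1.0):
    # Compare every path score in the first segment against every path
    # score in the second; pairs whose scores differ by at most tol mark
    # a candidate similar acoustic region as (query index, test index).
    first = [path_score(p, reference) for p in paths(first_seg)]
    second = [path_score(p, reference) for p in paths(second_seg)]
    return [(i, j) for i, s1 in enumerate(first)
            for j, s2 in enumerate(second) if abs(s1 - s2) <= tol]

ref = [1, 2, 1]
regions = similar_regions([1, 2, 1, 9, 9, 9], [5, 5, 5, 1, 2, 1], ref)
```

Here the first path of the first signal and the second path of the second signal both match the reference closely, so that pair is flagged as a similar acoustic region.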
12. The method of claim 11, wherein each path is a fixed-length portion of an audio segment.
13. The method of claim 11, further comprising:
clustering, by the computer, one or more features of each path of each segment in a similar acoustic region according to a modeling algorithm, thereby generating one or more models for each path; and
extracting, by the computer, posterior probabilities for each of the one or more features extracted from the audio paths according to the one or more models, wherein the similarity score for each respective path is calculated using a model selected for the respective path based on the posterior probability of the respective path.
14. The method of claim 13, further comprising receiving, by the computer, from a client computer a named entity indicator indicating to the computer an instance of the named entity in at least one audio signal, wherein the computer nullifies the posterior probability of each path containing the named entity for clustering.
15. The method of claim 11, wherein comparing the similarity scores further comprises:
selecting, by the computer, from the second set of segments a first test segment at a first time index and defined by a first time window;
comparing, by the computer, the similarity scores for the paths of the first test segment against the similarity scores for the paths of at least one query segment of the first set of segments, according to the first time window and the first time index;
selecting, by the computer, from the second set of segments a second test segment at a second time index and defined by a second time window; and
comparing, by the computer, the similarity scores for the paths of the second test segment against the similarity scores for the paths of the at least one query segment, according to the second time window and the second time index.
16. The method of claim 11, wherein identifying a similar acoustic region further comprises:
identifying, by the computer, a first-level match between a query segment of the first set of audio segments and a test segment of the second set of audio segments, based on determining that a minimum distance value between the similarity scores for the paths of the query segment and the similarity scores for the paths of the test segment satisfies a first-level threshold.
17. The method of claim 16, further comprising identifying, by the computer, a second-level match between the query segment of the first set of audio segments and the test segment of the second set of audio segments, based on determining that a number of first-level matches satisfies a second-level threshold.
18. The method of claim 17, further comprising:
identifying, by the computer, one or more pairwise matches in the first set of audio segments and the second set of audio segments, based on identifying each second-level match between the first set of audio segments and the second set of audio segments,
wherein the similar acoustic region is defined by at least one time index and at least one pairwise match at the at least one time index, and
wherein the computer includes each of the pairwise matches of the similar acoustic region in the clustering.
19. The method of claim 18, further comprising storing, by the computer, into a database, matched data comprising each of the pairwise matches for each similar acoustic region identified by the computer and time indexes corresponding to each of the pairwise matches.
20. The method of claim 16, further comprising:
determining, by the computer, that a number of first-level matches between the query segment and the test segment fails to satisfy a second-level threshold; and
selecting, by the computer, a next test segment of the second set of audio segments to compare against the query segment.
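The two-level matching of claims 16, 17, and 20 can be sketched as a simple counting loop: a first-level match is declared when the minimum distance between the query's and a test segment's path scores satisfies a first threshold, a second-level match once enough first-level matches accumulate, and otherwise the computer moves on to the next test segment. The threshold values and the scalar path scores here are illustrative assumptions.

```python
def min_distance(query_scores, test_scores):
    # Minimum pairwise distance between two segments' path scores (claim 16).
    return min(abs(q - t) for q in query_scores for t in test_scores)

def match_level(query_scores, test_segments, first_threshold=1.0,
                second_threshold=2):
    # Count first-level matches across test segments (claim 16); declare
    # a second-level match once enough accumulate (claim 17); otherwise
    # fall through to the next test segment (claim 20).
    first_level = 0
    for test_scores in test_segments:
        if min_distance(query_scores, test_scores) <= first_threshold:
            first_level += 1
            if first_level >= second_threshold:
                return "second-level match"
    return "no match" if first_level == 0 else "first-level only"

query = [0.0, 5.0]
tests = [[0.5, 9.0], [4.8, 7.0], [20.0, 30.0]]
result = match_level(query, tests)
```

In claim 18's terms, each (query, test) pair that reaches a second-level match becomes a pairwise match, and the set of pairwise matches at their time indexes defines the similar acoustic region fed back into clustering.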