(WO2014021824) SEARCH METHOD
Note: Text based on automatic optical character recognition processes. Only the PDF version has legal value.

CLAIMS

1. A method of generating search results from a data set, the method comprising:

obtaining first search results based on a first query, the search results comprising a plurality of documents;

assigning a weight value to one or more documents of the first search results;

calculating a correlation of terms present in the one or more documents of the search results based at least in part on the assigned weight value; and

obtaining second search results based on a second query, wherein the second query comprises one or more terms having a highest calculated correlation.
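The steps of claim 1 can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the claims do not define how the weight or the correlation is computed, so `assign_weight` (fraction of query terms present) and `correlate_terms` (sum of the weights of the documents containing a term) are assumptions, as are all function names, the whitespace tokenizer, and the choice to build the second query solely from the top-scoring terms.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# A search backend is modelled as any callable mapping query terms to documents.
SearchFn = Callable[[List[str]], List[str]]


def tokenize(text: str) -> List[str]:
    """Lower-case whitespace tokenizer (an assumption; the claims name none)."""
    return text.lower().split()


def assign_weight(doc: str, query: List[str]) -> float:
    """Assumed weight: fraction of query terms that occur in the document."""
    if not query:
        return 0.0
    terms = set(tokenize(doc))
    return sum(1 for q in query if q in terms) / len(query)


def correlate_terms(
    docs: List[str], weights: List[float], query: List[str]
) -> Dict[str, float]:
    """Assumed correlation: sum of the weights of documents containing a term.

    Terms already in the query are skipped so that the second query
    introduces new terms.
    """
    scores: Dict[str, float] = defaultdict(float)
    for doc, w in zip(docs, weights):
        for term in set(tokenize(doc)) - set(query):
            scores[term] += w
    return dict(scores)


def refine_search(
    search: SearchFn, query: List[str], top_n: int = 2
) -> Tuple[List[str], List[str]]:
    """Run the first query, derive a second query, and run it."""
    first_results = search(query)
    weights = [assign_weight(d, query) for d in first_results]
    scores = correlate_terms(first_results, weights, query)
    # Highest score first; ties broken alphabetically for determinism.
    second_query = sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
    return second_query, search(second_query)
```

Passing the search backend in as a callable keeps the sketch independent of whether the results come from a local index or a remote search engine.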

2. The method of claim 1, wherein obtaining the first and second search results comprises obtaining the first and second search results from a remote search engine.

3. The method of claim 1 or claim 2, further comprising assigning a weight value to one or more documents of the second search results, and ranking the second search results based on the assigned weight values.

4. The method of any preceding claim, wherein the first search query comprises one or more search query terms provided by a user.

5. The method of any preceding claim, wherein the first search query comprises personal details of a user initiating the search.

6. The method of any preceding claim, wherein assigning a weight value to one or more documents of the search results further comprises assigning a weight value based on one or more of: a number of search-terms of the query present in the document; a frequency of search-terms present in the document compared to a frequency of search terms in the data set; a position of each search-term in the document; and an author of the document.
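One way the four factors of claim 6 might be combined into a single weight is sketched below. The claim lists the factors but not how they are combined, so the additive formula, the `1 / (1 + position)` bonus, the 1.5 author multiplier, and the names `document_weight` and `trusted_authors` are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Set


def document_weight(
    doc_terms: List[str],
    query_terms: List[str],
    idf: Dict[str, float],
    author: Optional[str] = None,
    trusted_authors: Optional[Set[str]] = None,
) -> float:
    """Combine the claim 6 factors into one score (assumed formula)."""
    if not doc_terms:
        return 0.0
    score = 0.0
    for q in set(query_terms):
        if q not in doc_terms:
            continue
        # Factor 1: each query term present in the document counts once.
        score += 1.0
        # Factor 2: in-document frequency scaled by data-set rarity (IDF).
        tf = doc_terms.count(q) / len(doc_terms)
        score += tf * idf.get(q, 1.0)
        # Factor 3: terms appearing earlier in the document score higher.
        score += 1.0 / (1.0 + doc_terms.index(q))
    # Factor 4: boost documents by authors on a hypothetical trusted list.
    if author is not None and trusted_authors and author in trusted_authors:
        score *= 1.5
    return score
```

Since the claim says "one or more of", any of the four factors could be dropped from the sum without leaving its scope.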

7. The method of any preceding claim further comprising estimating a frequency of each of a plurality of terms in the data set.

8. The method of claim 7, wherein estimating a frequency of each of a plurality of terms in the data set further comprises:

obtaining a first portion of the data set, the portion comprising a plurality of documents;

determining an inverse document frequency (IDF) for each of the plurality of terms in the first portion of the data set; and

estimating an inverse document frequency for each term in the data set based on the determined IDF for each term in the first portion of the data set.
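The estimation in claim 8 can be illustrated with the common definition IDF(t) = log(N / df(t)), where N is the number of documents in the sampled portion and df(t) is the number of those documents containing term t. The sketch assumes the portion is representative of the whole data set, so the sample IDF is used directly as the data-set estimate; the claim does not prescribe this formula or this assumption, and the function name is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def estimate_idf(portion: List[List[str]]) -> Dict[str, float]:
    """Estimate data-set IDF from a sampled portion of tokenized documents."""
    n_docs = len(portion)
    doc_freq: Counter = Counter()
    for doc_terms in portion:
        # Count each term at most once per document.
        doc_freq.update(set(doc_terms))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}
```

Because the ratio N / df(t) does not depend on the absolute sample size, a representative portion gives roughly the same value as the full data set would, which is what makes sampling workable when the data set sits behind a remote search engine.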

9. The method of claim 8, further comprising:

after a predetermined interval, obtaining a further portion of the data set, the further portion comprising a plurality of documents including at least some documents not present in the first portion of the data set;

determining an inverse document frequency (IDF) for each of the plurality of terms in the further portion of the data set; and

estimating an inverse document frequency for each term in the data set based on the previously estimated IDF and on the determined IDF for each term in the further portion of the data set.
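The update step of claim 9 could, for example, blend the previous estimate with the IDF determined from the further portion. The exponential blend below, the `alpha` parameter, and the function name are assumptions; the claim only requires that the new estimate depend on both inputs.

```python
from typing import Dict


def update_idf(
    previous: Dict[str, float], fresh: Dict[str, float], alpha: float = 0.5
) -> Dict[str, float]:
    """Blend a previous IDF estimate with one from a newly sampled portion.

    alpha is the weight given to the fresh estimate. Terms seen in only
    one of the two inputs keep the single value that is available.
    """
    merged: Dict[str, float] = {}
    for term in previous.keys() | fresh.keys():
        if term in previous and term in fresh:
            merged[term] = (1.0 - alpha) * previous[term] + alpha * fresh[term]
        elif term in previous:
            merged[term] = previous[term]
        else:
            merged[term] = fresh[term]
    return merged
```

A larger `alpha` makes the estimate follow recent samples more closely, which may suit a data set that changes quickly.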

10. The method of claim 9, further comprising determining a length of the predetermined interval based on an update rate of the data set.

11. The method of any preceding claim further comprising identifying a portion of the first search results having the highest assigned weight values to generate first filtered search results, wherein said calculating a correlation of terms is performed for documents of the first filtered search results.
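The filtering of claim 11 amounts to keeping only the highest-weighted documents before the correlation step. A minimal sketch follows, in which the cut-off `keep` and the function name are hypothetical; the claim does not say how large the retained portion is.

```python
from typing import List, Sequence


def filter_top_weighted(
    docs: Sequence[str], weights: Sequence[float], keep: int
) -> List[str]:
    """Return the `keep` documents with the highest assigned weights."""
    order = sorted(range(len(docs)), key=lambda i: weights[i], reverse=True)
    return [docs[i] for i in order[:keep]]
```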

12. A system comprising:

a processor; and

a memory comprising instructions configured when executed on the processor to cause the system to:

obtain first search results based on a first query, the search results comprising a plurality of documents;

assign a weight value to one or more documents of the first search results;

calculate a correlation of terms present in the one or more documents of the search results based at least in part on the assigned weight value; and

obtain second search results based on a second query, wherein the second query comprises one or more terms present in the one or more documents having a highest calculated correlation.

13. The system of claim 12, further comprising a network interface and wherein the instructions are further configured when executed on the processor to cause the system to obtain the first and second search results via the network interface.

14. The system of claim 12 or claim 13, further comprising a network interface and wherein the instructions are further configured when executed on the processor to cause the system to assign a weight value to one or more documents of the second search results, and rank the second search results based on the assigned weight values.

15. A computer program product comprising computer program code adapted, when executed on a processor, to perform the steps of any of claims 1 to 11.