(WO2019089389) SYSTEMS AND METHODS FOR PRIORITIZING SOFTWARE VULNERABILITIES FOR PATCHING
Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

CLAIMS

What is claimed is:

1. A method for assessing a likelihood of exploitation of software vulnerabilities, comprising:

utilizing a processor in operable communication with at least one memory for storing instructions that are executed by the processor to perform operations of:

accessing a plurality of datasets associated with a predetermined set of data sources;

accessing training data comprising features and class labels associated with the features from the plurality of datasets;

applying learning algorithms to the training data to generate classification models that are configured to predict class labels defining a likelihood of exploitation of respective software vulnerabilities;

accessing one or more features associated with a software vulnerability; and

computing, by applying the one or more features to the classification model, a class label defining one or more values defining a likelihood of exploitation associated with the software vulnerability.

2. The method of claim 1, further comprising generating a plurality of estimation outputs based on the one or more values to derive an overall quantitative score.

3. The method of claim 1, wherein the plurality of datasets include vulnerability data for vulnerabilities that are publicly disclosed and obtaining exploits data for exploits that were used in real world attacks.

4. The method of claim 3, further comprising:

aligning the exploits data with the vulnerability data; and

cleaning the exploits data of noise and predetermined portions of the exploits data that is irrelevant to associated software vulnerabilities.

5. The method of claim 1, wherein certain features correspond to a known vulnerability obtained from the plurality of datasets.

6. The method of claim 1, further comprising testing the classification models by applying additional training data and one or more algorithms and evaluation metrics to optimize the classification models until the classification models compute the likelihood of exploitation according to a predefined error rate.

7. The method of claim 1, further comprising vectorizing text features derived from the plurality of datasets using term frequency-inverse document frequency to create a vocabulary of associated words.

8. The method of claim 1, further comprising:

sorting vulnerabilities associated with the plurality of datasets according to time;

training the classification model using the training data, the training data defining a first subset of the plurality of datasets associated with a predetermined period of time; and

testing the classification model using a second subset of the plurality of datasets associated with the predetermined period of time.

9. The method of claim 1, further comprising computing mutual information from the plurality of datasets informative as to what information a given feature provides about another feature.

10. A computer-readable medium comprising instructions that cause a programmable processor to:

generate a learned function referencing features associated with a plurality of datasets defining software vulnerabilities and at least one machine learning algorithm; and

evaluate accuracy of the learned function by applying a portion of the plurality of datasets associated with software vulnerabilities to the learned function.

11. The computer-readable medium of claim 10 comprising additional instructions that cause the programmable processor to:

implement a random forest as part of the at least one machine learning algorithm that combines bagging for each tree with random feature selection at each node to split data utilized by the random forest, such that a result of implementing the random forest is an ensemble of decision trees each having their own independent opinion on class labels for a given disclosed vulnerability.

12. The computer-readable medium of claim 10 comprising additional instructions that cause the programmable processor to:

detect, from the plurality of datasets, vulnerabilities that appear before an associated exploitation date.

13. The computer-readable medium of claim 10 comprising additional instructions that cause the programmable processor to:

access features from the plurality of datasets that contain measures computed from social connections of users posting hacking-related content.

14. The computer-readable medium of claim 13 comprising additional instructions that cause the programmable processor to:

access features from the plurality of datasets that measure a centrality of the users in a social graph.

15. A computing device, configured via machine learning to apply a learned function derived from at least one machine learning algorithm and a plurality of datasets associated with software vulnerabilities to data associated with a software vulnerability to estimate a likelihood of exploitation of the software vulnerability.
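
Illustrative sketch (not part of the claims). The steps recited in claims 7 to 9 (term frequency-inverse document frequency vectorization, a time-ordered split into training and testing subsets, and mutual information between features) can be pictured with a small standard-library Python example. Everything here is an assumption made for illustration: the records are invented, the names RECORDS, tfidf_vectors, temporal_split and mutual_information are hypothetical, and the unsmoothed tf * log(N / df) weighting is one common variant that the application does not necessarily use.

```python
import math
from collections import Counter
from datetime import date

# Hypothetical records: (disclosure date, description text, exploited label).
RECORDS = [
    (date(2016, 3, 1), "buffer overflow allows remote code execution", 1),
    (date(2016, 1, 5), "cross site scripting in admin panel", 0),
    (date(2016, 2, 10), "remote code execution via crafted packet", 1),
    (date(2016, 4, 20), "denial of service in admin panel", 0),
]


def tfidf_vectors(docs):
    """Build a vocabulary and one tf * log(N / df) vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append(
            [(counts[t] / len(toks)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def temporal_split(records, train_fraction=0.75):
    """Sort records by date; earlier ones train, later ones test."""
    ordered = sorted(records, key=lambda r: r[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


train, test = temporal_split(RECORDS)
vocab, vectors = tfidf_vectors([text for _, text, _ in train])

# Binary word-presence feature compared against the class label.
has_remote = [int("remote" in text.split()) for _, text, _ in RECORDS]
labels = [label for _, _, label in RECORDS]
mi = mutual_information(has_remote, labels)
print(len(vocab), len(train), len(test), mi)
```

The vocabulary is built from the training subset only, so that no words from later-disclosed vulnerabilities leak into the features used for training. The mutual information call compares a word-presence feature with the class label as an example; any two discrete feature columns could be passed instead.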