
A METHOD TO PREDICT AND RANK NEW EDGES IN A NETWORK

BACKGROUND OF THE INVENTION

The present invention relates to networks. More particularly, the present invention relates to a method for predicting edges in a network. Even more particularly, the present invention relates to a method for ranking new edges in a network. The present invention thus relates to computer science and, more particularly, to techniques for analyzing networks such as social interaction networks and the like.

A network may be defined as a collection of connected objects. The objects are referred to as nodes or vertices, and are usually represented in graphical notation as points. The connections between the nodes are referred to as edges and are usually drawn as lines between points.

Predicting edges in a network using graph theory is a known topic in the field of computer science. Graph theory is the study of graphs, mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices (or nodes or points) and edges (or arcs or lines) that connect them. A graph may be undirected, meaning that there is no distinction between the two vertices associated with each edge, or its edges may be directed, with a preferred direction from one node to another.

Recently there has been increasing interest in using graph theory to predict edges in networks. The increasing interest is motivated by different applications, such as internet page ordering in response to queries and link prediction in social interaction networks.

Many methods for predicting new edges in a network rely on the similarity between the nodes of a network, where node similarity can be defined in many ways. A limitation here is that these methods do not necessarily take advantage of the topology of the network. Other methods use the network topology and different forms of accounting for and using edge weights. An example is random walk methods that find paths by starting at a specific node and moving around until a target node is reached. A limitation of these methods is that the information utilized in the different algorithms is still limited with respect to how the traversal of the network, the weights of edges and the final score of a predicted edge are combined.

When new edges in a network are predicted, ranking a predicted edge relative to other predicted edges from the standpoint of accuracy is important, as predicting all or
most of the edges could lead to a large list of candidates. Having a rank, and consequently an order, in this list is a practical requirement in many applications. Depending on how the methods define their way of finding new edges, a score they produce can be used to rank the predictions.

SUMMARY OF THE INVENTION

The present invention is directed to a method for predicting new edges in a network. More particularly, the present invention aims to provide a method for determining the likelihood that an edge will arise between two nodes that are currently not associated directly with one another, that is, between two nodes that do not currently define an edge of the network.

The present invention provides a method that assigns a confidence value to each potential new edge. Each confidence value or score measures the degree of potentially meaningful connection between two nodes that do not currently define an edge of the network. The confidence value or score for two such nodes s and t is derived by determining the acyclic paths that exist between the two nodes s and t in the network at the time of the measurement or ranking, where network nodes are linked by edges that have assigned weights that represent the degree of relationship between the nodes, such as, but not limited to, degree of similarity, number of shared features, and so on. The confidence score assigned to a candidate edge is determined by using a function F, set forth below, that combines the weights of the paths between the nodes s and t. The analysis of paths enables the method to account for the effects of longer path lengths by ascribing a valuation that decreases with increasing path length. An acyclic path is one in which any node appears only once along the path.

In one embodiment, the invention provides a method for determining confidence scores for new edges in a network by finding all acyclic paths between any two nodes s and t in the network that are not currently connected by an edge, where network nodes are linked by edges that have assigned weights that represent the degree of relationship between the nodes, such as, but not limited to, degree of similarity, number of shared features, and so on, where the candidate edge is assigned a confidence score that the edge represents a meaningful
connection between s and t, and where this score is determined using a function F, part of the invention, that combines the weights of all paths while accounting for the effect of path length through an increasing penalty for longer paths.

In another embodiment, the invention ranks, for every node s from a subset S of network nodes, the predicted edges for every node t from a subset T of network nodes, or at least for any two nodes s and t that are not currently connected by an edge, where ranking is based on the confidence score derived from all paths connecting s and t, and where the confidence of a correct edge prediction is higher if the magnitude of the confidence score is higher.

BRIEF DESCRIPTION OF THE DRAWING

The sole figure of the drawing is a flowchart showing the steps of the present method.

DETAILED DESCRIPTION

Consider a set of nodes S and a set of nodes T. A graph is formed in which the nodes of S and T are the vertices. Note that the nodes from the S and T sets need not be mutually exclusive. The degree of similarity between two nodes is a score that can be defined in various ways; if there is any such similarity between the nodes, they are connected by an edge, which is assigned a weight represented by the similarity score. A known link between any two nodes is likewise represented by an edge between them, with an assigned weight. A known link here represents a known direct relationship between the nodes. For example, in an interaction network, this would correspond to a confirmed interaction between the two nodes. Such a network representation allows our method to gain more information by combining the similarity information and the topology of the network.

As shown in the figure, the present method includes four steps. In step S1, one selects a set S of nodes and a set T of nodes. In step S2, one finds the paths from every node s from S to every node t in T, or at least from every s to every node t between which there is no edge. In step S3, one determines the confidence score of each potential edge based on the scoring equation. In step S4, for each node s from S, one ranks all potential edges (s, t) based on the confidence score, where t is from T.

The present method contemplates the traversal of all paths between any node s from S and any node t from T, particularly any node t that is currently not connected to node s. This is accomplished by using a modification
of depth-first search that keeps track of the visited nodes of any traversal. For example, when implementing the search as a recursive function that traverses the graph by moving along the edges, nodes are marked as they are visited in the current path, and then the marks are removed just before returning from the recursive call. This way one ensures that no nodes are visited more than once in a path. For the application in this study we considered paths up to a selected length that is a positive integer; for example, the paths up to length three cannot have more than three edges. This constraint also significantly decreases the time required to find such paths.

The assumption behind the present method is that a node from S and a node from T have a higher probability to mutually directly link if there are more paths connecting them, since the paths represent the confidence of meaningful links between the nodes. As the paths can vary in length, it is contemplated in the present method that longer paths should carry less confidence for a link to occur between the end nodes of the path. Thus, we introduce a decay function F that gives less support for paths as the length of the path increases. The scoring equation defines how we calculate the confidence score from these different paths for a selected pair of nodes s from S and t from T:

[equation not recoverable from the source text]

Here P is the set of the paths connecting a specific node s from S and a node t from T, and the remaining terms are the weight of the edge j along an individual path and the number of edges in that path. Function g can be any increasing function; in our example we used a form governed by a parameter that can be assigned different values for each network to obtain better prediction performance.

Our method is unique in that it uses the similarities between nodes from S and between nodes from T along with the topology of the graph to leverage more information about potential node connections. Moreover, in most of the studies utilizing the network structure, all paths contribute equally to the confidence score, while we apply a decreasing function F so that longer paths get lower total weight.

EXAMPLE

To test the method, we applied it to the problem of predicting interactions of drugs with their targets. We intend to predict new interactions in a network and then
rank those new predicted interactions. Although our method can be used with any network, we use this as an example to illustrate its performance and demonstrate its utility for practical applications. We used a dataset that represents enzymes, which are a major family of drug targets. The dataset consists of 445 drugs, the enzymes that interact with these drugs, and the known interactions between these drugs and these enzymes. Similarity between drugs was computed using a chemical structure comparison tool, while the similarity between enzymes was computed using a normalized version of a sequence comparison algorithm. In this example we set S as the set of drugs and set T as the set of enzymes.

We evaluated our method on this example dataset based on leave-one-out cross validation (LOOCV), where we remove a known link between a drug and its interacting enzyme and then try to predict it. We then apply our method starting from that drug to all enzymes from set T to get the list of candidates and then rank all links based on the scores calculated from the scoring equation, where we try to see if we can predict the removed link within the several top ranked predictions (Top 1, Top 2 and Top 5). We show that our method achieves very good recognition of known interactions, as presented in the table.

Table: Percentage of correct predictions of known interactions between drugs and enzymes, using LOOCV over the dataset. Columns: Dataset, Top 1, Top 2, Top 5 (values not recoverable from the source text).

REFERENCES

Statistical relational learning for link prediction. Workshop on Learning Statistical Models from Relational Data.

The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology.

New perspectives and methods in link prediction. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications.

Drug target predictions based on heterogeneous graph inference. Pac Symp Biocomput.

Prediction of drug-target interaction networks from the integration of chemical and genomic spaces.

Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. Journal of the American Chemical Society.

Identification of Common Molecular Subsequences. Journal of Molecular Biology.
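The path search and ranking steps described in the detailed description can be illustrated with a short sketch. It is not the patented implementation. The scoring equation is not legible in this text, so the sketch assumes one form that fits the surviving definitions: each path contributes the product of its edge weights raised to the power g(l), with g(l) = alpha * l, where l is the number of edges in the path. With weights assumed to lie between 0 and 1, this lets longer paths contribute less. The function names, the parameter `alpha`, the default `max_len` and the toy graph are all hypothetical.

```python
from math import prod


def undirected(edges):
    """Build an adjacency dict {node: {neighbor: weight}} from (a, b, w) triples."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def simple_paths(graph, s, t, max_len):
    """Yield every path from s to t with at most max_len edges and no repeated node."""
    path = [s]
    visited = {s}

    def dfs(node):
        if len(path) - 1 >= max_len:       # path cannot be extended further
            return
        for nxt in graph.get(node, {}):
            if nxt in visited:
                continue
            if nxt == t:
                yield path + [t]
                continue
            visited.add(nxt)               # mark on the way down
            path.append(nxt)
            yield from dfs(nxt)
            path.pop()                     # unmark before returning
            visited.remove(nxt)

    yield from dfs(s)


def confidence(graph, s, t, max_len=3, alpha=1.0):
    """Assumed score: sum over paths of (product of weights) ** (alpha * path length)."""
    total = 0.0
    for p in simple_paths(graph, s, t, max_len):
        weights = [graph[a][b] for a, b in zip(p, p[1:])]
        total += prod(weights) ** (alpha * len(weights))
    return total


def rank_candidates(graph, S, T, max_len=3, alpha=1.0):
    """For each s in S, rank the nodes t in T that are not already linked to s."""
    ranking = {}
    for s in S:
        scored = [(t, confidence(graph, s, t, max_len, alpha))
                  for t in T if t != s and t not in graph.get(s, {})]
        ranking[s] = sorted(scored, key=lambda item: item[1], reverse=True)
    return ranking


# Hypothetical toy network: two drugs (d1, d2) and two targets (t1, t2).
toy = undirected([("d1", "d2", 0.5), ("d2", "t1", 1.0),
                  ("d1", "t2", 1.0), ("t1", "t2", 0.5)])
print(rank_candidates(toy, ["d1"], ["t1", "t2"]))
```

In this toy graph, `t2` is already linked to `d1`, so only `t1` is scored as a candidate. Raising `alpha` strengthens the penalty on longer paths, and lowering `max_len` shortens the search.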