Graph Mining: Link Prediction. Learn by finding answers to the following questions.
What is Link Prediction in Graph Mining?
How can Link Prediction help, and where? Can it help with social networks, research co-authorship, or the spread of disease?
What are the uses of Link Prediction in epidemiology or other medical/biological applications?
What are the uses of Link Prediction in social networks?
What are the uses of Link Prediction in co-authorship networks?
What are the steps in Link Prediction? That is, define the link prediction problem.
Define the Link Prediction problem formally, using graph notation and graph representations.
How is node similarity related to Link Prediction?
What are possible Node Similarity algorithms that can be used for the purpose?
Which node pairs are the most likely new links? Think in terms of the Node Similarity Index (highest, lowest).
What are the types of Node Similarity? Node proximity?
What is Node Similarity based upon? i.e. Network Topology
What is Local Structure for Node Similarity?
What is Global Structure for Node Similarity?
What are examples of Local Structure for Node Similarity?
What are examples of Global Structure for Node Similarity?
Define Node Neighborhoods. Is it local or global structure?
What is the Preferential Attachment Index? Is it local or global structure?
What are the types of Node Neighborhoods?
If two nodes do not have any common neighbor, what is their probable similarity (from the Node Neighborhood perspective)?
What is Common Neighbors? Is it local or global structure? Is it Node Neighborhood or Preferential Attachment, and why?
What is the Jaccard Coefficient? Is it local or global structure? Is it Node Neighborhood or Preferential Attachment, and why?
What is Adamic-Adar? Is it local or global structure? Is it Node Neighborhood or Preferential Attachment, and why?
What is the name of the Node Similarity approach that sums the inverse logarithmic degree centrality of the neighbors shared by the two nodes?
What is the name of the Node Similarity approach that uses the ratio of the size of the set intersection to the size of the set union?
What is the name of the Node Similarity approach that uses the count of common neighbors?
If the similarity score of a node pair is the product of the two nodes' outgoing edge counts, and the highest-scoring pairs are assumed to form new links, what is this approach called?
When similarity scores are calculated from the global link structure of the graph, is this called local structure or global structure?
What is an example of Global Structure?
What are the examples of Global Structure?
What is the Katz Index for Global Structure?
What is Simrank for Global Structure?
When Node Similarity is calculated as the sum of the counts of all paths between a node pair, what is this approach called? How is the link prediction then made?
When Node Similarity is defined as "two nodes are similar if they are referred to by similar nodes", what is this approach called?
How can you measure whether your implemented link prediction algorithm performs well?
Can you use the train/test concept for the measurement? Can you define the steps and the problem formally, with graph notation?
What are some measures to calculate in the train/test approach?
Answers:
What is the name of the Node Similarity approach that sums the inverse logarithmic degree centrality of the neighbors shared by the two nodes?
Ans: Adamic-Adar
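A minimal sketch of the Adamic-Adar score as described in the question above, assuming an undirected graph stored as a dictionary of neighbor sets. The toy graph and the function name adamic_adar are made up for illustration.

```python
import math

# Hypothetical toy graph (undirected): node -> set of neighbors.
graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b", "e"},
    "d": {"a", "b"},
    "e": {"c"},
}

def adamic_adar(graph, x, y):
    """Sum of 1 / log(degree(z)) over every neighbor z shared by x and y."""
    shared = graph[x] & graph[y]
    # In a simple undirected graph a shared neighbor has degree >= 2,
    # so log(degree) is strictly positive.
    return sum(1.0 / math.log(len(graph[z])) for z in shared)

# "a" and "b" share "c" (degree 3) and "d" (degree 2).
score = adamic_adar(graph, "a", "b")
print(round(score, 4))
```

Shared neighbors with a low degree contribute more to the score than hubs, which is what separates this index from a plain count of common neighbors.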
What is the name of the Node Similarity approach that uses the ratio of the size of the set intersection to the size of the set union?
Ans: Jaccard Coefficient
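A small sketch of the Jaccard Coefficient on neighbor sets, assuming the same dictionary-of-sets graph representation. The toy graph and the name jaccard are illustrative, and returning 0.0 for two isolated nodes is a convention chosen here, not something the notes specify.

```python
# Hypothetical toy graph (undirected): node -> set of neighbors.
graph = {
    "a": {"c", "d"},
    "b": {"c", "e"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"b"},
}

def jaccard(graph, x, y):
    """Size of the neighbor-set intersection divided by size of the union."""
    union = graph[x] | graph[y]
    if not union:
        # Assumed convention: two nodes with no neighbors get similarity 0.
        return 0.0
    return len(graph[x] & graph[y]) / len(union)

# Neighbors of "a": {c, d}; neighbors of "b": {c, e}.
# Intersection has 1 node, union has 3 nodes.
score = jaccard(graph, "a", "b")
print(score)
```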
What is the name of the Node Similarity approach that uses the count of common neighbors?
Ans: Common Neighbors
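The sketch below counts common neighbors and then ranks the currently unconnected node pairs by that count, so the highest-scoring pair is the most likely new link. The toy graph and names are hypothetical, and the graph is assumed to be undirected.

```python
from itertools import combinations

# Hypothetical toy graph (undirected): node -> set of neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

def common_neighbors(graph, x, y):
    """Number of nodes adjacent to both x and y."""
    return len(graph[x] & graph[y])

# Candidate links are the node pairs that are not connected yet.
candidates = [
    (x, y) for x, y in combinations(sorted(graph), 2) if y not in graph[x]
]

# Highest similarity first: these pairs are the predicted new links.
ranked = sorted(
    candidates, key=lambda pair: common_neighbors(graph, *pair), reverse=True
)
for x, y in ranked:
    print(x, y, common_neighbors(graph, x, y))
```

A pair with no common neighbor, such as ("a", "e") here, scores 0 and lands at the bottom of the ranking.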
If the similarity score of a node pair is the product of the two nodes' outgoing edge counts, and the highest-scoring pairs are assumed to form new links, what is this approach called?
Ans: Preferential Attachment Index
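A minimal sketch of the Preferential Attachment Index, assuming an undirected graph where the edge count of a node is simply the size of its neighbor set. The toy graph and the function name are illustrative only.

```python
# Hypothetical toy graph (undirected): node -> set of neighbors.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

def preferential_attachment(graph, x, y):
    """Product of the degrees (edge counts) of the two nodes."""
    return len(graph[x]) * len(graph[y])

# degree("a") = 2 and degree("d") = 3, so the score is 6.
score_ad = preferential_attachment(graph, "a", "d")
# degree("a") = 2 and degree("e") = 1, so the score is 2.
score_ae = preferential_attachment(graph, "a", "e")
print(score_ad, score_ae)
```

Unlike the neighborhood-based indices, this score ignores whether the two nodes share any neighbor; only their individual degrees matter.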
What is an example of Global Structure?
Ans: Path Length > 2
What are the examples of Global Structure?
Ans: Shortest Paths (use the inverse of the distance as the similarity), Katz Index, SimRank
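As a rough illustration of a global measure, the sketch below computes a truncated Katz Index: path counts of each length are weighted by beta raised to that length and summed. The values of beta and max_len, the helper names, and the cut-off at a maximum length are all assumptions for this example; the full index sums over all lengths.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def truncated_katz(adj, beta=0.1, max_len=3):
    """Sum of beta**l * (A**l) for l = 1..max_len (assumed truncation)."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = adj  # A**1
    for length in range(1, max_len + 1):
        for i in range(n):
            for j in range(n):
                # power[i][j] counts walks of this length from i to j
                # (walks may revisit nodes).
                scores[i][j] += (beta ** length) * power[i][j]
        power = mat_mul(power, adj)
    return scores

# Hypothetical path graph 0 - 1 - 2 - 3 as an adjacency matrix.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

katz = truncated_katz(adj, beta=0.1, max_len=3)
# Nodes 0 and 3 are only linked by one walk of length 3.
print(katz[0][3])
```

Because beta is smaller than 1, longer routes between two nodes contribute less, so nearby pairs score higher than distant ones.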
What are some measures to calculate in the train/test approach?
Ans: Accuracy, F1-score, sensitivity (recall), and most other metrics that work for classification
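The train/test idea can be sketched as follows: hide some known edges, score the unconnected pairs on the remaining (training) graph, predict the top-k pairs, and compare them with the hidden edges. The toy edges, the choice of Common Neighbors as the scorer, and k = 2 are assumptions made for this example.

```python
from itertools import combinations

def build_adjacency(nodes, edges):
    """Undirected adjacency: node -> set of neighbors."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

nodes = ["a", "b", "c", "d", "e"]
# Hypothetical split: edges are stored as (smaller, larger) node pairs.
train_edges = {("a", "b"), ("a", "c"), ("b", "c"),
               ("b", "d"), ("c", "d"), ("d", "e")}
test_edges = {("a", "d")}  # held-out edge the predictor should recover

train = build_adjacency(nodes, train_edges)

# Score every pair that is unconnected in the training graph.
candidates = [(u, v) for u, v in combinations(nodes, 2) if v not in train[u]]
ranked = sorted(
    candidates, key=lambda p: len(train[p[0]] & train[p[1]]), reverse=True
)

k = 2  # assumed number of links to predict
predicted = set(ranked[:k])

true_positives = len(predicted & test_edges)
precision = true_positives / len(predicted)
recall = true_positives / len(test_edges)  # also called sensitivity
f1 = (2 * precision * recall / (precision + recall)) if true_positives else 0.0
print(precision, recall, round(f1, 4))
```

In practice the held-out edges would be sampled at random and the scoring function swapped for whichever similarity index is being evaluated.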
Resources
Link Prediction
https://paperswithcode.com/task/link-prediction
Similarity Index based Link Prediction Algorithms in Social Networks: A Survey
https://pdfs.semanticscholar.org/8e72/fa77f3d788f3c67da1e1c6347c3aaf280723.pdf
Proximity-based Methods for Link Prediction
https://cran.r-project.org/web/packages/linkprediction/vignettes/proxfun.html
Evaluating Link Prediction Methods
https://arxiv.org/pdf/1505.04094.pdf
Link Prediction Algorithm
http://be.amazd.com/link-prediction/
Evaluating link prediction methods
https://www3.nd.edu/~dial/publications/yang2015evaluating.pdf