If you are interested, you can enroll in the course. Please open this page for details.
Graph Theory Introduction, Graph Mining Introduction, Network Properties, Random Graphs, Small World Graphs, Node Importance, Node Similarity, Clustering & Community Detection, Link Prediction, Anomaly detection, Time Evolving Graphs, Influence/Virus Propagation, Graph Mining Use Cases, Big Data Graph Databases, Big Data Graph Processing, GraphX, Neo4j, Feature Based Classification, Graph Classification, Graph Kernels, Label Propagation
Chat with each other to discuss the course topics, every day at 22:00 EST. Discuss only relevant topics. All chat messages, including past ones, are visible to other participants.
Focus on the course topics and related matters; be respectful and collaborative.
Assignments: three (3) at 15% each = 45%; Project: 30%; Class Participation: 15% (Attendance and Participation: 7%; Quizzes and Reflection: 8%); Final Review Exam: 10%. (Subject to revision over time.)
May include: primarily quizzes, a reflection for each topic, small code tests, short questions and answers, and tests on how to apply the concepts. Details will be provided at assessment time.
To earn a certificate, participants/students must demonstrate their knowledge and skills; multiple attempts will be allowed. If we issue a certificate, we will make sure you are qualified to receive it, and you will be given multiple opportunities to prove yourself, though the question set will be different each time.
Quiz: Pre-test (at the start of the class). Future work.
Quiz: Post-test (at the end of the class). Future work.
Your learning and your relation to today's topics: Write down in a document what you learned from today's class. What were the most interesting/useful/important topics today (in your judgement)? What psychomotor skill did you achieve today? How will you relate today's topics to the real world, i.e. industry and research? What ethical and legal considerations apply to today's topics? How can you relate your past experience to these topics, and where can you or do you want to take them? Any other relevant thoughts? What more could be related to these topics in the class?
https://link.springer.com/article/10.1007/s41019-019-00105-0
Graphs are everywhere. https://pdfs.semanticscholar.org/144b/130323cb94d618c2c5e66982a56f31d36396.pdf
https://hpi.de/fileadmin/user_upload/fachgebiete/mueller/courses/graphmining/GraphMining-13a-AnomalyDetection.pdf
Graph based Anomaly Detection and Description: A Survey
Some earlier topics relate to this topic. https://hpi.de/fileadmin/user_upload/fachgebiete/mueller/courses/graphmining/GraphMining-02-Social-Network-Analysis.pdf
https://hpi.de/fileadmin/user_upload/fachgebiete/mueller/courses/graphmining/GraphMining-04-FrequentSubgraph.pdf
You will find many datasets at the URL. You can apply your implementation to a dataset/graph of your choice, but make sure its properties (directed/undirected, weighted/unweighted, connected/not connected, and others) match the question.
https://networkx.github.io/documentation/latest/_downloads/networkx_reference.pdf
https://networkx.github.io/documentation/stable/reference/introduction.html
Click the above link for details: BFS, DFS, single-source shortest path, all-pairs shortest path, Karger's algorithm, min-cut algorithm, clique algorithms, Dijkstra's shortest path implementation (for shortest paths or the longest paths)
More will be added later; any algorithm that you come across can be an assignment as well. You might study and implement the most important algorithms that are widely used in practice and well known (or that solve important problems).
Assignments on demonstrating the ability to use the graph algorithms provided in the NetworkX library. For example, find the library methods/algorithms corresponding to those mentioned in list 1, apply them to the same datasets, and check whether you get the same results.
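As a minimal sketch of that comparison, the example below writes BFS hop distances by hand and checks them against NetworkX. The function name bfs_distances and the choice of the built-in karate club graph as a stand-in dataset are assumptions made for illustration; they are not part of the assignment text.

```python
from collections import deque

import networkx as nx


def bfs_distances(graph, source):
    """Hand-written BFS: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.neighbors(node):
            if neighbor not in dist:  # first visit is the shortest hop count
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Stand-in dataset (assumption): a small graph that ships with NetworkX.
G = nx.karate_club_graph()

mine = bfs_distances(G, 0)
reference = dict(nx.single_source_shortest_path_length(G, 0))

print("same result as NetworkX:", mine == reference)
```

The same pattern (own implementation, library call, equality check) can be repeated for the other algorithms in the list.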
In short: implement spanning tree and highly connected subgraphs.
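For the spanning tree part, one possible starting point is Kruskal's algorithm with a simple union-find, checked against NetworkX's minimum_spanning_tree. This is only a sketch: the function name kruskal_mst and the toy weighted graph are hypothetical, and highly connected subgraphs are not covered here.

```python
import networkx as nx


def kruskal_mst(graph):
    """Return minimum spanning tree edges as (u, v, weight) tuples."""
    parent = {n: n for n in graph.nodes}

    def find(n):
        # Follow parent pointers to the set representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    # Missing weights default to 1 (an assumption for unweighted graphs).
    edges = sorted(graph.edges(data="weight", default=1), key=lambda e: e[2])
    for u, v, w in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two different components
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree


# Toy graph (hypothetical data).
G = nx.Graph()
G.add_weighted_edges_from(
    [("a", "b", 4), ("a", "c", 1), ("b", "c", 2), ("b", "d", 5), ("c", "d", 8)]
)

tree = kruskal_mst(G)
my_weight = sum(w for _, _, w in tree)
nx_weight = nx.minimum_spanning_tree(G).size(weight="weight")
print(tree, my_weight, nx_weight)
```

On a disconnected graph this sketch returns a spanning forest rather than a single tree.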
Given a graph (use a small graph or dataset first, then apply to large graphs), write Python or R code to identify the isolated nodes in the graph, count the bi-directional edges, identify the top 10 vertices by in-degree, identify the top 100 vertices by out-degree, count the number of cliques, and identify the number of disconnected subgraphs.
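A small sketch of these measurements with NetworkX follows. The toy directed graph is hypothetical, and two interpretations are assumptions: "cliques" is taken to mean maximal cliques of the undirected version of the graph, and "disconnected subgraphs" is taken to mean weakly connected components.

```python
import networkx as nx

# Toy directed graph (hypothetical data); node 7 has no edges at all.
G = nx.DiGraph()
G.add_edges_from([(1, 2), (2, 1), (2, 3), (3, 4), (4, 2), (5, 6)])
G.add_node(7)

# Isolated nodes: no incoming and no outgoing edges.
isolated = list(nx.isolates(G))

# Bi-directional edges: each pair u<->v is seen twice, so halve the count.
bidirectional_pairs = (
    sum(1 for u, v in G.edges if u != v and G.has_edge(v, u)) // 2
)

# Top vertices by in-degree and out-degree, as (node, degree) pairs.
top_in = sorted(G.in_degree, key=lambda pair: pair[1], reverse=True)[:10]
top_out = sorted(G.out_degree, key=lambda pair: pair[1], reverse=True)[:100]

# Maximal cliques are defined on undirected graphs, so convert first.
cliques = list(nx.find_cliques(G.to_undirected()))

# Disconnected pieces, ignoring edge direction.
num_components = nx.number_weakly_connected_components(G)

print(isolated, bidirectional_pairs, top_in[0], top_out[0])
print(len(cliques), num_components)
```

On a large graph, enumerating all maximal cliques can be very expensive, so it is worth trying the small graph first, as the assignment suggests.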
For a graph/graph dataset such as political blogs (see the example dataset section), implement vertex betweenness, edge betweenness, and closeness centrality.
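When you write these measures yourself, NetworkX's built-in versions can serve as reference values to check against. The sketch below assumes the built-in karate club graph as a stand-in for the political blogs dataset, which would have to be loaded separately.

```python
import networkx as nx

# Stand-in dataset (assumption); replace with the political blogs graph.
G = nx.karate_club_graph()

vertex_bt = nx.betweenness_centrality(G)     # node -> normalized score
edge_bt = nx.edge_betweenness_centrality(G)  # (u, v) -> normalized score
closeness = nx.closeness_centrality(G)       # node -> score

top_vertex = max(vertex_bt, key=vertex_bt.get)
top_edge = max(edge_bt, key=edge_bt.get)
top_close = max(closeness, key=closeness.get)

print("highest vertex betweenness:", top_vertex)
print("highest edge betweenness:", top_edge)
print("highest closeness:", top_close)
```

Scores from a hand-written implementation should be compared with a small tolerance, and with the same normalization, rather than with exact equality.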
Implement page-ranking algorithms such as HITS (hubs and authorities).
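A minimal sketch of the HITS iteration, assuming a fixed number of rounds and L2 normalization after each update, is shown below. NetworkX's own nx.hits normalizes differently, so scores will differ in scale. The function name hits_scores and the toy graph are hypothetical.

```python
import math

import networkx as nx


def hits_scores(graph, iterations=50):
    """Return (hubs, authorities) dicts after a fixed number of HITS rounds."""
    hubs = {n: 1.0 for n in graph}
    auths = {n: 1.0 for n in graph}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes linking to n.
        auths = {n: sum(hubs[p] for p in graph.predecessors(n)) for n in graph}
        norm = math.sqrt(sum(a * a for a in auths.values())) or 1.0
        auths = {n: a / norm for n, a in auths.items()}
        # Hub score: sum of authority scores of the nodes n links to.
        hubs = {n: sum(auths[s] for s in graph.successors(n)) for n in graph}
        norm = math.sqrt(sum(h * h for h in hubs.values())) or 1.0
        hubs = {n: h / norm for n, h in hubs.items()}
    return hubs, auths


# Toy directed graph (hypothetical data).
G = nx.DiGraph([("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")])

hubs, auths = hits_scores(G)
best_hub = max(hubs, key=hubs.get)
best_auth = max(auths, key=auths.get)
print("best hub:", best_hub, "best authority:", best_auth)
```

A convergence check on the change between rounds could replace the fixed iteration count.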
Part 4: Case studies
Click above and check the data science competitions to get project ideas. For example, can you map the current dengue spread onto a graph and predict the path along which the disease will spread?
You can also do research in any of these areas to improve the algorithms and their performance, and to apply them to new problems/challenges.
Google Job Areas: Large-Scale Balanced Partitioning (example: Google Maps Driving Directions); Large-Scale Clustering (clustering graphs at Google scale); Large-Scale Connected Components; Large-Scale Link Modeling (similarity ranking and centrality metrics: link prediction and anomalous link discovery); Large-Scale Similarity Ranking (Personalized PageRank, Egonet similarity, Adamic Adar, and others); Public-private Graph Computation; Streaming and Dynamic Graph Algorithms; ASYMP: Async Message Passing Graph Mining; Large-Scale Centrality Ranking; Large-Scale Graph Building
Take one of these Google job/function areas in Graph Mining and improve the algorithms for performance, as well as apply them to new problems/challenges.
https://github.com/sayedum/spark-implementation-louvian-modularity.git.
This is a private repository. I might give selected participants access to it. You have to request access, and I need to know what you will do with it. It uses PySpark and Spark GraphFrames on Hadoop platforms. There is a non-Spark, non-parallel Python implementation as well. With some extensions, and by trying to answer the right question, this has the potential to become a research publication as well.
https://github.com/Sotera/spark-distributed-louvain-modularity.
Not my implementation. GraphX is older than GraphFrames.
https://www.geeksforgeeks.org/page-rank-algorithm-implementation/ . Not my implementation. I might share code blocks from my implementation.
https://www.geeksforgeeks.org/betweenness-centrality-centrality-measure/
However, it is best to first try to implement it on your own. Rather than just memorizing, try to build the ability to convert a textual concept/algorithm (mathematics) into code.
https://www.geeksforgeeks.org/dijkstras-shortest-path-algorithm-greedy-algo-7/
https://www.geeksforgeeks.org/floyd-warshall-algorithm-dp-16/
Details: http://goatleaps.xyz/programming/kargers-algorithm.html Code File: http://goatleaps.xyz/assets/code/kargers_mincut.py