Graph is a fundamental concept in Computer Science. We see it everywhere and it became increasingly popular among developers when it became apparent that Facebook is nothing but a gigantic graph.
Graph is a mathematical concept that is at the core of Graph Theory. It helps in modelling objects where certain pairs of objects are somehow ‘related’. We call each object a vertex and the relationship between a pair of vertices is called an edge.
Based on the above, a graph (G) is a set of vertices (V) and edges (E). Thats it! Really, that is all that is to a graph in its very basic form. Special cases of graphs are used to represent Trees (directed acyclic graphs).
Let’s start writing a graph in C#
Vertex
As discussed above, a vertex is an object of interest in some sense. Since, we are writing a bare-bone graph, we will write a generic Vertex.
public class Vertex<T> { /// <summary> /// The data at this vertex. /// </summary> public T Data { get; set; } public Vertex(T data) { this.Data = data; } }
The above enables us to use any suitable data structure as a vertex.
Edge
The edge represents a relationship between a pair of vertices. For personal preferences, I like to have names in this relationship such that at one end is a ‘parent’ vertex and at the other end is a ‘child’ vertex. This may seem like as if I am introducing some notion of direction, but it is purely for my ease of reasoning and naming things better than vertex1 and vertex2. Let’s look at a generic Edge class.
public class Edge<T> { /// <summary> /// The parent/source/origin of the edge. /// </summary> public Vertex<T> Parent { get; set; } /// <summary> /// The child/sink/destination of the edge. /// </summary> public Vertex<T> Child { get; set; } public Edge(Vertex<T> parent, Vertex<T> child) { this.Parent = parent; this.Child = child; } }
The code is simple, we initialize an edge by providing it a pair of vertices – since there is no point to have an edge defined that has no vertices.
Graph
Let’s get to the meat of the issue and stitch our classes into a graph. As mentioned earlier, a graph is nothing but a collection of vertices and edges. Below is the code for describing just that. I use an ObservableCollection to maintain my vertices. This helps me to get notified whenever a new vertex is added or an existing one is removed. I use this notification to purge edges where the deleted vertex was participating. It is a design decision which makes delete a bit heavy. An alternate could be to update edges on read which will make deletion lightweight but reads heavy. In most practical scenarios we would anticipate more reads than deletes, so our choice is a thoughtful trade-off.
public class Graph<T> : IDisposable { private bool isDisposed = false; private ObservableCollection<Vertex<T>> vertices; private IList<Edge<T>> edges; /// <summary> /// The set of vertices for the graph. /// </summary> public ObservableCollection<Vertex<T>> Vertices { get { this.ThrowIfDisposed(); return this.vertices; } private set { this.vertices = value; } } /// <summary> /// The set of edges for the graph. /// </summary> public IList<Edge<T>> Edges { get { this.ThrowIfDisposed(); return this.edges; } private set { this.edges = value; } } public Graph() { this.Vertices = new ObservableCollection<Vertex<T>>(); this.Edges = new List<Edge<T>>(); // Register notification call to make sure we purge // orphaned edges when a vertex is removed. this.Vertices.CollectionChanged += VerticesOnCollectionChanged; } /// <summary> /// Disposes off the resources in this object. /// Uses Basic Dispose pattern. /// See https://msdn.microsoft.com/en-us/library/b1yfkh5e(v=vs.110).aspx#basic_pattern /// </summary> public void Dispose() { this.Dispose(true); GC.SuppressFinalize(this); } /// <summary> /// Only cleanup if called from IDisposable.Dispose. /// We do this by setting the disposing flag to true. /// </summary> /// <param name="disposing">If set to true means it was invoked via IDisposable.Dispose()</param> protected virtual void Dispose(bool disposing) { if (this.isDisposed) { return; } if (!disposing) { return; } this.DeregisterVerticesCollectionChanged(); this.isDisposed = true; } /// <summary> /// Deregisters the call back on vertices event handler. /// </summary> private void DeregisterVerticesCollectionChanged() { if (this.Vertices != null) { this.Vertices.CollectionChanged -= this.VerticesOnCollectionChanged; } } /// <summary> /// The event handler to notify change in vertices for this graph. /// </summary> /// <param name="sender">The sender.</param> /// <param name="notifyCollectionChangedEventArgs">The <see cref="NotifyCollectionChangedEventArgs"/> object.</param> private void VerticesOnCollectionChanged(object sender, NotifyCollectionChangedEventArgs notifyCollectionChangedEventArgs) { // We are only interested in Remove, Replace or Reset action if ((notifyCollectionChangedEventArgs.Action != NotifyCollectionChangedAction.Remove) && (notifyCollectionChangedEventArgs.Action != NotifyCollectionChangedAction.Replace) && (notifyCollectionChangedEventArgs.Action != NotifyCollectionChangedAction.Reset)) { return; } IList<Vertex<T>> oldItems = this.CastFromReadonlyList(notifyCollectionChangedEventArgs.OldItems); IList<Vertex<T>> newItems = this.CastFromReadonlyList(notifyCollectionChangedEventArgs.NewItems); // Lets get the list of all deleted vertices foreach (Vertex<T> removedVertex in this.Difference(oldItems, newItems)) { // Lets get the list of all orphaned edges // It is the edge where the deleted vertex was at either end. IList<Edge<T>> toBeDeletedEdges = this.Edges.Where( x => x.Parent == removedVertex || x.Child == removedVertex).ToList(); foreach (Edge<T> toBeDeletedEdge in toBeDeletedEdges) { this.Edges.Remove(toBeDeletedEdge); } } } /// <summary> /// Converts <see cref="IList"/> to <see cref="IList{T}"/> for easier manipulation. /// Returns an empty <see cref="IList{T}"/> if the source <see cref="IList"/> is null or empty. /// </summary> /// <param name="sourceList">The <see cref="IList"/> to cast from.</param> /// <returns>A <see cref="IList{T}"/>.</returns> private IList<Vertex<T>> CastFromReadonlyList(IList sourceList) { IList<Vertex<T>> returnValue = new List<Vertex<T>>(); if (sourceList == null) { return returnValue; } foreach (var item in sourceList) { returnValue.Add((Vertex<T>) item); } return returnValue; } /// <summary> /// Performs the difference operation on two sets. /// </summary> /// <param name="oldItems">The old items. This is the set the difference will be computed from.</param> /// <param name="newItems">The new items. This is the set that will be searched for <see cref="oldItems"/>.</param> /// <returns>A <see cref="IEnumerable{T}"/> of items that were found in <see cref="oldItems"/> but not in <see cref="newItems"/>.</returns> private IEnumerable<Vertex<T>> Difference(IList<Vertex<T>> oldItems, IList<Vertex<T>> newItems) { VerifyArgument.NotNull(oldItems, nameof(oldItems)); VerifyArgument.NotNull(newItems, nameof(newItems)); foreach (Vertex<T> item in oldItems) { if (newItems.Contains(item)) { continue; } yield return item; } } /// <summary> /// Throws if the object is being disposed. /// </summary> /// <param name="publicMemberName">The name of public member that was called.</param> private void ThrowIfDisposed([CallerMemberName] string callerMemberName = "") { if (this.isDisposed) { throw new ObjectDisposedException( $"Attempting to access {callerMemberName} of an instance of {nameof(Graph<T>)} that has been disposed."); } } }
For deleting edges upon deletion of vertices, we first get a list of all vertices that are removed, next for each of the removed vertices, we find all edges where the removed vertex is either a parent or a child and remove them.
The most common operations on a graph is how we traverse it. Two commons traversal techniques are discuss below.
Breadth First Traversal
In breadth first traversal on a graph, we start with a given vertex and traverse it’s children. Then for each child, we traverse that child’s children and so on until we can’t explore any new vertices. We use an in-memory list to maintain a visit log and an in-memory queue to hold the children that haven’t been traversed. We can of-course abstract these in-memory collections so that they are persisted to disk in case of very large graphs. This particular implementation kind of enforces a directional constraint on our graph since it only traverses children and not parent. We can easily add another loop to enumerate all such edges where our current vertex is a child and enqueue all parents to make it nondirectional.
public IEnumerable<T> TraverseBreadthFirst(Vertex<T> vertex) { this.ThrowIfDisposed(); VerifyArgument.NotNull(vertex, nameof(vertex)); if (!this.Vertices.Contains(vertex)) { throw new ArgumentException("The provided vertex is not part of this graph", nameof(vertex)); } IList<Vertex<T>> visitedList = new List<Vertex<T>>(); Queue<Vertex<T>> notVisited = new Queue<Vertex<T>>(); notVisited.Enqueue(vertex); do { Vertex<T> current = notVisited.Dequeue(); visitedList.Add(current); yield return current.Data; foreach (Edge<T> edge in this.Edges.Where(x => x.Parent == current)) { if (visitedList.Contains(edge.Child)) { // Already visited. continue; } if (notVisited.Contains(edge.Child)) { // Already in the queue // Although adding it again won't hurt but saves memory. continue; } notVisited.Enqueue(edge.Child); } } while (notVisited.Count > 0); }
Depth First Traversal
In depth first traversal, we start with a given vertex and traverse a branch until there are no more expansions possible, then we backtrack and expand the lineage of next child, when all siblings are done, we backtrack one level up again and continue doing so until there are no more edges left linking to new children. We use an in-memory list to keep track of vertices we have visited and a stack for the not yet visited.
public IEnumerable<T> TraverseDepthFirst(Vertex<T> vertex) { this.ThrowIfDisposed(); VerifyArgument.NotNull(vertex, nameof(vertex)); if (!this.Vertices.Contains(vertex)) { throw new ArgumentException("The provided vertex is not part of this graph", nameof(vertex)); } IList<Vertex<T>> visitedList = new List<Vertex<T>>(); Stack<Vertex<T>> notVisited = new Stack<Vertex<T>>(); notVisited.Push(vertex); do { Vertex<T> current = notVisited.Pop(); visitedList.Add(current); yield return current.Data; // We add the children in a list and traverse them in reverse order. // This helps us in traversing the oldest child first // Since it would be on the top of the stack. // It is not needed but a convenient way of thinking // that we evaluate graphs from left to right. IList<Edge<T>> children = this.Edges.Where(x => x.Parent == current).ToList(); for(int i = children.Count - 1; i >= 0; i--) { if (visitedList.Contains(children[i].Child)) { // Already visited. continue; } if (notVisited.Contains(children[i].Child)) { // Already in the stack // Although adding it again won't hurt but saves memory. continue; } notVisited.Push(children[i].Child); } } while (notVisited.Count > 0); }