26.3 Maximum bipartite matching

Some combinatorial problems can easily be cast as maximum-flow problems. The multiple-source, multiple-sink maximum-flow problem from Section 26.1 gave us one example. There are other combinatorial problems that seem on the surface to have little to do with flow networks, but can in fact be reduced to maximum-flow problems. This section presents one such problem: finding a maximum matching in a bipartite graph (see Section B.4). In order to solve this problem, we shall take advantage of an integrality property provided by the Ford-Fulkerson method. We shall also see that the Ford-Fulkerson method can be made to solve the maximum-bipartite-matching problem on a graph G = (V, E) in O(V E) time.

The maximum-bipartite-matching problem

Given an undirected graph G = (V, E), a matching is a subset of edges M ⊆ E such that for all vertices v ∈ V, at most one edge of M is incident on v. We say that a vertex v ∈ V is matched by matching M if some edge in M is incident on v; otherwise, v is unmatched. A maximum matching is a matching of maximum cardinality, that is, a matching M such that for any matching M', we have |M| ≥ |M'|. In this section, we shall restrict our attention to finding maximum matchings in bipartite graphs. We assume that the vertex set can be partitioned into V = L ∪ R, where L and R are disjoint and all edges in E go between L and R. We further assume that every vertex in V has at least one incident edge. Figure 26.7 illustrates the notion of a matching.

Figure 26.7: A bibartite graph G = (V, E) with vertex partition V = L∪R. (a) A matching with cardinality 2. (b) A maximum matching with cardinality 3.

The problem of finding a maximum matching in a bipartite graph has many practical applications. As an example, we might consider matching a set L of machines with a set R of tasks to be performed simultaneously. We take the presence of edge (u, v) in E to mean that a particular machine u ∈ L is capable of performing a particular task v ∈ R. A maximum matching provides work for as many machines as possible.

Finding a maximum bipartite matching

We can use the Ford-Fulkerson method to find a maximum matching in an undirected bipartite graph G = (V, E) in time polynomial in |V| and |E|. The trick is to construct a flow network in which flows correspond to matchings, as shown in Figure 26.8. We define the corresponding flow network G' = (V', E') for the bipartite graph G as follows. We let the source s and sink t be new vertices not in V, and we let V' = V ∪ {s, t}. If the vertex partition of G is V = L ∪ R, the directed edges of G' are the edges of E, directed from L to R, along with V new edges:

{(s, u) : u ∈ L}

∪{(u, v) : u ∈ L, v ∈ R, and (u, v) ∈ E}

∪{(v, t) : v ∈ R}.

Figure 26.8: The flow network corresponding to a bipartite graph. (a) The bipartite graph G = (V, E) with vertex partition V = L ∪ R from Figure 26.7. A maximum matching is shown by shaded edges. (b) The corresponding flow network G' with a maximum flow shown. Each edge has unit capacity. Shaded edges have a flow of 1, and all other edges carry no flow. The shaded edges from L to R correspond to those in a maximum matching of the bipartite graph.

To complete the construction, we assign unit capacity to each edge in E'. Since each vertex in V has at least one incident edge, |E| ≥ |V|/2. Thus, |E| ≤ |E'| = |E| + |V| ≤ 3|E|, and so |E'| = Θ(E).

The following lemma shows that a matching in G corresponds directly to a flow in G's corresponding flow network G'. We say that a flow f on a flow network G = (V, E) is integer-valued if f (u, v) is an integer for all (u, v) ∈ V × V.

Lemma 26.10

Let G = (V, E) be a bipartite graph with vertex partition V = L ∪ R, and let G' = (V', E') be its corresponding flow network. If M is a matching in G, then there is an integer-valued flow f in G' with value |f| = |M|. Conversely, if f is an integer-valued flow in G', then there is a matching M in G with cardinality |M| = |f|.

Proof We first show that a matching M in G corresponds to an integer-valued flow in G'. Define f as follows. If (u, v) ∈ M, then f (s, u) = f (u, v) = f(v, t) = 1 and f(u, s) = f(v, u) = f(t, v) = -1. For all other edges (u, v) ∈ E', we define f(u, v) = 0. It is simple to verify that f satisfies skew symmetry, the capacity constraints, and flow conservation.

Intuitively, each edge (u, v) ∈ M corresponds to 1 unit of flow in G' that traverses the path s → u → v → t. Moreover, the paths induced by edges in M are vertex-disjoint, except for s and t. The net flow across cut (L ∪ {s}, R ∪ {t}) is equal to |M|; thus, by Lemma 26.5, the value of the flow is |f| = |M|.

To prove the converse, let f be an integer-valued flow in G', and let

M = {(u, v) : u ∈ L, v ∈ R, and f(u, v) > 0}.

Each vertex u ∈ L has only one entering edge, namely (s, u), and its capacity is 1. Thus, each u ∈ L has at most one unit of positive flow entering it, and if one unit of positive flow does enter, by flow conservation, one unit of positive flow must leave. Furthermore, since f is integer-valued, for each u ∈ L, the one unit of flow can enter on at most one edge and can leave on at most one edge. Thus, one unit of positive flow enters u if and only if there is exactly one vertex v ∈ R such that f(u, v) = 1, and at most one edge leaving each u ∈ L carries positive flow. A symmetric argument can be made for each v ∈ R. The set M defined in the statement of the lemma is therefore a matching.

To see that |M| = |f|, observe that for every matched vertex u ∈ L, we have f(s, u) = 1, and for every edge (u, v) ∈ E - M, we have f(u, v) = 0. Consequently,

\|M\|	=	f(L, R)
	=	f(L, V') - f(L, L) - f(L, s) - f(L, t) (by Lemma 26.1).

We can simplify the above expression considerably. Flow conservation implies that f(L, V') = 0; Lemma 26.1 implies that f(L, L) = 0; skew symmetry implies that - f(L, s) = f(s, L); and because there are no edges from L to t, we have f(L, t) = 0. Thus,

\|M\|	=	f(s, L)
	=	f(s, V')	(since all edges out of s go to L)
	=	\|f\|	(by the definition of \|f\|).

Based on Lemma 26.10, we would like to conclude that a maximum matching in a bipartite graph G corresponds to a maximum flow in its corresponding flow network G', and we can therefore compute a maximum matching in G by running a maximum-flow algorithm on G'. The only hitch in this reasoning is that the maximum-flow algorithm might return a flow in G' for which some f(u, v) is not an integer, even though the flow value |f| must be an integer. The following theorem shows that if we use the Ford-Fulkerson method, this difficulty cannot arise.

Theorem 26.11: (Integrality theorem)

If the capacity function c takes on only integral values, then the maximum flow f produced by the Ford-Fulkerson method has the property that |f| is integer-valued. Moreover, for all vertices u and v, the value of f(u, v) is an integer.

Proof The proof is by induction on the number of iterations. We leave it as Exercise 26.3-2.

We can now prove the following corollary to Lemma 26.10.

Corollary 26.12

The cardinality of a maximum matching M in a bipartite graph G is the value of a maximum flow f in its corresponding flow network G'.

Proof We use the nomenclature from Lemma 26.10. Suppose that M is a maximum matching in G and that the corresponding flow f in G' is not maximum. Then there is a maximum flow f' in G' such that |f'| > |f|. Since the capacities in G' are integer-valued, by Theorem 26.11, we can assume that f' is integer-valued. Thus, f' corresponds to a matching M' in G with cardinality |M'| = |f'| > |f| = |M|, contradicting our assumption that M is a maximum matching. In a similar manner, we can show that if f is a maximum flow in G', its corresponding matching is a maximum matching on G.

Thus, given a bipartite undirected graph G, we can find a maximum matching by creating the flow network G', running the Ford-Fulkerson method, and directly obtaining a maximum matching M from the integer-valued maximum flow f found. Since any matching in a bipartite graph has cardinality at most min(L, R) = O(V), the value of the maximum flow in G' is O(V). We can therefore find a maximum matching in a bipartite graph in time O(V E') = O(V E), since |E'| = Θ(E).

Exercises 26.3-1

Run the Ford-Fulkerson algorithm on the flow network in Figure 26.8(b) and show the residual network after each flow augmentation. Number the vertices in L top to bottom from 1 to 5 and in R top to bottom from 6 to 9. For each iteration, pick the augmenting path that is lexicographically smallest.

Exercise 26.3-2

Prove Theorem 26.11.

Exercise 26.3-3

Let G = (V, E) be a bipartite graph with vertex partition V = L ∪ R, and let G' be its corresponding flow network. Give a good upper bound on the length of any augmenting path found in G' during the execution of FORD-FULKERSON.

Exercise 26.3-4: ★

A perfect matching is a matching in which every vertex is matched. Let G = (V, E) be an undirected bipartite graph with vertex partition V = L ∪ R, where |L| = |R|. For any X ⊆ V, define the neighborhood of X as

N(X) = {y ∈ V : (x, y) ∈ E for some x ∈ X},

that is, the set of vertices adjacent to some member of X. Prove Hall's theorem: there exists a perfect matching in G if and only if |A| ≤ |N(A)| for every subset A ⊆ L.

Exercise 26.3-5: ★

We say that a bipartite graph G = (V, E), where V = L ∪ R, is d-regular if every vertex v ∈ V has degree exactly d. Every d-regular bipartite graph has |L| = |R|. Prove that every d-regular bipartite graph has a matching of cardinality |L| by arguing that a minimum cut of the corresponding flow network has capacity |L|.