A vertex cover of a graph is a set of vertices (nodes) which covers all
the edges. That is, every edge is incident to at least one vertex in the
set. For example, {a,b,e} is a vertex cover of minimal size for
the graph:
The vertex cover problem is normally defined as the problem of finding
the minimum size vertex cover. This problem is NP-complete, and hence very
expensive to solve exactly for large graphs.
The vertex cover problem can be used in computational biochemistry whenever
we have conflicts between sequences, and the way to resolve these conflicts
is by excluding some sequences from the sample. We normally define a conflict
graph which is a graph where every sequence is a vertex and every edge
is a conflict between two sequences. A conflict may be defined when the
alignment of these two sequences has a very poor score. The goal is to
remove the fewest possible sequences that will eliminate all conflicts.
The set to exclude is the Vertex Cover of the conflict graph.
The above graph can be represented, and the vertex cover computed with
the following input:
As it can be inferred from the example, a graph is defined by a set of
vertices and a set of edges. The vertices can be any sequence of characters
or integers. The edges are sets of two distinct vertices. To input random
graphs or other known graphs click here.
The output of the above is
The vertex cover is: {a,b,e} and is of optimal size.
In this case the lower bound coincides with the size of the answer, hence
the result is correct. If it does not coincide, a lower bound is given.
If you run very large graphs, you should not choose the graphical display
of the input graph, as this may require more time than the computation
of the vertex cover itself.
The graphs descriptions here
are a normal test for the algorithm.
References.
A standard reference for this problem is ``Computers and Intractability'',
[M. Garey and D. Johnson, W.H. Freeman, San Francisco]. The fastest exact
algorithm known to us is ``An Improved Fixed Parameter Algorithm for Vertex
Cover'' [R. Balasubramanian, M. Fellows and V. Raman, to appear in Information
Processing Letters] running in O(n2 (1.3247...)k)
time, where n is the number of nodes in the graph and k is
the size of the vertex cover. The algorithm implemented in Darwin and used
here has been written by Gaston H. Gonnet. It uses a combination of a pruned
search of the complete tree and a greedy heuristic. For random graphs with
n ln n edges, it runs in time O(n3).
The lower bound is computed from a reduced graph and the cliques which
can be found by enlarging 3-cliques. Mike Hallett and Ulrike Stege are
theatening to replace it with better versions. I hope they will do....
Comments, questions and other reports should be sent to G
Gonnet.