Computational Biochemistry Research Group 

CBRG Home
Services
Darwin
Research
Students
Search
Contact
Site Map 
 

Vertex Cover approximation algorithm

Resolution of conflict graphs

A vertex cover of a graph is a set of vertices (nodes) which covers all the edges. That is, every edge is incident to at least one vertex in the set. For example, {a,b,e} is a vertex cover of minimal size for the graph: 

example graph

The vertex cover problem is normally defined as the problem of finding the minimum size vertex cover. This problem is NP-complete, and hence very expensive to solve exactly for large graphs. 

The vertex cover problem can be used in computational biochemistry whenever we have conflicts between sequences, and the way to resolve these conflicts is by excluding some sequences from the sample. We normally define a conflict graph which is a graph where every sequence is a vertex and every edge is a conflict between two sequences. A conflict may be defined when the alignment of these two sequences has a very poor score. The goal is to remove the fewest possible sequences that will eliminate all conflicts. The set to exclude is the Vertex Cover of the conflict graph. 

The above graph can be represented, and the vertex cover computed with the following input: 

  Graph( {a,b,c,d,e,f},
         { {a,b}, {a,c}, {a,e}, {a,f}, {b,c},
           {b,d}, {b,e}, {b,f}, {c,e}, {e,f} } ):
As it can be inferred from the example, a graph is defined by a set of vertices and a set of edges. The vertices can be any sequence of characters or integers. The edges are sets of two distinct vertices. To input random graphs or other known graphs click here. The output of the above is 
The vertex cover is: {a,b,e} and is of optimal size.
In this case the lower bound coincides with the size of the answer, hence the result is correct. If it does not coincide, a lower bound is given. 

Graph definition:

Display the input graph

If you run very large graphs, you should not choose the graphical display of the input graph, as this may require more time than the computation of the vertex cover itself. 

The graphs descriptions here are a normal test for the algorithm. 

References.

A standard reference for this problem is ``Computers and Intractability'', [M. Garey and D. Johnson, W.H. Freeman, San Francisco]. The fastest exact algorithm known to us is ``An Improved Fixed Parameter Algorithm for Vertex Cover'' [R. Balasubramanian, M. Fellows and V. Raman, to appear in Information Processing Letters] running in O(n2 (1.3247...)k) time, where n is the number of nodes in the graph and k is the size of the vertex cover. The algorithm implemented in Darwin and used here has been written by Gaston H. Gonnet. It uses a combination of a pruned search of the complete tree and a greedy heuristic. For random graphs with n ln n edges, it runs in time O(n3). The lower bound is computed from a reduced graph and the cliques which can be found by enlarging 3-cliques. Mike Hallett and Ulrike Stege are theatening to replace it with better versions. I hope they will do.... 

Comments, questions and other reports should be sent to G Gonnet

Questions, problems, suggestions? Check out the 

 
Comments, questions, problems? darwin.comments@inf.ethz.ch
TOP
CBRG