Synthetic and real biological datasets are used to demonstrate the benefits, and also some of the perils, of analytical cluster validation.

Availability: The software used in the experiments is available at

Contact: J.[email protected]

Supplementary information: Enlarged colour plots are provided in the Supplementary Material, which is available at

The exploration of complex datasets, for which no or very little information about the underlying distribution is available, fundamentally relies on the identification of ‘natural’ group structures in the data, a task which may be tackled using clustering techniques (Duda 2001, Everitt 1993, Hastie 2001, Jain 1999).

