How do persistent homology and clustering methods differ when applied to point cloud data? I'm specifically interested in the application to gene expression data from cancer patients, but any example works.
I understand that a hierarchical clustering method treats two points (patients) as 'near' when they are ancestrally related (have similar gene expression profiles). But in order to find the interesting groups you have to choose a threshold for nearness by hand, and this choice can often obscure the interesting biology. For example, a few of the patients who survived may end up mixed in with some who died, purely because their gene expression values are close. With very large datasets, the small details that would set a few special patients apart from a large subgroup are lost.
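To make the threshold problem concrete, here is a minimal sketch of what I mean. It is not a real analysis: the `patients` coordinates are made-up 2-D stand-ins for expression profiles, `clusters_at_threshold` is a hypothetical helper name, and I'm assuming single-linkage clustering with plain Euclidean distance (any two points within the threshold are linked, and linked points end up in the same cluster).

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def clusters_at_threshold(points, threshold):
    """Single-linkage clusters: link every pair at distance <= threshold."""
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical toy data (assumption, not real expression values).
patients = [
    (0.0, 0.0), (0.5, 0.0), (0.0, 0.5),  # a large-ish subgroup
    (1.6, 0.0),                          # one 'special' patient nearby
    (6.0, 6.0), (6.5, 6.0),              # a second, distant subgroup
]

for t in (0.4, 1.0, 1.5, 8.0):
    print(t, clusters_at_threshold(patients, t))
```

With these toy numbers, a threshold of 1.0 keeps the special patient (index 3) in a cluster of its own, while 1.5 absorbs it into the first subgroup. Nothing in the data says which of the two cuts is the right one, which is exactly the manual choice I'm worried about.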
From what I gather, persistent homology, with an appropriately chosen metric, gives better results because it doesn't partition the data and break it apart to find the interesting subgroups. Rather, it works with the geometry of the data in some way that brings out the details that clustering methods would generally obscure. How does this work, generally?