DGEclust is a program for clustering and differential expression analysis of expression data generated by next-generation sequencing assays, such as RNA-seq, CAGE and others. It takes a table of count data as input and estimates the number and parameters of the clusters supported by the data. The estimated cluster configurations can then be post-processed to identify differentially expressed genes and to generate gene- and sample-wise dendrograms and heatmaps.
Internally, DGEclust uses a Hierarchical Dirichlet Process Mixture Model (HDPMM) for modelling over-dispersed count data, combined with a blocked Gibbs sampler for efficient Bayesian learning. You can find more technical details on the statistical methodologies used in this software in the following papers:
If you find this software useful, please cite the second paper above. To request more information or to report bugs, send an email to Dimitris Vavoulis or Julian Gough.
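For readers who want some intuition for the Dirichlet process mixture idea mentioned above, here is a minimal sketch in Python. It is not DGEclust code and does not use the DGEclust interface. It only shows the generative side of a single (non-hierarchical) truncated stick-breaking mixture, which is the kind of representation that blocked Gibbs samplers typically work with; it does not perform any inference. The Negative Binomial likelihood, the Gamma draws for the cluster parameters, and all function and parameter names (stick_breaking_weights, simulate_counts, alpha, truncation) are assumptions made for illustration.

```python
import numpy as np


def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking prior."""
    # Break fractions are Beta(1, alpha); forcing the last one to 1
    # makes the truncated weights sum to exactly 1.
    v = rng.beta(1.0, alpha, size=truncation)
    v[-1] = 1.0
    # Fraction of the stick still unbroken before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def simulate_counts(n_genes, alpha=2.0, truncation=20, seed=0):
    """Simulate over-dispersed counts from a truncated mixture (illustrative only)."""
    rng = np.random.default_rng(seed)
    weights = stick_breaking_weights(alpha, truncation, rng)

    # Hypothetical per-cluster parameters: a mean and a dispersion.
    means = rng.gamma(shape=2.0, scale=50.0, size=truncation)
    dispersions = rng.gamma(shape=2.0, scale=0.25, size=truncation)

    # Assign each gene to a cluster according to the mixture weights.
    labels = rng.choice(truncation, size=n_genes, p=weights)
    mu = means[labels]
    phi = dispersions[labels]

    # Negative Binomial with mean mu and variance mu + phi * mu**2,
    # expressed in NumPy's (n, p) parameterisation.
    n = 1.0 / phi
    p = n / (n + mu)
    counts = rng.negative_binomial(n, p)
    return counts, labels, weights


if __name__ == "__main__":
    counts, labels, weights = simulate_counts(n_genes=1000)
    print("clusters actually used:", np.unique(labels).size)
    print("mean count:", counts.mean(), "variance:", counts.var())
```

Although the truncation level allows up to 20 clusters, most of the weight usually falls on a few of them, which is why models of this family can let the data suggest the number of clusters instead of fixing it in advance.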
Enjoy!