Gaussian Mixture Model (GMM)
A Gaussian Mixture Model (GMM) is a model-based probabilistic method that assumes the data points are generated from a mixture of several Gaussian distributions with unknown parameters. It uses the Expectation Maximization (EM) algorithm to update those parameters iteratively, increasing the log-likelihood of the data until convergence.
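For orientation, the standard textbook form of the model and of one EM iteration can be written as follows. The symbols are assumptions introduced here for illustration and do not come from the package documentation: K is the number of components, x_1, ..., x_N are the data points, \pi_k are the mixing weights, \mu_k and \Sigma_k are the component means and covariances, and \gamma_{ik} is the posterior probability (responsibility) of component k for point x_i.

```latex
% Mixture density and log-likelihood
p(x \mid \theta) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \sum_{k=1}^{K} \pi_k = 1, \quad \pi_k \ge 0

\ell(\theta) = \sum_{i=1}^{N} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)

% E-step: posterior responsibilities under the current parameters
\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}

% M-step: re-estimate the parameters from the responsibilities
N_k = \sum_{i=1}^{N} \gamma_{ik}, \qquad
\pi_k = \frac{N_k}{N}, \qquad
\mu_k = \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} \, x_i, \qquad
\Sigma_k = \frac{1}{N_k} \sum_{i=1}^{N} \gamma_{ik} \, (x_i - \mu_k)(x_i - \mu_k)^{\top}
```

Each E-step/M-step pair does not decrease \ell(\theta), which is why the iteration can be run until the log-likelihood stops improving. With a diagonal covariance type, only the diagonal entries of \Sigma_k are kept.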
PhyloClustering.gmm_label - Function
gmm_label(tree::AbstractMatrix{<:Real}, n::Int64; method::Symbol=:kmeans, kind::Symbol=:diag)
Get predicted labels from Gaussian mixture model for a group of phylogenetic trees.
Arguments
tree: a B × N tree matrix (each column of the matrix is a B-dimensional tree in bipartition format).
n: the number of clusters.
method (defaults to :kmeans): initialization method used to find the n starting centers:
  * :kmeans: use K-means clustering from Clustering.jl to initialize with n centers.
  * :split: initialize a single Gaussian with tree, then repeatedly split the Gaussians and retrain with the EM algorithm until n Gaussians are obtained.
kind (defaults to :diag): covariance type, :diag or :full.
Output
A Tuple{Vector{Int64}, Vector{Int64}}, where the first Vector contains the predicted label for each tree based on the posterior probability and the second Vector contains the predicted label for each tree based on the log-likelihood.
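A minimal usage sketch, based only on the signature and output description given here. The input is synthetic random data standing in for a real bipartition matrix, and the names B, N, trees, post_labels, and llh_labels are hypothetical, chosen for illustration.

```julia
using Random
using PhyloClustering

Random.seed!(1)

# Hypothetical stand-in for real tree data: B = 6 dimensions, N = 40 "trees",
# drawn as two well-separated groups of 20 columns each.
B, N = 6, 40
trees = hcat(randn(B, N ÷ 2), randn(B, N ÷ 2) .+ 5.0)

# Request 2 clusters, with K-means initialization and diagonal covariances
# (both are the documented defaults, written out here for clarity).
post_labels, llh_labels = gmm_label(trees, 2; method=:kmeans, kind=:diag)

# Each returned vector should hold one label per column of `trees`.
println(length(post_labels), " ", length(llh_labels))
```

The two label vectors can be compared directly to see whether the posterior-based and log-likelihood-based assignments agree for a given data set.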
Reference
The implementation of GMM is provided by GaussianMixtures.jl.