Pre-process data before model input
PhyloClustering.standardize_tree (Function)
standardize_tree(tree::AbstractMatrix{<:Real})
Standardize the tree Matrix returned by split_weight. It is recommended to standardize the data before passing it to a model.
Arguments
tree: an N * B Matrix containing trees (each row is a B-dimensional tree in bipartition format).
Output
A standardized B * N tree Matrix with a mean of about 0 and a standard deviation of about 1. This tree Matrix can be used as model input.
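To make the shape change concrete, here is a minimal sketch of what standardization looks like. It assumes per-bipartition z-scoring followed by a transpose, and uses only the Statistics standard library. It does not call standardize_tree, the package may compute this differently, and the matrix `trees` is a hypothetical stand-in for split_weight output.

```julia
using Statistics

# Hypothetical N * B input: N = 4 trees (rows), B = 3 bipartitions (columns).
trees = [1.0 2.0 0.5;
         2.0 0.0 1.5;
         3.0 1.0 2.5;
         4.0 3.0 3.5]

# Assumption: z-score each bipartition (column) across the trees.
mu = mean(trees, dims=1)
sigma = std(trees, dims=1)
z = (trees .- mu) ./ sigma

# Transpose so that each column is one tree, giving a B * N Matrix.
standardized = permutedims(z)

println(size(standardized))
```

A column whose values are all identical has a standard deviation of 0 and would produce NaN in this sketch, so real data may need such columns handled separately.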
PhyloClustering.distance (Function)
distance(tree::AbstractMatrix{<:Real})
Get the distance Matrix of a tree Matrix returned by split_weight.
Arguments
tree: a B * N tree Matrix (each column of the tree Matrix is a B-dimensional tree in bipartition format).
Output
A pairwise distance Matrix that can be used as the input of hc_label.
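As an illustration of a pairwise distance Matrix over the columns of a B * N tree Matrix, the sketch below assumes Euclidean distance. The metric used by distance is not stated here, so treat this as an assumption rather than the package's implementation. The matrix `tree` is hypothetical example data.

```julia
using LinearAlgebra

# Hypothetical B * N input: B = 2 bipartitions (rows), N = 3 trees (columns).
tree = [0.0 3.0 0.0;
        0.0 4.0 1.0]

N = size(tree, 2)

# Assumption: Euclidean distance between every pair of trees (columns).
D = [norm(tree[:, i] .- tree[:, j]) for i in 1:N, j in 1:N]

println(size(D))
```

The result is an N * N symmetric Matrix with zeros on the diagonal, which is the shape a hierarchical clustering routine typically expects.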
Visualize results
PhyloClustering.plot_clusters (Function)
plot_clusters(tree::AbstractMatrix{<:Real}, label::Vector{Int64})
Visualize the clustering result of a model.
Arguments
tree: a B * N tree Matrix (each column of the tree Matrix is a B-dimensional tree in bipartition format).
label: an N-length Vector containing the predicted label for each tree. The output of the models can be used here.
Output
A scatter plot showing tree clusters.