Species tree via supertree

In the original paper, the authors used 11,033 gene trees, but we only have 8739. We have all the gene trees estimated by RAxML in 04-all_gene_trees.tre.

We had already installed SuperTriplets_v1.1.jar and you need to know the path where it is. Let’s say that the path is ~/software/, so we would run the command like:

We need to be in the results

java -jar ~/software/SuperTriplets_v1.1.jar 04-all_gene_trees.tre 07-supertree.tre

Now we can visualize it in R:

library(ape)
tre = read.tree(file="07-supertree.tre")
tre = tre[[2]] ## tree stored in second entry
plot(tre)

rtre = root(tre,outgroup="H_vulgare_HVens23", resolve.root=TRUE)
plot(rtre)

We can compare this tree with the more simple parsimony-based supertree that we constructed in 04-gene-trees:

gene_trees <- read.tree("04-all_gene_trees.tre")
st_parsimony<-superTree(gene_trees)
st_parsimony<-root(st,"H_vulgare_HVens23",resolve.root = T)
plot(st_parsimony)

Species tree via coalescent models

At the individual level

We will use ASTRAL4 to reconstruct the species tree under the coalescent model. The input is the list of RAxML estimated gene trees with branch lengths: 04-all_gene_trees.tre.

We have to be in the results path and run:

astral4 -i 04-all_gene_trees.tre -o 07-individual-species-tree-astral4.tre

Now we can visualize it in R (in results):

library(ape)
tre = read.tree(file="07-individual-species-tree-astral4.tre")
plot(tre)
rtre = root(tre,outgroup="H_vulgare_HVens23", resolve.root=TRUE)
plot(rtre)

Note that we tried to run Weighted ASTRAL but the results were not good and we couldn’t understand why. Those interested could run the command and compare the trees:

wastral -i 04-all_gene_trees.tre -o 07-individual-species-tree.tre

We will see that the individuals of the same species are not even monophyletic.

At the species level

Since we have multiple individuals per species for some species, we can use the multispecies coalescent model to infer a species level phylogeny. However, we first need to create a mapping from each individual ID to a species ID.

We will create this mapping in R (inside code):

##First we get all the individual names
genes_dir <- "../data/Wheat_Relative_History_Data_Glemin_et_al/OneCopyGenes/"
gene_files<-paste(genes_dir,list.files(genes_dir,pattern='\\.aln$'),sep='')

all_individuals <- character()
i<-1
for(f in gene_files){
  headers <- rownames(read.dna(f, format = "fasta"))
  all_individuals <- unique(c(all_individuals, headers))
}
all_individuals <- sort(all_individuals) # Sort alphabetically for consistency
all_individuals

Here we can see that each individual ID is just the species ID with a unique identifier at the end (e.g., “SpeciesID_IndividualID”). Thus if we remove the tag at the end we have a map from individual to species:

cleaned_individuals <- sub("_[^_]+$", "", all_individuals)

#map all individuals to species
mapping <- paste(all_individuals,cleaned_individuals)
writeLines(mapping, "../results/07-species_mapping.txt") ## write to file

Now our mapping is saved as 07-species_mapping.txt

To make a species-level phylogeny, we just need to specify our mapping file in Weighted ASTRAL with the -a flag.

We have to be in the results path and run:

astral4 -i 04-all_gene_trees.tre -a 07-species_mapping.txt -o 07-species-tree-astral4.tre

Now we want to plot it in R (in results):

library(ape)
tre = read.tree(file="07-species-tree-astral4.tre")
plot(tre)
rtre = root(tre,outgroup="H_vulgare", resolve.root=TRUE)
plot(rtre)

Again, the results from wASTRAL were not good, placing the outgroups within the ingroup clade. Those interested could run:

wastral -i 04-all_gene_trees.tre -a 07-species_mapping.txt -o 07-species-tree.tre

and compare the trees.


Copyright Solis-Lemus lab © 2025. Distributed by an MIT license.

This site uses Just the Docs, a documentation theme for Jekyll.