Phylogenetic Tree Construction and Divergence Time Estimation
A phylogenetic tree was built from gene family clusters for M. pygmaea
and several other species representative of two Brassicaceae lineages
(I and II): A. thaliana, A. lyrata, Ae. arabicum, C. rubella,
E. salsugineum, Eutrema yunnanense, L. alabamica, Raphanus raphanistrum,
and Sisymbrium irio. Protein sequences from 1,356 single-copy gene
families were used for phylogenetic tree construction.
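A minimal sketch of how single-copy families could be picked out of a set of gene family clusters, assuming each cluster is available as a list of (species, gene) pairs. The function name, the species labels, and the toy clusters are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter


def single_copy_families(families, species):
    """Return IDs of families with exactly one gene from every species.

    families: dict mapping a family ID to a list of (species, gene_id) pairs.
    species:  iterable of species labels that must each appear exactly once.
    """
    required = set(species)
    selected = []
    for family_id, members in families.items():
        counts = Counter(sp for sp, _ in members)
        # Keep the family only if its species set matches exactly and
        # no species contributes more than one gene.
        if set(counts) == required and all(n == 1 for n in counts.values()):
            selected.append(family_id)
    return selected


# Hypothetical toy clusters (labels and gene IDs are made up for illustration).
families = {
    "fam1": [("Mpyg", "g1"), ("Atha", "a1"), ("Alyr", "l1")],
    "fam2": [("Mpyg", "g2"), ("Mpyg", "g3"), ("Atha", "a2"), ("Alyr", "l2")],
    "fam3": [("Mpyg", "g4"), ("Atha", "a3")],
}
print(single_copy_families(families, ["Mpyg", "Atha", "Alyr"]))
```

In this toy input only fam1 qualifies: fam2 has two copies in one species and fam3 lacks one species.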
Gene families were constructed with OrthoMCL v2.0.9 (Li et al., 2003)
from all-versus-all BLASTP alignments (E-value ≤ 1e−5). For each gene
locus, the longest protein-coding sequence among its gene models was
retained to remove redundancy caused by alternative splicing.
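The longest-isoform filter can be sketched as follows, assuming the protein sequences and an isoform-to-locus mapping are already loaded into dictionaries. The function name, the identifiers, and the tie-breaking rule (first isoform seen wins) are assumptions for illustration; the source does not state how ties were handled.

```python
def longest_isoform_per_locus(proteins, locus_of):
    """Keep the longest protein sequence for each gene locus.

    proteins: dict mapping isoform ID -> amino-acid sequence.
    locus_of: dict mapping isoform ID -> gene locus ID.
    """
    best = {}  # locus ID -> ID of the longest isoform seen so far
    for isoform_id, seq in proteins.items():
        locus = locus_of[isoform_id]
        # Replace the stored isoform only if this one is strictly longer,
        # so ties are resolved in favour of the first isoform seen.
        if locus not in best or len(seq) > len(proteins[best[locus]]):
            best[locus] = isoform_id
    return {iso: proteins[iso] for iso in best.values()}


# Hypothetical isoforms: geneA has two splice variants, geneB has one.
proteins = {
    "geneA.1": "MKTAY",
    "geneA.2": "MKTAYIAKQR",
    "geneB.1": "MSLE",
}
locus_of = {"geneA.1": "geneA", "geneA.2": "geneA", "geneB.1": "geneB"}
print(longest_isoform_per_locus(proteins, locus_of))
```

Applying the filter before clustering means that each locus contributes a single protein to the all-versus-all comparison.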
MAFFT v7.313 was used with default parameters to align the protein
sequences in each gene family (Katoh & Standley, 2013). Conserved
alignment blocks were extracted with Gblocks v0.91b (Castresana, 2000),
and the retained alignment regions were used to construct a phylogenetic
tree with RAxML v8.2.11 (Stamatakis, 2014) under the PROTGAMMALGX model.
Divergence time
was estimated from the phylogenetic tree using MCMCTree from PAML v4.9
(http://abacus.gene.ucl.ac.uk/software/paml.html). The Markov chain
Monte Carlo analysis was run for 10,000 generations with a burn-in of
1,000 iterations. Calibration times for divergence were obtained from
the TimeTree database (Hedges et al., 2006) (http://www.timetree.org/).
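To illustrate what a burn-in does in a Markov chain Monte Carlo analysis, the sketch below discards the first samples of a trace for one node age and averages the rest. It is a generic illustration, not MCMCTree's own code: the function name and the toy trace are hypothetical, and it assumes the trace still contains the burn-in samples.

```python
from statistics import mean


def posterior_mean_after_burnin(samples, burn_in):
    """Discard the first `burn_in` samples and return the mean of the rest."""
    if burn_in < 0 or burn_in >= len(samples):
        raise ValueError("burn_in must be non-negative and shorter than the trace")
    kept = samples[burn_in:]  # samples drawn after the chain has settled
    return mean(kept)


# Hypothetical trace of sampled ages for one node (arbitrary time units).
# The first two values stand in for the unconverged start of the chain.
trace = [50.0, 40.0, 31.0, 29.0, 30.0, 30.0]
print(posterior_mean_after_burnin(trace, burn_in=2))
```

Early samples reflect the arbitrary starting state of the chain rather than the posterior distribution, which is why they are excluded before summarizing.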