2.3 Identification of one-to-one orthologous gene sets
We predicted the coding sequence (CDS) of each unigene from its
alignments to the Nr and Swiss-Prot databases, and extracted the longest
open reading frame (ORF) of the longest transcript per gene. ESTScan
3.0.3 (Iseli et al., 1999) was also used to determine the direction of
sequences that had no alignment results. The CDSs extracted from these
unigenes were translated into amino acid sequences with the standard
codon table. We used OrthoMCL
v2.0.3 (Li, 2003) (E-value cutoff of 1e-3) to
identify orthologous genes with the Markov Cluster algorithm (MCL)
(Enright et al., 2002 JEB). The longest protein sequence per gene was
used to define the one-to-one orthologous genes among the eight species
in this study, and these orthologs were used in downstream analyses.
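
The ORF extraction and translation step described above can be sketched as follows. This is only an illustration, not the pipeline used in this study: it assumes an ORF runs from an ATG to the first in-frame stop codon, searches all six frames, and ignores ORFs that lack a stop codon. The function names (longest_orf, translate) are hypothetical; the codon table is the standard genetic code.

```python
# Minimal sketch: longest ORF (ATG..stop, six frames) and translation
# with the standard codon table. The ORF definition is an assumption.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq):
    """Return the longest ATG..stop ORF over both strands, or ''."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None
    return best


def translate(cds):
    """Translate a CDS; unknown codons become 'X', final stop is dropped."""
    protein = "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )
    return protein.rstrip("*")


if __name__ == "__main__":
    orf = longest_orf("CCATGAAATTTGGGTAACC")
    print(orf, translate(orf))
```

Because both strands are scanned, the same ORF is recovered whether the input is given in sense or antisense orientation.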
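
One way to select one-to-one orthologous genes from the clustering output is sketched below. It assumes a groups file in which each line reads "group_id: species|gene species|gene ..." and keeps only groups with exactly one gene from every species. Both the input layout and this strict single-copy criterion are assumptions made for illustration, and the species codes and gene names are hypothetical.

```python
# Minimal sketch: keep ortholog groups with exactly one gene per species.
# Input layout and the single-copy criterion are assumptions.
from collections import Counter


def parse_groups(lines):
    """Parse 'group_id: sp|gene sp|gene ...' lines into a dict."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, members = line.split(":", 1)
        groups[group_id.strip()] = [
            tuple(member.split("|", 1)) for member in members.split()
        ]
    return groups


def one_to_one_groups(groups, species):
    """Return groups that contain exactly one gene from each species."""
    expected = Counter({sp: 1 for sp in species})
    kept = {}
    for group_id, members in groups.items():
        counts = Counter(sp for sp, _ in members)
        if counts == expected:
            kept[group_id] = {sp: gene for sp, gene in members}
    return kept


if __name__ == "__main__":
    # Three hypothetical species for brevity; the study used eight.
    example = [
        "OG1: spA|g1 spB|g7 spC|g3",
        "OG2: spA|g2 spA|g5 spB|g8 spC|g4",  # paralogs in spA: dropped
        "OG3: spA|g6 spB|g9",                # spC missing: dropped
    ]
    print(one_to_one_groups(parse_groups(example), ["spA", "spB", "spC"]))
```

Groups with paralogs or with a missing species are discarded, so each retained group contributes one sequence per species to downstream analyses.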