Genetic diversity within tp0548
While the tp0488 gene in lagomorph-infecting TPeC/L strains showed no
variable site at the chosen minimum variant frequency of 0.25, the
tp0548 gene in our analysed samples contained two hypervariable regions
(V1 and V2). These regions span positions 589,242-589,287 (V1) and
589,558-589,647 (V2) of the TPeC strain Cuniculi A reference genome
(CP002103.1; Figure 3A) and are characterised by an accumulation of
polymorphic sites, deletions and repeat patterns.
Briefly, V1 is characterised by indels and a sequence composition
dominated by arginine, serine and glycine codons. The V2 region is
longer and contains several types of repeats, which are illustrated in
Figure 3.
Most strains (n = 203/287) carried type I repeats, which code for a
KGGG amino acid motif. The median number of copies of this dominant
type I repeat was three, with a range of one to seven (Figure 3C). In
addition to the 228 strains that showed only one repeat type, 56
strains presented a mosaic of two or three different repeat types, and
three samples had no repeat sequence at all (Figure S4).
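
As an illustration only, the following minimal Python sketch shows one
way in which copies of a tandem amino acid motif such as KGGG could be
counted and summarised as a median and range. It is not the pipeline
used in this study. The function name count_tandem_repeats, the sample
names and the toy peptide sequences are all hypothetical and do not
represent data from the analysed strains.

```python
import re
from statistics import median

MOTIF = "KGGG"  # type I repeat motif named in the text


def count_tandem_repeats(protein: str, motif: str = MOTIF) -> int:
    """Return the copy number of the longest uninterrupted run of `motif`."""
    # Each match is one maximal run of back-to-back motif copies.
    runs = re.findall(f"(?:{re.escape(motif)})+", protein)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical toy translations of a V2-like region (NOT study data).
toy_v2 = {
    "sample_a": "ASKGGGKGGGKGGGTR",
    "sample_b": "ASKGGGTR",
    "sample_c": "ASKGGGKGGGKGGGKGGGKGGGTR",
    "sample_d": "ASTR",  # no repeat at all
}

counts = {name: count_tandem_repeats(seq) for name, seq in toy_v2.items()}

# Summarise only the samples that carry at least one copy of the motif.
with_repeat = [c for c in counts.values() if c > 0]
summary = (median(with_repeat), min(with_repeat), max(with_repeat))
print(counts)
print("median, min, max:", summary)
```

The sketch counts only the longest uninterrupted run per sequence, so a
mosaic of different repeat types would need one call per motif and a
separate rule for interrupted runs.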
********************
Add Figure 3 about here
********************