Selection or drift: The population biology underlying transposon insertion sequencing experiments
Transposon insertion sequencing methods such as Tn-seq revolutionized microbiology by allowing the identification of genomic loci that are critical for viability in a specific environment on a genome-wide scale. While powerful, transposon insertion sequencing suffers from limited reproducibility when different analysis methods are compared. From the perspective of population biology, this may be explained by changes in mutant frequency due to chance (drift) rather than differential fitness (selection).
Here, we develop a mathematical model of the population biology of transposon insertion sequencing experiments, i.e. the changes in size and composition of the transposon-mutagenized population during the experiment. We use this model to investigate mutagenesis, the growth of the mutant library, and its passage through bottlenecks. Specifically, we study how these processes can lead to extinction of individual mutants depending on their fitness and the distribution of fitness effects (DFE) of the entire mutant population.
We find that in typical in vitro experiments few mutants with high fitness go extinct. However, bottlenecks of a size that is common in animal infection models lead to so much random extinction that a large number of viable mutants would be misclassified. While mutants with low fitness are more likely to be lost during the experiment, mutants with intermediate fitness are expected to be much more abundant and can therefore constitute a large proportion of detected hits, i.e., false positives. Thus, incorporating the DFEs of randomly generated mutations in the analysis may improve the reproducibility of transposon insertion sequencing experiments, especially when strong bottlenecks are encountered.
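To make the bottleneck argument concrete, the following minimal sketch computes the chance that a single mutant is lost purely by sampling. It is an illustration under simplifying assumptions, not the model developed in this work: the bottleneck is treated as independent sampling of cells, so a mutant at frequency f is absent from a sample of N_b cells with probability (1 - f)^N_b, and fitness differences are ignored. The function names, the library size of 10^5 equally abundant mutants, and the bottleneck sizes are hypothetical values chosen only for illustration.

```python
import random


def loss_probability(frequency, bottleneck_size):
    """Chance that a mutant at `frequency` has zero copies among
    `bottleneck_size` independently sampled cells (neutral sampling)."""
    return (1.0 - frequency) ** bottleneck_size


def simulate_loss(frequency, bottleneck_size, trials=2000, seed=0):
    """Monte Carlo estimate of the same quantity, for comparison."""
    rng = random.Random(seed)
    lost = 0
    for _ in range(trials):
        # Count how many sampled cells carry the focal mutant.
        copies = sum(1 for _ in range(bottleneck_size) if rng.random() < frequency)
        if copies == 0:
            lost += 1
    return lost / trials


if __name__ == "__main__":
    # Hypothetical library: 10^5 mutants at equal frequency.
    f = 1e-5
    for n_b in (10**7, 10**5, 10**4):  # assumed wide to narrow bottlenecks
        print(f"N_b = {n_b:>8}: P(loss by drift) = {loss_probability(f, n_b):.3g}")
    # Cross-check the formula against simulation on a small case.
    print("formula:", loss_probability(0.01, 100))
    print("simulated:", simulate_loss(0.01, 100))
```

Under these assumptions a wide bottleneck makes random loss negligible, whereas a bottleneck smaller than the library removes most mutants regardless of their fitness. Selection would shift each mutant's frequency before the bottleneck, which this sketch deliberately leaves out.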