Research

Overview

The Genetics of Complex Traits. Many traits of medical, agricultural and evolutionary interest are “complex traits”. Many such complex traits have a sizable genetic component, although in most cases their exact genetic architecture remains a mystery. We use model systems to understand the genetic basis of variation in complex traits using crosses and natural phenotypic and genotypic variation. In recent years we have been especially interested in leveraging Multiparent Populations or Synthetic Populations for this purpose. We routinely work with genome-scale datasets to obtain insights into complex traits including: DNAseq, Poolseq, RNAseq, ATACseq, and scRNAseq. We have found that low-pass sequencing of individuals or pools from synthetic populations in concert with specialized genotype imputation methods can be incredibly powerful.

The Drosophila Synthetic Population Resource (DSPR)

A Drosophila community resource for the genetic dissection of complex traits. In 2012 we created a novel resource for the genetic dissection of complex traits – the DSPR. The resource consisted of two panels of ~750 Recombinant Inbred Lines (RILs). Each RIL is fully genotyped and ultimately derived from a crossing scheme involving 17 highly inbred and characterized “founders”.

Goal

Measuring the RILs allows for the routine dissection of complex traits. As RILs are fully genotyped, an investigator can obtain the RILs, measure each RIL for a trait of interest, and identify QTL without any genotyping. Furthermore, since the RILs are derived from 17 founders – as opposed to two parents – more natural variation is sampled (and mappable). Finally, because 25 generations of random mating were carried out before the RILs were created, high-resolution mapping of complex traits is possible.

Impact

Investigators have used the DSPR to dissect dozens of complex traits. Over the last decade dozens of groups have used the DSPR to dissect numerous complex traits. The founder strains are highly characterized with detailed information served up as Santa Cruz Genome Browser tracks. Each founder strain is associated with a reference quality genome assembly (in many cases structural variants may represent causative alleles for complex traits). We have further carried out RNAseq and ATACseq on several adult and/or imaginal disc tissues to aid in candidate gene/allele identification.

Current Directions

Many investigators do not wish to receive or measure phenotypes on 1500 RILs. A weakness of the DSPR (and indeed all collections of RILs or inbred lines) is that the power of an experiment is proportional to the number of RILs measured. For many traits and labs, measuring all 1500 RILs is a considerable experimental burden. Recently we have experimented with a novel approach that leverages DSPR resources and the idea of bulked phenotyping and genotyping via an “X-QTL” approach to dissect complex traits. Instead of working with RILs, investigators devise a scheme that allows for the selection of individuals from the extreme phenotypic tail of an outbred synthetic population (e.g., individuals that survive exposure to Malathion). To map QTL, extreme and control DNA pools from the population are next-gen sequenced, and a pipeline developed by our group is used to identify small regions of the genome showing large founder haplotype frequency differences between the pools. For traits amenable to pooled phenotyping, this X-QTL approach is extremely cost and time efficient, and results in powerful, high-resolution mapping of QTL.
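The comparison step can be illustrated with a minimal sketch. It assumes founder haplotype frequencies have already been estimated for each genomic window in each pool. It is not the pipeline mentioned above: the window names, the four-founder setup, the frequencies, and the function names `signed_shift` and `scan` are all invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical data layout: window label -> founder haplotype frequencies.
Freqs = Dict[str, List[float]]


def signed_shift(selected: List[float], control: List[float]) -> List[float]:
    """Per-founder frequency difference (selected pool minus control pool)."""
    if len(selected) != len(control):
        raise ValueError("pools must list the same founders in the same order")
    return [s - c for s, c in zip(selected, control)]


def scan(selected: Freqs, control: Freqs) -> List[Tuple[str, float]]:
    """Rank windows by their largest absolute founder-frequency difference."""
    scores = []
    for window in selected:
        diffs = signed_shift(selected[window], control[window])
        scores.append((window, max(abs(d) for d in diffs)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up frequencies for three windows and four founders.
control = {
    "chr2L:1.0Mb": [0.25, 0.25, 0.25, 0.25],
    "chr2L:1.1Mb": [0.30, 0.20, 0.25, 0.25],
    "chr2L:1.2Mb": [0.25, 0.25, 0.25, 0.25],
}
selected = {
    "chr2L:1.0Mb": [0.27, 0.24, 0.24, 0.25],
    "chr2L:1.1Mb": [0.60, 0.10, 0.15, 0.15],
    "chr2L:1.2Mb": [0.30, 0.22, 0.23, 0.25],
}

ranked = scan(selected, control)
top_window = ranked[0][0]
top_diffs = signed_shift(selected[top_window], control[top_window])
```

In this toy example the signed differences at the top-ranked window show which founder haplotype is enriched in the selected pool (positive values) and which are depleted (negative values). A real analysis would also need a statistical test that accounts for sampling noise and replicate pools, which this sketch omits.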

The DSPR is derived from a diallel cross between founders, followed by 25 generations of random mating, followed by inbreeding to create 1500 RILs.

The X-QTL approach. Pools consisting of 100s of flies from selected or control samples from a synthetic population are pool sequenced. Haplotypes are imputed and regions of the genome showing the greatest haplotype change identified.

An examination of haplotype frequency change centered on QTL identifies “protective” and “sensitive” haplotypes.

Evolve and Resequence of Yeast Synthetic Populations

An 18-way outbred sexual synthetic yeast population. We crossed 18 haploid founder yeast strains and further intercrossed the resulting population to create an outbred, diploid, highly recombined base population. We have evolved that base population, with occasional recombination, in the presence of several stressors as a model of short-term evolutionary change in outbred sexuals.

Goal

A microbial sexual outbred model for Evolve and Resequence experiments has several advantages. A great deal of experimental evolution (with resequencing) has been carried out in E. coli. Despite important discoveries, E. coli lack standing variation and sex, so evolution is driven by new mutations with their fixation probabilities dominated by a phenomenon called clonal interference. We do not think this is the dominant mode of evolutionary change in outbred sexuals such as humans and virtually all multicellular organisms. Historically, D. melanogaster has been the chief model of experimental evolution when standing variation and sex are present, but flies only allow for 20-30 generations of selection per year, and it is difficult to maintain minimum population sizes larger than 1000 individuals. These are real limitations of the fly system. We manipulated S. cerevisiae to create synthetic diploid outbred populations with standing variation via an 18-way cross. This yeast synthetic population has many of the benefits of microbes (easy to handle, large population sizes, glycerol stocks of timepoints, small genomes), while serving as an effective model for evolution in outbred sexuals.

Impact

We have evolved our yeast synthetic populations for several hundred generations in the presence of different stressors. There is still much work to be done. To date we have successfully carried out highly replicated evolve and resequence experiments and have genetically characterized the populations. We confirm and extend many results from Drosophila (especially the observation that much of the genome is evolving either directly or indirectly). We have also obtained novel insights with regard to the starting frequencies of adaptive variants and what evolutionary trajectories look like.

Current Directions

Barcoding is particularly efficient in yeast. We are currently in the process of barcoding hundreds of recombinant clones from our synthetic populations. Tracking barcodes whilst knowing the genome each barcode tags can be a powerful tool for understanding short term evolutionary change in highly replicated experiments.

We created an 18-way synthetic population of yeast via a diallel of well-characterized haploid strains. The resulting population was intercrossed for 12 additional generations.
The sequencing of haploid clones from the 18-way yeast synthetic populations shows incredible haplotype diversity and very short (<100kb) haplotype blocks.

A “Pollock plot” for a single chromosome at four timepoints for the synthetic population evolved to two different stressors (acetic acid & sodium chloride) each replicated two times. Change is pervasive, replicable, and dominated by a single haplotype at any given locus. Haplotypes reach new “equilibrium-like” frequencies very quickly and then remain somewhat static.

The Genetics of Disease Reservoir Competency

Peromyscus leucopus is the primary reservoir for Borrelia burgdorferi – the causative agent of Lyme disease. Lyme disease is the most common vector-borne disease in the United States. It is caused by a bacterium transmitted to humans via a tick bite. Ticks generally acquire the bacteria from small rodent hosts. P. leucopus (the white-footed deer mouse) seems to be a particularly effective host. In parts of the US where Lyme disease is endemic, most such deer mice carry the bacteria. A strange observation is that the deer mice do not seem to get sick and instead tolerate the bacteria. We are leveraging an accidental synthetic population of P. leucopus – a closed colony established in South Carolina in the 1980s – to unravel the genetics of reservoir/pathogen interactions.

Goal

A closed colony of P. leucopus has similar genetic properties to the widely used mouse “Diversity Outbred” population. We can infect large numbers of colony deer mice with B. burgdorferi and measure various phenotypic responses. From low pass sequencing of each mouse we can accurately impute genotypes at ~14 million SNPs per mouse and associate genetic variation with reservoir phenotypes. This approach is made possible via our chromosome-scale assembled genome for this species (the first for the genus).
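The association step can be sketched in its simplest form: a per-SNP linear regression of a quantitative phenotype on imputed allele dosage (0 to 2). This is only an illustration under stated assumptions, not the analysis used for the colony. The SNP names, dosages, phenotype values, and the function name `slope_and_r` are hypothetical, and a real analysis of a closed colony would normally also model relatedness among animals.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def slope_and_r(dosage: Sequence[float], phenotype: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and Pearson correlation of phenotype on dosage."""
    if len(dosage) != len(phenotype):
        raise ValueError("one dosage and one phenotype value per animal")
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    syy = sum((y - mean_y) ** 2 for y in phenotype)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, phenotype))
    if sxx == 0 or syy == 0:
        # A monomorphic SNP or a constant phenotype carries no signal.
        return 0.0, 0.0
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Made-up imputed dosages for six animals at three SNPs.
dosages: Dict[str, Sequence[float]] = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [0, 0, 1, 1, 2, 2],
    "snp_c": [1, 1, 1, 1, 1, 1],  # monomorphic in this toy sample
}
phenotype = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]

results = {snp: slope_and_r(d, phenotype) for snp, d in dosages.items()}
best_snp = max(results, key=lambda snp: abs(results[snp][1]))
```

Imputed dosages need not be integers, so the same function accepts fractional values such as 0.8 or 1.9 when genotype calls are uncertain.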

Impact

Identifying the genes at which natural variation impacts the ability of a reservoir to tolerate a pathogen may suggest ways to combat Lyme disease. Depending on the nature of variation it is conceivable that gene drives in wild populations could create mouse populations unable to effectively transmit the bacteria to ticks. The genes may further suggest druggable targets in humans, which may allow for better treatments in patients where antibiotics are not effective in eliminating the bacteria. Preliminary results suggest that the immune response of P. leucopus to B. burgdorferi differs in many ways from what the same response looks like in mouse models.

Current Directions

We are employing many modern tools to understand reservoir dynamics. We are infecting hundreds of mice, measuring responses, and genotyping each animal. We are also carefully characterizing several candidate genes that have already emerged from the work. We have knockout cell lines and are attempting Cas9 KOs in the mice themselves. We are also experimenting with scRNA-seq to understand infections at a cellular level. In many cases these are the first times these tools have been used in this genus.

Lyme disease is the most common vector-borne disease in the US.

Humans and deer are dead-ends for Borrelia burgdorferi (the causative agent of Lyme disease). The bacteria’s normal life cycle is dominated by ticks feeding on small rodents, of which P. leucopus appears to be the dominant reservoir.

We low-pass sequence deer mice from a closed colony that genetically resembles our fly/yeast synthetic populations and impute genotypes at ~14 million SNPs. We carry out GWAS using these imputed genotypes.