Posts Tagged ‘amplification’

Amplification Levels & Copy Number from Solexa

Wednesday, July 15th, 2009

We can calculate levels of amplification as well as plasmid ploidy in a straightforward fashion from Solexa data. Consider our analysis of TT25790, which contains Elisabeth’s array EK568, fully characterized at both join points.

When we look at the “read density” map for reads crossing plasmid F’128(FC40) in this strain, we see that the frequency of reads increases abruptly at reference position ~132098 and returns to baseline abruptly at ~158297. If we go to the raw read data, we find 504973 reads in this interval of 26200 bp, for an average read density of 504973/26200 = 19.3 reads/nucleotide.

We can calculate in a similar manner the read density for the remaining, unamplified region of the plasmid, a circle of size 231427 bp. The total reads for the entire plasmid are 1569700, and for the unamplified region, 1569700-504973 = 1064727. Thus, the average unamplified read density will be 1064727/(231427-26200) = 5.2 reads/nucleotide.

Thus, the EK568 array is amplified with respect to the remainder of the plasmid by a factor of 19.3/5.2 = 3.7. This seems unusually low, given that the samples were grown in minimal lactose medium. Even though the strain is rec⁻, we expect this number to differ from preparation to preparation, as rec-independent recombination by mechanisms such as annealing, snap-back extension, and strand switching appears to be fairly frequent in F plasmid derivatives, perhaps as a consequence of continuous rolling-circle generation of long single-stranded DNA ends.
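For anyone who wants to repeat the arithmetic, here is a minimal Python sketch of the calculation so far. The read counts and interval sizes are the ones quoted above; the variable and function names are my own and are only illustrative.

```python
# Read counts and interval sizes quoted in the post.
PLASMID_BP = 231_427       # size of the F'128 circle
PLASMID_READS = 1_569_700  # reads over the entire plasmid
ARRAY_BP = 26_200          # amplified interval (~132098 to ~158297)
ARRAY_READS = 504_973      # reads falling inside that interval


def read_density(reads, length_bp):
    """Average number of reads per nucleotide over an interval."""
    return reads / length_bp


array_density = read_density(ARRAY_READS, ARRAY_BP)
# The unamplified region is the plasmid minus the amplified interval.
background_density = read_density(PLASMID_READS - ARRAY_READS,
                                  PLASMID_BP - ARRAY_BP)
amplification = array_density / background_density

print(f"array:               {array_density:.1f} reads/nt")
print(f"unamplified plasmid: {background_density:.1f} reads/nt")
print(f"amplification:       {amplification:.1f}x")
```

The sketch keeps full precision until the final print, so the last digit could in principle differ from a hand calculation that rounds each density first; here both routes give 3.7.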

Returning to the raw data, we can ask what the read density across the 4068 bp lacIZ fusion gene itself is. We find 88003 reads across the gene, for a density of 88003/4068 = 21.6, which yields an amplification level of 21.6/5.2 = 4.2, still lower than expected.

Now, let’s look at the copy number of the plasmid itself. The chromosome contains 4857432 bp, across which we gathered 15716836 reads for an average density of 15716836/4857432 = 3.2 reads/nucleotide. We know that the unamplified region of the plasmid has a density of 5.2. Therefore, the ploidy of this plasmid with respect to the chromosome is 5.2/3.2 = 1.6, a trifle smaller than our working estimate of 2. Bear in mind, though, that this sample came from an overnight culture in stationary state. Under these conditions, we expect the copy number of F to be at its lowest.
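The ploidy estimate can be sketched the same way. The numbers are those given above; the names are mine, and the plasmid density is deliberately taken from the unamplified region only, as in the text.

```python
# Counts quoted in the post.
CHROMOSOME_BP = 4_857_432
CHROMOSOME_READS = 15_716_836
PLASMID_BP = 231_427
PLASMID_READS = 1_569_700
ARRAY_BP = 26_200
ARRAY_READS = 504_973

chromosome_density = CHROMOSOME_READS / CHROMOSOME_BP
# Use only the unamplified part of the plasmid so the array does not inflate the estimate.
plasmid_density = (PLASMID_READS - ARRAY_READS) / (PLASMID_BP - ARRAY_BP)
ploidy = plasmid_density / chromosome_density

print(f"chromosome: {chromosome_density:.1f} reads/nt")
print(f"plasmid copies per chromosome: {ploidy:.1f}")
```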

We can use the read densities of the lacIZ gene and the chromosome to determine the copy number of the fusion per chromosome, equal to 21.6/3.2 = 6.8. If the activity of the mutant gene is 2% of wild-type and we assume strict additivity of gene expression, we calculate about 2×6.8 = 13.6% final wild-type activity.
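A short sketch of this last step, again with names of my own choosing. The 2% figure and the assumption of strict additivity are the ones stated above, not something the code establishes.

```python
# Counts quoted in the post.
LACIZ_BP = 4_068
LACIZ_READS = 88_003
CHROMOSOME_BP = 4_857_432
CHROMOSOME_READS = 15_716_836
MUTANT_ACTIVITY_PERCENT = 2.0  # activity of one mutant copy, % of wild-type

laciz_density = LACIZ_READS / LACIZ_BP
chromosome_density = CHROMOSOME_READS / CHROMOSOME_BP
copies_per_chromosome = laciz_density / chromosome_density
# Strict additivity: total activity scales linearly with copy number.
total_activity_percent = MUTANT_ACTIVITY_PERCENT * copies_per_chromosome

print(f"lacIZ copies per chromosome: {copies_per_chromosome:.1f}")
print(f"expected activity: {total_activity_percent:.1f}% of wild-type")
```

Carrying the unrounded densities through gives about 6.7 copies and about 13.4% activity, slightly below the figures obtained from the rounded densities. The difference is well within the noise we expect between preparations.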

This may be enough to allow significant growth, but why wasn’t a greater growth rate selected by simple continued amplification? The answer may simply be that rec-independent amplification is slow compared to the relatively brief time to grow to stationary state with an already appreciable amount of lac activity.

Finally, why is the amplification level of lacIZ greater than the average amplification level of the array itself? The answer lies in the word “average”. Remember that this is a TID array in which the elements can be of different sizes. If many of the elements containing lacIZ are smaller than average, the actual level of the gene would be correspondingly greater, as observed.
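To make the point about averages concrete, here is a toy sketch. Every coordinate in it is hypothetical and chosen only for illustration; none of it comes from the TT25790 data. Each tuple stands for one copy of a repeat element, and the function averages the number of copies covering each base of an interval.

```python
# Hypothetical repeat elements as (start, end) coordinates; NOT taken from the data.
elements = [(0, 26_000), (0, 26_000), (8_000, 20_000), (10_000, 16_000)]
gene = (11_000, 15_000)   # hypothetical position of the gene of interest
array_span = (0, 26_000)  # hypothetical full extent of the array


def mean_copies(interval, elements):
    """Average number of element copies covering each base of the interval."""
    start, end = interval
    covered = 0
    for e_start, e_end in elements:
        overlap = min(end, e_end) - max(start, e_start)
        covered += max(0, overlap)
    return covered / (end - start)


print(f"average over the array: {mean_copies(array_span, elements):.2f}")
print(f"over the gene:          {mean_copies(gene, elements):.2f}")
```

In this made-up case the two short elements both contain the gene, so the gene is present in all four copies while the array as a whole averages fewer than three.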

Many thanks to Yong Lu for helping me collate this data!

-- Eric Kofoid

First Solexa Data In!

Wednesday, April 1st, 2009

(For those of you who may have forgotten, Solexa sequencing is a rapid, highly automated method of generating millions of short sequences at random across a DNA sample — often, an entire genome).

I have just received the first set of Solexa data from our collaboration with Fritz Roth and his colleagues, Yong Lu & Joe Mellor. The image below shows “read depth” (the number of reads which cross a given point) in the neighborhood of lacZ for strains TT24815 and TT25790. We expect this measurement to increase in proportion to the degree of amplification. Coverage over non-amplified areas of the plasmid and chromosome exceeded 50-fold for both strains.
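For readers who have not met the term, read depth can be computed from a list of aligned reads roughly as follows. This is a generic sketch, not the pipeline our collaborators used; the function name, the (start, length) read format, and the toy reads are all assumptions made for illustration, and the reference is treated as linear rather than circular.

```python
def read_depth(reads, reference_length):
    """Count how many reads cover each 0-based position of a linear reference.

    reads: iterable of (start, length) pairs with 0 <= start < reference_length.
    """
    # Difference array: +1 where a read begins, -1 just past where it ends.
    diff = [0] * (reference_length + 1)
    for start, length in reads:
        end = min(start + length, reference_length)  # clip reads at the end
        diff[start] += 1
        diff[end] -= 1

    depth = []
    running = 0
    for change in diff[:reference_length]:
        running += change
        depth.append(running)
    return depth


# Toy example: four 4-nt reads on a 10-nt reference.
toy_reads = [(0, 4), (2, 4), (3, 4), (7, 4)]
print(read_depth(toy_reads, 10))  # [1, 1, 2, 3, 2, 2, 1, 1, 1, 1]
```

An amplified region shows up as a plateau of elevated depth, which is what the red arrows in the image are meant to bracket.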

Small red arrows show my guess at the amplification endpoints. The TT24815 array stretches from approximately 138256 to 166250 (~28 kb), and the TT25790 array from 131600 to 159300 (~28 kb), where the coordinates refer to our standard F’128 sequence counting clockwise from the first nucleotide of IS3A.

Strain TT25790 contains Elisabeth’s known inverted duplication array (EK568), for which we have sequenced a single join point (134075->134087 recombined into 132108<-132098). Small blue arrows show these two tracts. In our simple models of inverted duplication formation, join 1 forms from their recombination, either directly (“Flying Walendas”) or by asymmetric deletions of a larger toxic structure (“Slytherin”). Furthermore, all Solexa data in the array should begin at the leftmost blue arrow, gratifyingly close to my guessed endpoint. Join point 2 will be defined by a sequence near the right-hand red arrow and its inverted complement at a position yet to be found in the amplified region. I shall go hunting!

Strain TT24815 was chosen for its recalcitrant nature — we were never able to find any join points, but assumed for this reason that it was a likely candidate for an amplified inverted duplication, as crossover sites in these entities are truly difficult to locate and sequence. We were hoping to get two new bits of previously unknown information out of it. Once again, half of each of the join points should be defined by small sequence inversions in the neighborhood of the red arrows, assuming that this is a truly simple array of elements representing one kind of inverted duplication. More hunting!

If you use your browser’s zoom feature, you can inspect the image with better resolution. You can also download a detailed PDF file.


-- Eric Kofoid

Why is natural selection hard to beat and when do you need to beat it?

Monday, March 9th, 2009

[This is a stub entry I'm making for John under his name. He should re-edit it with his own words. -- Eric]

Here’s a brief review I just wrote with Dan.

Why is natural selection hard to beat and when do you need to beat it?
John R. Roth and Dan I. Andersson

Bacterial genetics defeats natural selection: it uses positive selection to detect large-phenotype mutants without influencing their frequency. Metazoans maintain organism integrity by defeating natural selection on somatic cell growth. Bacterial genetics relies on selection strong enough to prevent growth of both the parent and common slightly-improved mutants. When selective stringency is reduced, frequent small-effect mutations allow growth and initiate a cascade of successive improvements. This rapid response rests on the unexpectedly high formation rate of small-effect mutations (particularly duplications and amplifications). Duplications form at a rate 10^4 times that of null mutations. The high frequency of small-effect mutations reflects features of replication, repair and coding that minimize the costs of mutation.
The striking effect of small-effect mutations is seen in a system designed by John Cairns to test the effect of growth limitation on mutation rate. A leaky E. coli mutant (lac) is plated on lactose medium. Revertant (Lac+) colonies appear over 6 days above a lawn of 10^8 non-growing parent cells. These colonies have been attributed to stress-induced mutagenesis of the non-growing parent. This conclusion ignores natural selection, assuming that only large-effect mutants appear, as is true for lab genetic selections. However, selection is not stringent in the Cairns system: small increases in lac enzymes allow growth. Common cells with a lac duplication (and 2x the mutant enzyme level) initiate slow-growing colonies, in which selection drives a multi-step adaptation process: higher amplification, reversion to lac+ and loss of mutant lac alleles. The high yield of revertant colonies under selection does not reflect mutagenesis, but rather the high spontaneous rate of gene duplication (10^-5), amplification (10^-2/step) and the selective addition of mutation targets (more cells with more mutant lac copies/cell).
Metazoan somatic cells may escape natural selection by the same mechanism. Metazoans reduce the basal level of unexpressed genes 1000-fold (compared to bacteria) by epigenetic modification of DNA and histones, making it impossible for small-effect mutations to provide growth.

-- John Roth

The origin of mutants under selection: Interactions of mutation, growth and selection

Monday, March 9th, 2009

[This is a stub entry I'm making for John under his name. He should re-edit it with his own words. -- Eric]

Here’s the abstract to a new article.

The origin of mutants under selection: Interactions of mutation, growth and selection

Dan I Andersson, Diarmaid Hughes and John R Roth

In microbial genetics, positive selection detects rare cells with an altered growth phenotype (mutants or recombinants). The frequency of mutants signals the rate of mutant formation; an increased frequency suggests a higher mutation rate. Increases in mutant frequency are never attributed to growth under selection. The converse is true in natural populations, where changes in phenotype frequency reflect selection, genetic drift or founder effects, but never changes in mutation rate. The apparent conflict is resolved because restrictive rules allow laboratory selection to detect mutants without influencing their frequency. With these rules, mutant frequency can reliably reflect mutation rates. When the rules are not followed, selection rather than mutation rate dictates mutant frequency, as in natural populations. In several laboratory genetic systems, non-growing stressed populations show an increase in mutant frequency that has been attributed to stress-induced mutagenesis (adaptive mutation). Since the mutant frequency is used to infer mutation rate (standard lab practice), the rules must be obeyed. A breakdown of the rules in these systems may have allowed selection to cause frequency increases that were attributed to mutagenesis. These systems have sparked interest in interactions between mutation and selection. This has led to a better understanding of how mutants arise, and how very frequent, small-effect mutations, such as duplications and amplifications, can contribute to mutant appearance by increasing gene dosage and mutational target size.

-- John Roth

Amplification & Adaptive Mutagenesis

Thursday, March 5th, 2009

[This is just a teaser to get us started -- add to it, change it, throw it away, but please leave something worthwhile behind!]

John Cairns once observed that apparently non-growing populations of bacteria spontaneously acquire mutations which enable them to grow on a previously unutilizable carbon substrate.

An early explanation bordered on the metaphysical, invoking an awareness by the non-growing cell of the tantalizing substrate, lactose, and a consequent mutational targeting of a specific gene, lacZ, which, when appropriately modified, would allow growth on this compound.

Many requirements and predictions of this original model were quickly shown to be wrong. DNA replication occurred in the “quiescent” cells, which were also growing, albeit at a slow rate. Mutations were not confined to the “targeted” gene alone. Adaptation to growth on lactose would not occur if the gene were on the chromosome; the observed reversion to lactose utilization was only seen when lacZ was on a specialized F plasmid. Additionally, the effect was found only if this plasmid expressed a suite of functions which enabled plasmid replication by rolling-circle synthesis of single-stranded DNA and resulting transfer of this DNA to recipient cells.

Cairns’s descendants have developed two basic models from the original, although a number of others have been left lying in the dust over the years. Both assume that an evolved mechanism senses stress (i.e., starvation) and directs an increase in mutagenesis. Pat Foster’s model asserts that stress induces rpoS, which in turn makes recombination mutagenic. Susan Rosenberg’s model maintains that a general hyper-mutagenic state is evoked which is independent of rec functions.

We point out that a third model exists which makes no assumption of any evolved stress-sensitive mutagenic mechanism, but instead relies on the bag of genetic tricks described and well verified over the last century. We note that Cairns’s cells are growing slowly, and are replicating their DNA. Duplications in DNA are relatively common and can be amplified during replication. A defective gene which nevertheless sustains slow growth allows an increase in the basal growth rate when duplicated. Selection for faster growth will favor cells containing higher order amplifications of the defective gene. Such cells will sweep the population. The opportunity for true reversion will be roughly the number of such cells times the average amplification factor times the rate of reverting a single gene per generation. Because the amplification factor is under selection and expected to grow with the number of generations, the probability of true reversion to lac+ will increase substantially over the course of the experiment, accounting for all colonies observed.
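The product described above is easy to sketch. The cell number and per-copy reversion rate below are hypothetical placeholders, not measurements; the only point is that the opportunity for reversion scales linearly with the amplification factor.

```python
def reversion_opportunity(cells, amplification, rate_per_copy):
    """Expected true reversions per generation:
    cells x gene copies per cell x per-copy reversion rate."""
    return cells * amplification * rate_per_copy


# All numbers below are hypothetical placeholders, chosen only to show the scaling.
CELLS = 1e5
RATE_PER_COPY = 1e-8

for amplification in (1, 2, 10, 50):
    expected = reversion_opportunity(CELLS, amplification, RATE_PER_COPY)
    print(f"{amplification:>3} copies/cell -> {expected:.1e} reversions per generation")
```

Since selection increases both the number of cells carrying the array and the number of copies per cell, both factors grow together over the course of the experiment.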

We like this hypothesis because it involves no new technology, no magic, and no religion. We have a substantial amount of data supporting it. The requirement that the defective locus be on an F derivative is easily understood by two facts: One, the relatively small region of interest is flanked by exact copies of the insertion element, IS3, which allows easy initial duplication of the region. Two, F is constantly replicating itself by a rolling-circle mechanism generating long single strands. These can induce rearrangements by recombination or annealing. Under selection, this can lead to rapid increase in the degree of amplification, and promote remodeling leading to diminished size of the amplified element, thus minimizing the cost of ancillary gene dosage effects.

-- Eric Kofoid