This article is a continuation of Illumina Data Clustering, and is a perfect example of why we, as scientists, should resist the hubris of premature expectations.
The standard Illumina protocol for library preparation requires 18 cycles of PCR after adaptor ligation to enrich for fragments with doubly modified ends. Incomplete products from a previous round can snap back on themselves during this step (“megaprimer snap-back”), creating artifactual templates which then amplify along with the others. This is a first-order (intramolecular) process and should be fairly common whenever the 3′ end of the elongating strand happens to fall on a complementary REP site. A less common artifact could arise by a second-order process, in which an incomplete extension product acts as a megaprimer and reanneals in trans to a complementary REP site on another molecule.
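The difference between the two routes can be put as a rough kinetic sketch. The symbols are my own illustrative assumptions, not measured quantities: $[M]$ is the concentration of incomplete strands ending on a REP site, $[T]$ the concentration of other strands carrying a complementary REP site, and $k_1$, $k_2$ the respective rate constants.

$$
v_{\text{snap-back}} = k_1\,[M], \qquad
v_{\text{trans}} = k_2\,[M]\,[T], \qquad
\frac{v_{\text{trans}}}{v_{\text{snap-back}}} = \frac{k_2}{k_1}\,[T]
$$

Because the ratio scales with $[T]$, the trans route should become rarer as any particular complementary partner becomes more dilute, while the snap-back rate per molecule does not depend on concentration at all.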
I found a group of closely spaced NlaIV restriction sites whose digestion would prevent megaprimer formation by the snap-back route. If the clusters instead arose from preexisting TIDs or from the rare megaprimer extension event in trans, NlaIV digestion would have no effect on cluster amplification.
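For readers who want to look for such sites in their own region of interest, here is a minimal Python sketch. It assumes the NlaIV recognition sequence is GGNNCC with a blunt cut in the middle (GGN^NCC); check this against the supplier's data sheet before relying on it. The function names and the example sequence are hypothetical, invented for illustration only.

```python
import re

# Assumed NlaIV recognition site: GGNNCC (N = any base), cut blunt after
# the third base. The site is its own reverse complement, so scanning one
# strand is enough. A lookahead is used so overlapping sites are all found.
NLAIV_SITE = re.compile(r"(?=GG[ACGT]{2}CC)")


def find_nlaiv_sites(seq):
    """Return 0-based start positions of every GGNNCC site in seq."""
    return [m.start() for m in NLAIV_SITE.finditer(seq.upper())]


def digest_blunt(seq, cut_offset=3):
    """Split seq at each assumed NlaIV cut position (site start + cut_offset)."""
    seq = seq.upper()
    fragments = []
    previous_cut = 0
    for site_start in find_nlaiv_sites(seq):
        cut = site_start + cut_offset
        fragments.append(seq[previous_cut:cut])
        previous_cut = cut
    fragments.append(seq[previous_cut:])
    return fragments


if __name__ == "__main__":
    example = "AAGGATCCTT"  # made-up sequence containing one GGNNCC site
    print(find_nlaiv_sites(example))
    print(digest_blunt(example))
```

A cluster of closely spaced sites shows up as several start positions close together in the list that `find_nlaiv_sites` returns.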
I did the experiment and found that cutting the template DNA with NlaIV prevented amplification. I am forced to conclude that the beautiful clustering of REP-mediated TID joints found in our data is strictly man-made, an artifact of megaprimer snap-back!