Date of Award

December 2015

Degree Type


Embargo Date


Degree Name

Doctor of Philosophy (PhD)




Advisor(s)

Philip N. Borer


Keywords

aptamer, high throughput sequencing, SELEX, thrombin binding aptamer

Subject Categories

Physical Sciences and Mathematics

Abstract


The minimal human alpha-thrombin binding aptamer, d(GGTTGGTGTGGTTGG), was previously identified from a DNA library with a randomized hairpin loop (m=15) using the High Throughput Screening of Aptamers (HTSA) technique, later termed Acyclic Identification of Aptamers (AIA). AIA eliminated the need for the multiple cycles of in vitro evolution typically used for aptamer discovery by combining libraries that over-represent all possible sequences with high throughput sequencing. Although the method succeeded in identifying the thrombin binding aptamer, the partitioning step needed improvement, and the sample preparation inconsistencies encountered during replication attempts had to be resolved, to maximize its value. Subsequent revisions and variations of the protocol improved and streamlined the workflow such that the thrombin binding aptamer (TBA) and multiple variants were identified as high affinity sequences using the ligation-based m=15 DNA hairpin loop library mentioned above. The revised AIA protocol also identified TBA and multiple variants in a nuclease-resistant, modified 2’-OMe RNA/DNA chimera library. Sample throughput was increased by introducing indexed adapters containing sequence “barcodes” that facilitate multiplexing during high throughput sequencing. To eliminate the constraints of the hairpin loop library structure, a library based on direct amplification, the “adapter” library, was designed. The m=15 DNA and m=15 2’-OMe RNA/DNA chimera adapter libraries failed to identify TBA from the library pool, likely because of interference from the flanking adapter regions required for PCR amplification. A secondary PCR product identified during sample work-up was also characterized. These results prompted significant interest in being able to assess accurately, before consuming valuable resources, whether a sample should be sequenced.
To accomplish this, the capture of a minimally flanked library (pACAC-m15-CACA) with full-length adapters and no requirement for amplification was optimized. The library provides greater flexibility in secondary structure formation and was shown to identify TBA from an over-represented library pool. Amplification-free AIA introduced the unique ability to predict relative maximum sequence frequencies from the quantity of recovered library, the initial degree of over-representation, and the anticipated data output. The ability to predict whether a sequence could be counted above background was used to decide whether a sample would be sequenced, ultimately saving time and money. To expand the applicability of the tailed libraries and the amplification-free protocol, a novel partitioning method that eliminated the requirement for protein immobilization was developed. Reversible formaldehyde cross-linking in conjunction with an electrophoretic mobility shift assay was used to identify TBA above background in proof-of-concept experiments using amplification-free sample preparation. The ability to perform AIA partitioning in solution provides greater flexibility in target selection, including mixtures of proteins. The method also effectively reduces the aptamer off-rate to zero by covalently linking high and moderate affinity sequences to the protein target during selection, an advantage over partitioning by protein immobilization, where loss of some DNA from a reversibly bound complex is inevitable. The sample preparation techniques that evolved over the course of this work offer superior control over, and predictability in, the outcome of high throughput sequencing data. This was aided by absolute quantification with qPCR CopyCount™ software, which improved library quantification, distribution of indices, and cluster quality, all of which are crucial for maximizing data output on Illumina sequencing platforms.
Consistent, high-quality data eliminates the potential for costly resequencing. Future experiments will capitalize on the breadth of improvements to the AIA method described in this work.
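
The go/no-go assessment described in the abstract (predicting a maximum sequence frequency from the recovered library quantity, the initial over-representation, and the anticipated data output) can be illustrated with a rough calculation. The sketch below is not the model used in the dissertation. It assumes a fully randomized m-nucleotide region (4^m distinct sequences), assumes that at best every input copy of a binder is recovered, and assumes that reads sample the recovered pool uniformly. All function names, thresholds, and example quantities are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def library_size(m: int) -> int:
    """Distinct sequences in a fully randomized region of m nucleotides."""
    return 4 ** m


def copies_per_sequence(input_pmol: float, m: int) -> float:
    """Average over-representation: input molecules per distinct sequence."""
    molecules = input_pmol * 1e-12 * AVOGADRO
    return molecules / library_size(m)


def max_expected_reads(copies: float, recovered_molecules: float,
                       total_reads: float) -> float:
    """Upper bound on reads for one sequence.

    Assumption: all input copies of the sequence survive partitioning and
    reads are drawn uniformly from the recovered molecules.
    """
    fraction = min(1.0, copies / recovered_molecules)
    return total_reads * fraction


def background_reads(total_reads: float, m: int) -> float:
    """Crude background: mean reads per sequence if reads were spread
    evenly over every possible sequence (an assumption, not a measurement)."""
    return total_reads / library_size(m)


def worth_sequencing(max_reads: float, background: float,
                     min_reads: float = 10.0, min_fold: float = 10.0) -> bool:
    """Hypothetical decision rule: the best case must clear both an
    absolute read count and a fold-over-background threshold."""
    return max_reads >= min_reads and max_reads >= min_fold * background


if __name__ == "__main__":
    m = 15
    copies = copies_per_sequence(input_pmol=1.0, m=m)  # hypothetical input
    best_case = max_expected_reads(copies, recovered_molecules=1e8,
                                   total_reads=1e7)    # hypothetical values
    background = background_reads(total_reads=1e7, m=m)
    print(f"copies per sequence: {copies:.0f}")
    print(f"best-case reads for one sequence: {best_case:.1f}")
    print(f"sequence this sample? {worth_sequencing(best_case, background)}")
```

In this simplified picture, a sample is a poor candidate for sequencing when so much library is recovered, relative to the input copies of any one sequence, that even a perfectly retained binder would be expected to yield only a handful of reads.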


Open Access