STRUCTURE DESIGN & COMPUTATIONAL ANALYSIS

OVERVIEW

We aimed to design a 3-dimensional (3D) nanostructure capable of self-polymerizing to form larger and more complex structures. We approached this by designing a pair of hinge-shaped nanostructures that can bind arm to arm to form a chain of hinges, referred to as an extended hinge structure (EHS). After designing these structures in caDNAno2, simulations were performed to predict their overall 3D shape and stability.

BACKGROUND INFORMATION

DNA Origami

DNA origami forms from one long ‘scaffold’ strand (typically 3000 to 8000 nucleotides), which is folded into a specific shape by hundreds of shorter staple strands (typically 24 to 64 nucleotides). The structure is held in shape by crossover events, in which a staple strand ‘crosses over’ from being hybridized to one region of the scaffold strand to another region that is not adjacent to the first. For instance, if a long DNA strand has regions A, B and C, regions A and C are not directly beside each other, but a staple strand that binds to both can cause the DNA strand to bend, bringing regions A and C close together in 3D space. This ‘fold’ is the basis of DNA origami formation.
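The A, B, C example can be sketched in a few lines of Python. This is a toy illustration only: the sequences are made up, the helper names are hypothetical, and the orientation chosen for the staple is one of several possible ways to bridge two regions.

```python
# Toy illustration: a staple that hybridizes to two non-adjacent scaffold regions.
# The sequences are invented for this example and are not from the actual design.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def bridging_staple(region_a: str, region_c: str) -> str:
    """Build a staple (5'->3') whose two halves pair with regions A and C.

    Read 5'->3', the first half of the staple pairs with region C and the
    second half pairs with region A, so region B is looped out between them.
    """
    return reverse_complement(region_c) + reverse_complement(region_a)


# Scaffold made of region A, a spacer region B, and region C.
region_a, region_b, region_c = "ATGCGTAC", "TTTTTTTTTT", "GGATCCAA"
scaffold = region_a + region_b + region_c

staple = bridging_staple(region_a, region_c)
print("scaffold:", scaffold)
print("staple:  ", staple)
```

Because each half of the staple is complementary to a different scaffold region, binding both halves forces regions A and C together, which is the fold described above.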

DNA Origami Folding

Figure 1: Formation of DNA Origami Structures

DNA origami design is the process of defining the relative coordinates and hybridization scheme of the staple strands on the scaffold strand.

The coordinate of a staple strand can be thought of as the location it occupies, relative to all the other strands, in a 3D space defined by the design software. The hybridization scheme specifies where on the scaffold strand each staple strand binds. Together, these determine the final form of a DNA origami nanostructure, which can be a flat 2-dimensional (2D) shape or a complex 3D machine that performs a variety of functions in a given system. Such nanostructures are considered promising for applications such as drug delivery and biosensing for the detection of specific antigens.

Design Process

The design process is iterative:

    1. Define the coordinates and base-pairing scheme of the staple strands on the scaffold strand
    2. Analyze the design files to find potentially problematic design flaws
    3. Simulate/predict structure formation

Description of Design Software Tools

Several computational tools were used, each described below.

caDNAno2: 

  • Provides a graphical user interface (GUI), an interface in which users interact with the software through visual components such as windows on a screen, for defining the coordinates, connectivity, and binding domains of the staple strands and scaffold strand.

Matlab/Python Scripts:

  • Automated analysis of oligonucleotide connectivity and binding domains to ensure proper structure design. Poorly structured designs can misfold, resulting in an improper structure geometry.
  • Scripts are available in Supplementary Material 1 and Supplementary Material 2. 

MrDNA: 

  • Software that uses GPU-accelerated computation to simulate/predict the shape of the DNA origami nanostructure as it would be seen under cryogenic transmission electron microscopy (cryo-TEM).

CanDo: 

  • Web application that outputs images of the structure simulated at 298 Kelvin. The images are color-coded to show the relative stability of each region.

NUPACK: 

  • Web application that checks oligonucleotide sequences for secondary structure formation and returns their thermodynamic properties, such as free energy and melting temperature.

Designing the Nanohinge Structure:

A hinge is an example of a simple machine that is fundamental to many complex mechanical systems. Here we design a nano-scale hinge, which we refer to as the nanohinge. The nanohinge is composed of two ‘bricks’, or bundles of DNA helices, connected at one end to create the pivot point.

Designing the Locking Mechanism:

To control the movement of the nanohinge, a locking mechanism was introduced to hold the nanohinge at a desired angle. Our current implementation of the locking mechanism is binary, meaning it can keep the nanohinge either closed or open. The system is also dynamic: we can control whether the hinge is locked in the open or the closed conformation. This is implemented using toehold-mediated strand displacement (TMSD), where the introduction of a particular oligonucleotide sequence causes the nanohinge to lock in the open or the closed conformation.

Figure 2: Locking mechanism implementation in caDNAno2. The highlighted blue and purple lines represent the padlock strand, which is responsible for locking the hinge in the closed or open conformation. The other staple strands are excluded from this image.

TMSD starts from a DNA double helix formed between a long DNA strand and a shorter DNA strand. Because of the difference in length, a region of the long strand remains single stranded (the toehold). A third ‘invading’ strand, which can form more base pairs with the long strand than the shorter strand can, binds to this region and displaces the shorter strand, because a double helix between the invading strand and the long strand is more thermodynamically favorable. The result is a double helix of the invading strand and the long strand, plus a free single-stranded short strand. This is illustrated in the locking mechanism schematic below:

Figure 3: Locking Mechanism Schematic
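The base-pair argument behind TMSD can be sketched as a simple count. The Python below is a minimal illustration under strong assumptions: it only counts perfectly complementary base pairs (a heuristic, not a thermodynamic calculation), the sequences are invented, and the function names are hypothetical rather than taken from our scripts.

```python
# Minimal base-pair-counting sketch of toehold-mediated strand displacement.
# Assumption: a strand either pairs perfectly with a stretch of the long strand
# or does not pair at all; real behavior depends on thermodynamics and kinetics.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def paired_bases(long_strand: str, partner: str) -> int:
    """Base pairs formed if `partner` is fully complementary to part of `long_strand`."""
    return len(partner) if reverse_complement(partner) in long_strand else 0


def toehold_length(long_strand: str, incumbent: str) -> int:
    """Number of bases on the long strand left single stranded by the incumbent."""
    return len(long_strand) - paired_bases(long_strand, incumbent)


def invader_displaces(long_strand: str, incumbent: str, invader: str) -> bool:
    """True if the invader can form more base pairs than the incumbent."""
    return paired_bases(long_strand, invader) > paired_bases(long_strand, incumbent)


long_strand = "ACGTTGCAAGGCTTAC"                    # invented 16-nt strand
incumbent = reverse_complement(long_strand[:10])    # short strand, pairs with 10 nt
invader = reverse_complement(long_strand)           # pairs with all 16 nt

print("toehold length:", toehold_length(long_strand, incumbent))
print("invader displaces incumbent:", invader_displaces(long_strand, incumbent, invader))
```

Swapping the roles (using the short strand as the invader) returns False, which reflects why each displacement step in this simplified picture is driven by a net gain in base pairs.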

Using TMSD makes the opening and closing processes cyclic, which allows us to ‘reuse’ the system multiple times. This is particularly useful in applications such as diagnostic assays, where a reusable assay could replace single-use assays such as quantitative PCR (qPCR) or antigen tests, which generate a lot of plastic waste.

METHODS

The first step in designing a DNA origami nanostructure is to decide on its shape. This shape can then be created in the caDNAno2 software (2). Once the shape is created, the user can decide where to place the staple strands, define the connectivity of the scaffold, and so on. caDNAno2 outputs a JavaScript Object Notation (JSON) file, which contains the relative coordinates of the oligonucleotides and their connectivity, and a comma-separated values (CSV) file, which contains the sequences of the staple strands.
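As a rough sketch of how these two files can be read programmatically, the Python below parses small inline stand-ins for the JSON and CSV output. The layout shown (a "vstrands" list whose "scaf" and "stap" entries are [previous helix, previous base, next helix, next base] with -1 for empty, and CSV columns Start, End, Sequence, Length, Color) reflects our understanding of caDNAno2 exports and should be treated as an assumption; the sample data and function names are made up.

```python
import csv
import io
import json

# Tiny stand-ins for caDNAno2 output; real files hold many helices and staples.
SAMPLE_JSON = """
{"name": "toy_design.json",
 "vstrands": [
   {"num": 0, "row": 0, "col": 0,
    "scaf": [[-1, -1, 0, 1], [0, 0, 0, 2], [0, 1, -1, -1]],
    "stap": [[0, 1, -1, -1], [0, 2, 0, 0], [-1, -1, 0, 1]]}
 ]}
"""
SAMPLE_CSV = "Start,End,Sequence,Length,Color\n0[2],0[0],ACG,3,#cc0000\n"

EMPTY = [-1, -1, -1, -1]  # assumed marker for a position with no base


def occupied_bases(vstrand: dict, strand_type: str) -> int:
    """Count positions on one helix that carry a base of the given strand type."""
    return sum(1 for entry in vstrand[strand_type] if entry != EMPTY)


def load_staples(csv_text: str) -> list:
    """Read the staple table and check each stated length against its sequence."""
    staples = list(csv.DictReader(io.StringIO(csv_text)))
    for staple in staples:
        if int(staple["Length"]) != len(staple["Sequence"]):
            raise ValueError(f"Length mismatch for staple starting at {staple['Start']}")
    return staples


design = json.loads(SAMPLE_JSON)
staples = load_staples(SAMPLE_CSV)

for vstrand in design["vstrands"]:
    print("helix", vstrand["num"],
          "scaffold bases:", occupied_bases(vstrand, "scaf"),
          "staple bases:", occupied_bases(vstrand, "stap"))
print("staple sequences:", [s["Sequence"] for s in staples])
```

For real design files, the inline strings would be replaced by the contents of the exported JSON and CSV files.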

To maximize structure formation yield in the lab, further computational analyses were carried out to identify potential vulnerabilities that may cause improper folding or decrease formation yield. We focused primarily on minimizing the two types of vulnerability that we believed could decrease the formation yield the most: kinetic traps and sandwich strands.

Kinetic traps are separate crossover events in adjacent DNA helices that are within 5 base pairs of each other longitudinally. Kinetic traps can produce improperly folded, stable intermediates that must be re-melted before the structure can form properly (1).
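One way to screen for this pattern is sketched below in Python. It is an illustration of the definition above, not the script used by the team: the Crossover record, the rule that two crossovers must share exactly one helix to count as being on adjacent helices, and the inclusive 5-base threshold are all assumptions, and the example crossovers are invented.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Crossover(NamedTuple):
    """A crossover between two helices at a base index along the helix axis."""
    helix_a: int
    helix_b: int
    position: int


def find_kinetic_traps(crossovers: List[Crossover],
                       max_separation: int = 5) -> List[Tuple[Crossover, Crossover]]:
    """Return pairs of crossovers that share one helix and lie close together.

    Assumption: "adjacent helices" is modeled as two crossovers that join
    different helix pairs but have exactly one helix in common.
    """
    traps = []
    for first, second in combinations(crossovers, 2):
        shared = {first.helix_a, first.helix_b} & {second.helix_a, second.helix_b}
        close = abs(first.position - second.position) <= max_separation
        if len(shared) == 1 and close:
            traps.append((first, second))
    return traps


# Invented example: only the first two crossovers are close and share helix 1.
crossovers = [
    Crossover(0, 1, 20),
    Crossover(1, 2, 23),
    Crossover(2, 3, 40),
    Crossover(0, 1, 60),
]

for first, second in find_kinetic_traps(crossovers):
    print("possible kinetic trap:", first, second)
```

Flagged pairs would then be inspected in caDNAno2 and, where possible, one of the crossovers moved further along the helix.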

Sandwich strands are oligonucleotides in which two long binding domains (regions of an oligonucleotide that bind to separate regions of the scaffold strand) flank a shorter binding domain. Sandwich strands can be problematic because the longer binding domains are thermodynamically more favorable to bind and are likely to bind to their respective regions of the scaffold before the shorter binding domain does. Once both flanking domains are bound, the shorter binding domain may be unable to bind, since it can no longer coil around the scaffold.
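A minimal Python sketch of a sandwich-strand check follows. It assumes each staple has already been reduced to a list of binding-domain lengths in 5' to 3' order, and it flags any interior domain that is shorter than both of its neighbors; the staple names, lengths, and function name are hypothetical, and a real check might also apply minimum length thresholds.

```python
from typing import Dict, List


def sandwich_positions(domain_lengths: List[int]) -> List[int]:
    """Return indices of interior domains shorter than both flanking domains."""
    return [
        i
        for i in range(1, len(domain_lengths) - 1)
        if domain_lengths[i - 1] > domain_lengths[i] < domain_lengths[i + 1]
    ]


# Invented staples: binding-domain lengths in nucleotides, 5' to 3'.
staple_domains: Dict[str, List[int]] = {
    "staple_01": [14, 7, 14],   # short domain flanked by two longer ones
    "staple_02": [8, 16, 8],    # long domain in the middle, not a sandwich
    "staple_03": [16, 16],      # only two domains, nothing to flank
}

flagged = {
    name: sandwich_positions(lengths)
    for name, lengths in staple_domains.items()
    if sandwich_positions(lengths)
}
print(flagged)
```

Only staple_01 is flagged here, at domain index 1, which is the short domain sitting between the two 14-nucleotide domains.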

The draft design files were analyzed using Matlab scripts provided by Dr. Lauback and Python scripts written by the team. These scripts automated the examination of the JSON and CSV files, which contain hundreds of oligonucleotide sequences, replacing manual inspection, which is error-prone and time-consuming. Detected vulnerabilities were removed in caDNAno2 where possible.

Lastly, the designs were verified using simulation or structure prediction software such as CanDo (3) or MrDNA (4). This software lets us visualize parameters of the structure from which we can infer its stability, such as the root mean square fluctuation (RMSF). RMSF is shown on the structure as color, where a given color marks regions that fluctuate more in 3D space than other regions of the structure. In other words, regions shown in a particular color, such as blue, are less stable and fluctuate more in space than other regions.

Figure 4A: Side view of H1 structure simulated on MrDNA visualized on VMD. Figure 4B: Side view of H2 structure simulated on MrDNA visualized on VMD. Higher RMSF is represented in blue, and low RMSF is represented in red. White represents intermediate RMSF values.
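For reference, the RMSF of a particle is the square root of its mean squared displacement from its own time-averaged position. The Python sketch below computes it for a toy trajectory; it assumes the frames have already been aligned to remove overall translation and rotation, and the trajectory, data layout, and function name are invented for illustration rather than taken from MrDNA or VMD.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def rmsf(trajectory: List[List[Point]]) -> List[float]:
    """Per-particle RMSF, where trajectory[t][i] is particle i in frame t.

    Assumes frames are pre-aligned so that only internal motion remains.
    """
    n_frames = len(trajectory)
    n_particles = len(trajectory[0])
    result = []
    for i in range(n_particles):
        # Time-averaged position of particle i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement from that average position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Toy trajectory: particle 0 never moves, particle 1 oscillates along x.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsf(toy_trajectory))  # [0.0, 1.0]
```

In the coloring described above, the stationary particle would sit at the low-RMSF end of the scale and the oscillating particle at the high-RMSF end.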

Regions at the ends of the hinge are expected to show higher RMSF because, for the same change in angle, they move a larger distance than regions closer to the hinge pivot point. Regions that showed unexpectedly high RMSF were modified in caDNAno2 to make them more stable. For instance, we could alter the staple strand connectivity in such a region by introducing more crossover events.

Locking Mechanism Design

The sequences of the oligonucleotides were defined in caDNAno2 using the sequence of the p8064 vector. Other ‘special’ strands whose sequences could not be determined using caDNAno2, namely those involved in the locking mechanism, were designed manually to avoid undesirable secondary structure formation. These sequences were simulated, and the melting temperatures of any such secondary structures were determined, using the web application NUPACK (5).
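Before submitting a candidate sequence to NUPACK, a crude string-matching screen can catch obvious hairpins. The Python sketch below is only that: it looks for a stretch whose reverse complement appears further downstream, with a minimum loop in between, and ignores thermodynamics entirely. It is not a substitute for NUPACK and is not the method used for the locking strands; the function name, the 3-nucleotide minimum loop, and the example sequences are assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_hairpin_stem(seq: str, min_loop: int = 3) -> int:
    """Length of the longest perfect stem that could fold back on itself.

    A stem is a stretch whose reverse complement occurs downstream, separated
    from it by at least `min_loop` unpaired bases.
    """
    best = 0
    n = len(seq)
    for start in range(n):
        # Only try stems longer than the best found so far.
        max_length = (n - start - min_loop) // 2
        for length in range(best + 1, max_length + 1):
            stem = seq[start:start + length]
            downstream = seq[start + length + min_loop:]
            if reverse_complement(stem) in downstream:
                best = length
            else:
                break  # a longer stem from this start cannot match either
    return best


# Invented sequences: the first folds into a 7-bp stem with a 4-nt loop.
print(longest_hairpin_stem("ACGTGCCTTTTGGCACGT"))
print(longest_hairpin_stem("AAAAAAAAAA"))
```

A sequence with a long stem would be a candidate for redesign, with NUPACK used to confirm whether the predicted secondary structure is stable at the working temperature.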

RESULTS AND DISCUSSION

Modelling Results: 

Structure predictions were primarily done with MrDNA because of the greater control it gave us over the simulation. For example, we could control the number of steps, or iterations, that the simulation computes, extract quantitative data on stability, and visualize that data interactively in the visualization software VMD (4) as a qualitative check.

The simulation was run with the default number of steps, 10 million. We can extract quantitative data, namely the root mean square deviation (RMSD) values, to check that the simulation has converged and that the predicted structure is a reasonable representation of what will be seen under cryo-TEM. The RMSD shows how far, on average, the particles of the structure have deviated from their original positions. Figure 5 shows the change in RMSD of the H2 structure. The RMSD reaches a maximum value of approximately 45 Å at simulation step 560, then plateaus. This indicates that the structure has stopped changing, and the structure shown from that point on is likely to be what is seen under cryo-TEM imaging.

Figure 5: Root Mean Square Deviation (RMSD) Plot of H2. The simulation has run a sufficient number of steps once the RMSD value (in angstroms) plateaus. For time efficiency, future simulations of the H2 structure can be stopped at roughly 5.6 million steps rather than the default 10 million.
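The plateau check described for Figure 5 can also be automated. The Python sketch below computes the RMSD of a frame against a reference and finds the first frame after which the RMSD stays within a tolerance. The function names, the tolerance, the minimum tail length, and the RMSD series are all illustrative assumptions; the numbers are invented and are not the H2 simulation data.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def rmsd(frame: List[Point], reference: List[Point]) -> float:
    """Root mean square deviation between a frame and a reference structure."""
    squared = sum(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(frame, reference)
    )
    return math.sqrt(squared / len(frame))


def plateau_start(values: List[float], tolerance: float = 1.0,
                  min_tail: int = 3) -> Optional[int]:
    """First index after which all values stay within `tolerance` of each other.

    Returns None if no tail of at least `min_tail` values is that flat.
    """
    for i in range(len(values) - min_tail + 1):
        tail = values[i:]
        if max(tail) - min(tail) <= tolerance:
            return i
    return None


# Invented RMSD series (angstroms), one value per saved frame.
rmsd_series = [0.0, 5.0, 9.0, 11.8, 12.1, 12.0, 12.2, 11.9]
print("plateau begins at frame", plateau_start(rmsd_series))
```

Multiplying the returned frame index by the number of simulation steps between saved frames gives an estimate of how many steps a future run of the same structure needs.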

For the final designs of H1 and H2, qualitative RMSF visualization in VMD shows very little fluctuation in the interior of the hinge components (shown in red) and higher RMSF (in blue) at the ends of the hinge, which is expected.

Video 1: Short clip of the H1 structure simulation visualized using VMD. Displays the predicted structure visible under cryo-TEM.

The ends of the hinge display high RMSF as seen in the clip of the simulation above. This is because the hinge ends are single stranded to avoid random base-stacking interactions with other hinges, and to accommodate the addition of polymerization strands which we use to polymerize hinges together, forming the extended hinge system (EHS), in a controlled manner.

Supplementary Information

To download H1 and H2 design files (with and without locking mechanism), click the link below. 

REFERENCES
  1. Castro, C. E., Kilchherr, F., Kim, D. N., Shiao, E. L., Wauer, T., Wortmann, P., Bathe, M., & Dietz, H. (2011, February 25). A primer to scaffolded DNA origami. Nature Methods, 8(3), 221–229. https://doi.org/10.1038/nmeth.1570
  2. Douglas, S. M., Marblestone, A. H., Teerapittayanon, S., Vazquez, A., Church, G. M., & Shih, W. M. (2009, June 16). Rapid prototyping of 3D DNA-origami shapes with caDNAno. Nucleic Acids Research, 37(15), 5001–5006. https://doi.org/10.1093/nar/gkp436
  3. Kim, D. N., Kilchherr, F., Dietz, H., & Bathe, M. (2011, December 10). Quantitative prediction of 3D solution shape and flexibility of nucleic acid nanostructures. Nucleic Acids Research, 40(7), 2862–2868. https://doi.org/10.1093/nar/gkr1173
  4. Maffeo, C., & Aksimentiev, A. (2020, March 31). MrDNA: a multi-resolution model for predicting the structure and dynamics of DNA systems. Nucleic Acids Research, 48(9), 5135–5146. https://doi.org/10.1093/nar/gkaa200
  5. Zadeh, J. N., Steenberg, C. D., Bois, J. S., Wolfe, B. R., Pierce, M. B., Khan, A. R., Dirks, R. M., & Pierce, N. A. (2010, November 17). NUPACK: Analysis and design of nucleic acid systems. Journal of Computational Chemistry, 32(1), 170–173. https://doi.org/10.1002/jcc.21596