Layman's introduction to CRISPR, 2021 August 10

HC Lee

2021 August 10

For their discovery of the "CRISPR/Cas9 genetic scissors", Emmanuelle Charpentier and Jennifer A. Doudna [1] were awarded the 2020 Nobel Prize in Chemistry. Here is what I think are the key points, in layman's terms. First, some basic vocabulary.


- A DNA sequence, or sequence for short, is a linear string of nucleotides, of which there are four types, identified by their bases, A, C, G, T. Hence a DNA sequence is usually identified by a linear string of bases. For instance, TATA is a sequence of length 4, and GCAGTTAGAA is a sequence of length 10.

- Complement. The bases A and T are mutually complementary, as are C and G. Given two sequences S1 and S2 of the same length, if the two bases at the nth position of the two sequences are complementary for every position n, then the two sequences are said to be complements of each other. The two sequences AAGCT and TTCGA are complementary, as are TTCAGCAGG and AAGTCGTCC.

- Double helix. DNA is double-stranded; the bases at the respective nth sites of the two strands are "paired", that is, they are complements and chemically bonded. One can picture double-stranded DNA as a ladder, with the side rails being the two strands, the points where a rung meets the rails representing the two bases, and the rung symbolizing the chemical bond between the bases. The ladder is not straight, but is twisted with a fixed twisting angle, which is why Watson and Crick, who discovered this structure [2], called it a Double Helix.

- A palindromic sequence is a sequence that is complementary with its own reverse sequence (or simply, reverse). For instance, TCCGCGGA is palindromic, because its reverse is AGGCGCCT, which is the complement of the first sequence.
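The definitions above are simple enough to check by computer. Here is a minimal Python sketch (the function names are my own choice, not standard terminology) that computes a complement and a reverse, and tests whether a sequence is palindromic in the sense just defined.

```python
# Pairing rules from the text: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq):
    """Return the position-by-position complement of a sequence."""
    return "".join(PAIR[base] for base in seq)

def reverse(seq):
    """Return the sequence read backwards."""
    return seq[::-1]

def is_palindromic(seq):
    """True if the sequence is complementary with its own reverse."""
    return complement(seq) == reverse(seq)

print(complement("AAGCT"))           # TTCGA
print(reverse("TCCGCGGA"))           # AGGCGCCT
print(is_palindromic("TCCGCGGA"))    # True
print(is_palindromic("GCAGTTAGAA"))  # False
```

Note that "palindromic" here is not the everyday meaning (reads the same backwards); the complement step is essential.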

In the following I will explain, one at a time, three terms that may be unfamiliar to non-experts: genetic scissors, CRISPR, and Cas9.

Genetic scissors

Genetic scissors are molecular systems capable of cutting DNA sequences.


CRISPR, or clustered regularly interspaced short palindromic repeats, is a (single-strand) DNA sequence with a specific structure: the same sequence, about 30 bases long, is repeated a number of times, and the copies are separated by different sequences of various lengths, called spacers. The repeated sequence is approximately palindromic. The spacers are typically DNA sequences from a foreign species. In bacteria, where CRISPR was first discovered, the spacers are DNA sequences unique to viruses harmful to the host bacteria (Figure 1).
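The repeat-spacer layout can be mimicked with a toy Python sketch. Everything in it is made up for illustration: the repeat and spacers are hypothetical and much shorter than real ones, and the repeat copies are exact, whereas real repeats are only approximately palindromic and need not be perfectly identical.

```python
def build_array(repeat, spacers):
    """Lay out repeat, spacer, repeat, spacer, ..., repeat as one string."""
    parts = [repeat]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(repeat)
    return "".join(parts)

def extract_spacers(array, repeat):
    """Recover the spacers by cutting the array at every exact copy of the repeat."""
    return [piece for piece in array.split(repeat) if piece]

# Hypothetical sequences, chosen only for illustration.
REPEAT = "GTTCCGCGGAAC"  # palindromic in the sense defined earlier
SPACERS = ["ATGCATTACGGA", "CCATTGACA", "TTAGGCATCAGGTA"]  # various lengths

array = build_array(REPEAT, SPACERS)
print(array)
print(extract_spacers(array, REPEAT))
```

Splitting on an exact repeat works only for this idealized case; finding CRISPR in real genomes requires tolerating small differences between repeat copies.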

Figure 1. CRISPR - clustered regularly interspaced short palindromic repeats


Cas9, or CRISPR-associated protein 9, is a protein that, combined with a CRISPR, forms a genetic cutting tool capable of cutting a DNA sequence at precise locations determined by the spacers in the CRISPR.

Figure 2. A CRISPR-Cas system


There are now Cas proteins other than Cas9 that form CRISPR systems functioning differently from CRISPR/Cas9 (often written "CRISPR-Cas9 system" in the literature) (Figure 2; it's all right if you don't understand everything in the figure). I think (though I am not absolutely sure) that Charpentier and Doudna won the prize because they were the first to discover a good CRISPR-Cas system and demonstrate its usefulness.

To sum up, the discovery of the CRISPR-Cas9 system led (with contributions from many other pioneers) to the development of powerful genome editing systems now widely applied in biological and medical R&D.

Read on if you have time to spare.

Some background. CRISPR-Cas systems are part of the natural bacterial anti-viral immune system. Bacteria simply use them to kill invading viruses by cutting the viral genomes. After humans understood what these systems were for (it took them a few years), they copied the bacterial systems and modified them with novel Cas proteins and spacers for new purposes.

Some Q & A's

Q. What is new about CRISPR-Cas systems as DNA cutting tools?

A. Restriction enzymes (a type of protein), which occur naturally in bacteria and archaea, have long been known to cut DNA. However, each restriction enzyme cuts only at a site (or sites) specific to that enzyme. In contrast, by selecting the right spacer, a CRISPR-Cas system can be made to cut at a precise, user-selected site.
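To make the contrast concrete, here is a small Python sketch. The restriction enzyme EcoRI recognizes the fixed sequence GAATTC; the DNA string and the "user-chosen" site below are made up, and treating a cutter as a plain text search is of course a gross simplification of the chemistry.

```python
def find_sites(dna, site):
    """Return every 0-based start position where site occurs in dna."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions

ECORI_SITE = "GAATTC"  # recognition sequence of the restriction enzyme EcoRI
dna = "TTGAATTCAGGCTAGCTAGGAATTCCATGCCGTA"  # made-up example sequence

# A restriction enzyme: the site is built into the enzyme, so the
# cutting positions are dictated entirely by the DNA.
print(find_sites(dna, ECORI_SITE))    # [2, 19]

# A programmable cutter: the user supplies the site to look for.
print(find_sites(dna, "GGCTAGCTAG"))  # [9]
```

With the fixed site, the user gets whatever positions happen to contain GAATTC; with a programmable system, the site becomes an input.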

Q. So we can CUT a DNA precisely at the site we want to, then what? That's not exactly genome editing.

A. Scientists have long possessed sophisticated techniques to INSERT a sequence (or "deliver genetic material") into the DNA of humans (or other species), as in gene therapy and genetic engineering. A well-used method exploits the ability of a virus to invade the genome of a human (or other host): the sequence is implanted into the DNA of a virus, and the host is then infected with the virus. In such cases the user has no control over the site of insertion in the host. A CRISPR-Cas system now gives the user precise control over the site of insertion. Combining CRISPR-Cas systems with the already existing sequence insertion techniques, scientists now have a complete set of tools that allows them to edit DNA precisely.

Q. Are the double strandedness of DNA and the fact that DNA is built of two pairs of complementary bases important?

A. These properties are key to DNA's success as the fundamental information store of life. If DNA were single-stranded, accurate and rapid duplication would involve a far more complex and difficult protocol than what we have now, and the error rate of about one in a billion that we now have might be impossible to achieve.

Q. In CRISPR-Cas systems, what are the spacers for?

A. The spacers are what give the system its high precision in the cutting site. Because of base complementarity, a sequence (call it S) binds to a sequence (call it T) that is the reverse complement of S. Taking T to be a stretch of the target DNA, the user can precisely control the site where the CRISPR-Cas system will act by judiciously selecting the spacer (the sequence S) in the system. If the user wants the system to act at one and only one site, then the target T matched by the spacer must occur only once in the DNA. I suppose this requirement places some limit on the utility of a CRISPR-Cas system.
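The matching rule and the uniqueness requirement can be sketched in a few lines of Python. The strand and the two spacers are made up and far shorter than real ones, and the sketch only does exact text matching on one strand; it says nothing about how a real Cas protein recognizes its target.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Complement every base, then read the result backwards."""
    return "".join(PAIR[base] for base in reversed(seq))

def target_positions(spacer, strand):
    """Start positions on strand where the reverse complement of spacer occurs."""
    target = reverse_complement(spacer)
    k = len(target)
    return [i for i in range(len(strand) - k + 1) if strand[i:i + k] == target]

def acts_at_single_site(spacer, strand):
    """True if the spacer's target occurs exactly once on the strand."""
    return len(target_positions(spacer, strand)) == 1

strand = "ACGTTGCAAGGCTTACGGATCCAAGGCTTA"  # made-up target DNA

short_spacer = "TAAGCCT"     # its target AGGCTTA occurs twice
long_spacer = "CCGTAAGCCT"   # its target AGGCTTACGG occurs once

print(target_positions(short_spacer, strand))    # [8, 23]
print(target_positions(long_spacer, strand))     # [8]
print(acts_at_single_site(long_spacer, strand))  # True
```

The example also hints at why longer spacers help: the longer the target, the less likely it is to occur a second time by chance.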

Q. How does the spacer (S) find its target (T)?

A. Human DNA has three billion nucleotides, spread more or less randomly in a cell that is filled with millions of other molecules. For a single spacer in a CRISPR-Cas system, finding its target in a cell is far more difficult than finding a needle in a haystack. In practice, what is sent into the cell is a solution with a known density of CRISPR-Cas systems. If one million systems enter the cell, then the chance that at least one of them finds its target is increased roughly a million-fold. Experiments need to be done to determine the optimum dosage (of the system-carrying solution) for the task at hand. The same principle is used in much molecular-based research, as well as in the administration of medicine to patients.
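The "million-fold" figure can be checked with a back-of-the-envelope calculation. Assume, purely for illustration, that each system finds the target independently with the same small probability p; then the chance that at least one of N systems succeeds is 1 - (1 - p)^N, which is close to N times p as long as that product is small. The value of p below is invented, not a measured quantity.

```python
import math

def chance_at_least_one_hit(p_single, n_systems):
    """Return 1 - (1 - p)^N, computed in a numerically stable way."""
    return -math.expm1(n_systems * math.log1p(-p_single))

p = 1e-9  # hypothetical chance that a single system finds the target

for n in (1, 1_000, 1_000_000, 10_000_000_000):
    print(n, chance_at_least_one_hit(p, n))
```

For one million systems the result is about one in a thousand, a million times the single-system chance. With very large numbers the gain levels off, since a probability cannot exceed 1, which is one reason the optimum dosage has to be found by experiment.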


Feng Zhang 張鋒 (b. 1981) of MIT was also a prominent pioneer of CRISPR-Cas technology. His key CRISPR-Cas9 paper was published in January 2013 [3]; Charpentier and Doudna's paper, in August 2012 [1]. Many believe Zhang's contribution to be comparable to that of Charpentier and Doudna. Zhang has been much more prolific than Charpentier and Doudna. He has received many awards and honors, including two prestigious ones shared with Charpentier and Doudna. Perhaps he would have had a better chance at the Nobel Prize had he not been born in China (he emigrated to the US at age 11).


[1] Jinek et al. (2012). "A programmable dual RNA-guided DNA endonuclease in adaptive bacterial immunity". Science. 2012 August 17; 337(6096):816-821.

[2] J. D. Watson and F. H. C. Crick (1953). "Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid". Nature. 1953; 171:737-738.

[3] Cong et al. (2013). "Multiplex Genome Engineering Using CRISPR/Cas Systems". Science. 2013 February 15; 339(6121):819-823.


© HC Lee, August 10, 2021, Taoyuan, Taiwan