Weed and beer, together in parties — and in their DNA!

by Daniela Vergara (@CannaGenomics) and Reilly Capps (@ReillyCapps)

We at the Cannabis Genomics Research Initiative (CGRI) proudly published our first scientific article, the first paper about Cannabis to come out of Professor Nolan Kane’s lab. We researchers at CGRI are genetic detectives exploring Cannabis’ mysteries. A better understanding of Cannabis will lead to more finely tuned medicine, oil and fibers!

In our paper, we peered into the heart of Cannabis’ solar power organs, the chloroplasts. We also looked into the chloroplasts of beer-making hops, Cannabis’ closest cousin. Using publicly available data on hops, and obtaining our own data on hemp, we used fast and powerful computers to “read” the chloroplasts’ DNA — its internal instruction manual. We were able to report the complete chloroplast genomes for two distinct hemp varieties and one variety of hops. What did we learn? For one thing, we found that the chloroplast genomes of the two different hemp strains were almost identical. And we found that hops differs from the two hemp strains in their chloroplast DNA by only 1.1 percent. This resemblance makes sense; we knew that hops and hemp were each other’s closest extant genera. Cannabis and hops share a common ancestor, like chimps and humans, or the lion and the jaguar. This planty ancestor gave rise to these two genera about 30 million years ago. In simplest terms, our research confirms what many party-goers long suspected: weed and beer go together.

This is important info, beyond party talk. Humans are naturally curious about where the living things around us come from. Our paper can advance the science of evolution, not only in Cannabis, but in all living things. It is one tiny sentence in a great story, which tells us that all living things are related. If you go back far enough, even humans and hemp — different as they are — come from the same ancestor that lived around 1.6 billion years ago.

Science stuff for nerds

This new publication presents the assembled (pieced together) and annotated (with the position and length of each gene determined) chloroplast genomes of two hemp Cannabis plants (Carmagnola and Dagestani) and one hops plant (Humulus lupulus var. Saaze). We downloaded the genomic data for hops, which were produced as part of the genome assembly for this plant, from NCBI, a public, permanent repository. Obtaining the Cannabis genomes was a longer process. We partnered with Centennial Seeds, a Colorado seed company extraordinarily committed to exploring the science of Cannabis. Following our guidance, Centennial Seeds spent thousands of dollars to purchase some of the most advanced scientific equipment on the market: centrifuges, pipettes and DNA extraction kits. Then, we taught them how to extract DNA from a plant. Here’s a video of how it’s done, shot at Centennial Seeds. The extracted DNA was brought to Professor Kane’s lab for sequencing.

What is a chloroplast?

The chloroplast is an organelle (small organ) found inside the plant’s cells. The chloroplast has a critical function: it is responsible for photosynthesis, which generates energy (food) out of light. Similar to solar panels, the chloroplasts generate energy, in this case in the form of sugar, from the sun! Pretty ingenious plants! The chloroplast has a small genome of its own, and most of the genes in this genome have functions related to the photosynthetic process.


Why does the chloroplast have a genome?

This is one of the most wonderful findings in biology: a bigger cell engulfed a smaller cell (most likely a cyanobacterium) that was able to make food out of light, that is, to photosynthesize. This process, called endosymbiosis and described by the endosymbiotic theory, explains why the chloroplasts in plants and the mitochondria in most eukaryotes (organisms that have a nucleus) have DNA: they were once free-living cells that, once engulfed, lost most of their functions, which the bigger cell now provides, but retained their own smaller genome.


Super nerdy stuff we study: the chloroplasts’ genomes

Generally, most chloroplast (and mitochondrial) genomes are circular, as seen in Figure 1. As described in our publication, Cannabis has a chloroplast genome of 153,871 base pairs (bp)*. The hops chloroplast genome is 153,751 bp long, just a little shorter than that of Cannabis. In general, chloroplast genomes have four regions: the small single copy (SSC), the large single copy (LSC), and two inverted repeats (IR). Both the SSC and the LSC are present once in the chloroplast genome. The IR, on the other hand, is found twice: once in one orientation (5 prime to 3 prime, written 5’ to 3’) and once in the reverse orientation (3’ to 5’). By convention, since the chloroplast genome is circular, it is read counterclockwise starting with the LSC. IR-A follows and separates the LSC from the SSC, and IR-B comes last, closing the circle by joining back to the LSC (Figure 1).


Figure 1. Graphic representation of the chloroplast genome of the hemp Carmagnola. Each colored box represents a gene, and the size of the box represents the size of the gene: some genes have more nucleotides than others. The inner black circle with the tick marks shows the different regions of the genome: the LSC is the longest piece of the genome, followed by the two IRs, and the shortest part is the SSC.

The two C. sativa chloroplast genomes are very similar to each other. They are AT rich (63%), meaning that most of their nucleotides are adenine and thymine. They differ at only 16 SNPs (single nucleotide polymorphisms), that is, at 16 nucleotide positions. In both C. sativa genomes the LSC is 84,044 bp, each of the two IRs is 26,007 bp, and the SSC is 17,813 bp. The hops genome, on the other hand, differs from both C. sativa genomes at 1,722 SNPs (a difference of approximately 1.1%), but it is also AT rich. The hops chloroplast contains two IRs of 25,953 bp each, separated by an LSC of 84,109 bp and an SSC of 17,736 bp. Each of the three genomes contains 83 genes.
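For readers who like to check the arithmetic, here is a minimal Python sketch that adds up the four regions (LSC + SSC + two copies of the IR) and compares the result with the total genome lengths quoted above. The dictionary layout and the function name are ours, made up for illustration. The percentage is computed against the hemp genome length, which is an assumption on our part about how the roughly 1.1% figure was obtained.

```python
# Region lengths in base pairs, copied from the text above.
# The dictionary layout is illustrative, not taken from the paper.
genomes = {
    "hemp": {"LSC": 84_044, "IR": 26_007, "SSC": 17_813, "total": 153_871},
    "hops": {"LSC": 84_109, "IR": 25_953, "SSC": 17_736, "total": 153_751},
}


def assembled_length(regions):
    """Add up the four regions: LSC + SSC + two copies of the IR."""
    return regions["LSC"] + regions["SSC"] + 2 * regions["IR"]


for name, regions in genomes.items():
    # Print the summed length next to the reported total for comparison.
    print(name, assembled_length(regions), regions["total"])

# Share of positions that differ between hops and hemp (1,722 SNPs).
# Dividing by the hemp genome length is an assumption.
snp_percent = 100 * 1722 / genomes["hemp"]["total"]
print(f"{snp_percent:.2f}% of positions differ")
```

The same division with the 16 SNPs that separate the two hemp varieties gives a far smaller number, which is why we call them almost identical.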


Why is this important?

The analysis of these data is crucial for scientific advancement, not only in Cannabis, but also, for example, to understand how genomes evolve and how specific genes change through time. Additionally, the chloroplast is inherited from the mother through the ovule, since the male’s sperm usually does not carry these organelles. Thus, the genome of the chloroplast tells us about the maternal lineage of an organism, while the genes on the Y chromosome (in humans and Cannabis) are inherited only through the father. Therefore, when comparing genes that only come from the mother (in the mitochondria or chloroplast) to genes that only come from the father (on the Y chromosome), we can get different (or similar) stories about how pollen and seeds disperse, or about where males and females come from.

This work also shows that the two Cannabis chloroplast genomes are very similar and might have changed very little over time. We can tell this because the two different varieties have almost identical chloroplast DNA. Could we assume, then, that, at least in terms of the chloroplast genome, the hemp grown by Jorge Cervantes, by George Washington and by Grok the Neanderthal would have been nearly identical? CGRI is conducting research that would answer this and many more questions.

Please support our investigation through our non-profit organization, the Agricultural Genomics Foundation. With your financial support, we will be able to conduct these and many more studies that will help us understand Cannabis.


Figure 2. BLAST (Basic Local Alignment Search Tool) is an algorithm that compares DNA sequence data. This is the BLAST result of the Carmagnola hemp chloroplast genome against itself, so the X and Y axes are the same. We can see a one-to-one line with a positive slope that represents the LSC and SSC. The lines with the negative slope represent the inverted repeats IR-A and IR-B. The little speckles in the graph represent small repetitive sequences.
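To get a feel for why inverted repeats show up as lines with a negative slope, here is a toy Python sketch. It is not BLAST: it only looks for exact matches of short words (k-mers) in a made-up sequence that we built as LSC + IR + SSC + reverse-complemented IR. All sequences, names and the word length k=6 are invented for illustration.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Read the opposite strand: complement each base, in reverse order."""
    return "".join(PAIRS[base] for base in reversed(seq))


def self_matches(seq, k):
    """Compare a sequence with itself using exact k-letter words.

    Returns two lists of (x, y) start positions: words that match in the
    same orientation, and words that match as reverse complements.
    """
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    forward, inverted = [], []
    for y in range(len(seq) - k + 1):
        word = seq[y:y + k]
        forward.extend((x, y) for x in positions.get(word, []))
        inverted.extend(
            (x, y) for x in positions.get(reverse_complement(word), [])
        )
    return forward, inverted


# Made-up miniature "chloroplast genome" with the four-region layout.
lsc, ir, ssc = "ACGTTGCAAGGCTA", "GATTACAGGC", "TTCCGAGT"
toy = lsc + ir + ssc + reverse_complement(ir)

forward, inverted = self_matches(toy, k=6)
print("same-orientation matches:", forward)
print("reverse-complement matches:", inverted)
```

In the same-orientation list every position matches itself, which is the line with the positive slope. In the reverse-complement list, the matches between the two IR copies have an x that goes up as y goes down, which is the line with the negative slope.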


*Each of the links that make up the DNA strand is a nucleotide. Nucleotides are present in pairs (base pairs), and there are four of them: adenine, thymine, cytosine and guanine. Each nucleotide is symbolized by the initial letter of its name. Adenine pairs with thymine, and cytosine pairs with guanine. A segment of a DNA strand looks like this: CTTCTTTCCATCAACAAATGATGCTCA, and each of these nucleotides is paired with its complementary nucleotide on the opposite strand.
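The pairing rule can be written out in a few lines of Python. This is a small sketch with function names of our own choosing: it builds the partner strand for the example segment above and measures what fraction of the segment is adenine or thymine, the same kind of measurement behind the term "AT rich".

```python
# Pairing rule: adenine with thymine, cytosine with guanine.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the nucleotide that pairs with each position of the strand."""
    return "".join(PAIRS[base] for base in strand)


def at_fraction(strand):
    """Fraction of the strand made of adenine (A) or thymine (T)."""
    return sum(base in "AT" for base in strand) / len(strand)


segment = "CTTCTTTCCATCAACAAATGATGCTCA"
partner = complement(segment)

print(segment)
print(partner)
print(f"AT fraction: {at_fraction(segment):.2f}")
```

Taking the complement of the partner strand gives back the original segment, because the pairing works in both directions.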

