Filling the Gaps in the Human Genome

Contrary to popular belief, the human genome had not been fully sequenced until very recently. The first blueprint, released a few years into the new millennium, left approximately 200 million DNA bases, or 8% of the human genome, undeciphered. So why did scientists stop there?

DNA is made up of units called nucleotide bases that serve as the body’s alphabet, and approximately 3 billion bases make up the human genome. Three billion bases cannot be read from end to end in a single pass. Instead, scientists decipher smaller fragments of DNA and piece the genome back together, like assembling a puzzle. The problem lies in certain stretches of DNA that are so highly repetitive that it can be difficult to put them in the right place, like puzzle pieces devoid of identifying colours or patterns.
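For readers who like to see the puzzle problem in action, here is a minimal Python sketch. It is an illustration only, not how real assembly software works: the toy "genome" is a made-up sequence, and the `placements` helper is a hypothetical function written for this example. It counts how many places a fragment could fit. A short fragment taken from inside a repeat fits in several places, while a longer fragment that reaches the unique sequence on either side fits in only one.

```python
def placements(genome: str, read: str) -> list[int]:
    """Return every start position where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Step forward by one so overlapping matches are also found.
        start = genome.find(read, start + 1)
    return positions


# Toy "genome" (made-up): unique flanks around a repetitive stretch.
genome = "GATTACA" + "AT" * 10 + "CCGTGA"

# A short fragment that lies entirely inside the repeat.
short_read = "ATATAT"

# A longer fragment that spans the whole repeat plus a little of each flank.
long_read = "ACA" + "AT" * 10 + "CCG"

print("short read fits at:", placements(genome, short_read))
print("long read fits at: ", placements(genome, long_read))
```

The short fragment is the puzzle piece with no pattern on it: nothing in the piece itself says which of its possible positions is the right one. The longer fragment carries enough of the surrounding picture to be placed unambiguously.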

The advent of long-read sequencing allowed scientists to overcome this challenge by reading sequences over 200 times longer than before. Last summer, the Telomere-to-Telomere (T2T) Consortium announced that it had deciphered the remaining 8 percent of our genome, a feat that took almost twice as long as decoding the first 92 percent. Last week, the first end-to-end human genome was officially published, resolving over 600 genes that had previously been linked to health and disease. While the older draft will remain in use, owing to decades of valuable annotations, scientists are excited to delve further into the updated version, which may hold important keys for the future of healthcare, including drug discovery and precision medicine!

