What is the human genome and how exactly do you sequence it? Everyday Einstein parses the chromosomes to look inside DNA.
A few years ago, scientists made headlines by sequencing the human genome. News agencies ran sensationalized stories announcing that scientists had "unraveled the blueprint of life." But what is a genome and what does it mean to sequence it? Let’s look beyond the hype.
Birds, Bees, Flowers, and Trees
The first question is actually the trickiest to answer. Technically, the "genome" of an organism is the total of all the hereditary information of that organism. As most of you probably know, inside each of our cells is a set of chromosomes, made of a kind of molecule called DNA. Most of our DNA is stored within the nucleus of each of our cells, but some DNA is also stored in our mitochondria, small cellular components that make energy.
Different organisms have different numbers of chromosomes. For example, most humans have 2 sets of 23 chromosomes: one set comes from the father, the other from the mother. Strawberries are interesting in that while they have 7 chromosomes per set, they can have either 2, 4, 6, 8, or even 12 sets of those 7, depending on the exact species. The common garden strawberry has 8 sets of chromosomes. Female honeybees have 2 sets of 16 chromosomes, while males have only 1 set of 16.
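The counts above all follow the same arithmetic: chromosomes per set times number of sets. Here is a minimal Python sketch using only the figures quoted in this episode. The names `ORGANISMS` and `total_chromosomes` are made up for illustration.

```python
# (chromosomes per set, number of sets), taken from the figures quoted above.
ORGANISMS = {
    "human": (23, 2),
    "garden strawberry": (7, 8),
    "female honeybee": (16, 2),
    "male honeybee": (16, 1),
}


def total_chromosomes(per_set, sets):
    """Total chromosomes in a typical cell: chromosomes per set times sets."""
    return per_set * sets


for name, (per_set, sets) in ORGANISMS.items():
    print(f"{name}: {total_chromosomes(per_set, sets)} chromosomes")
```

So a typical human cell carries 46 chromosomes, while a garden strawberry cell carries 56.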
Broadly speaking, there are two types of chromosomes: autosomes and sex chromosomes (sometimes called allosomes). Most humans have two sets of 22 autosomes (one each from their mother and father), as well as 2 sex chromosomes. The pair of sex chromosomes you have determines your biological sex. Human males typically have an X and a Y sex chromosome (the X comes from their mother and the Y from their father), while females have 2 X chromosomes (one each from mother and father).
While many organisms follow the same XY system for sex determination as humans, not all do. For example, female cockroaches have two X chromosomes, while males have only one. No Y chromosomes are involved. Male birds, on the other hand, have two Z chromosomes, while female birds have a Z and a W chromosome.
Typically when scientists talk about sequencing a genome, they mean they have sequenced one copy of each autosome and one copy of each sex chromosome. So sequencing the human genome would mean sequencing one copy each of the 22 autosomes (the chromosomes that are the same whether you’re male or female) and one copy each of the X and Y sex chromosomes.
DNA is a long molecule that is sort of like a long rope ladder. Each side of the ladder is a strand made up of 4 different types of pieces called nucleotides. The four nucleotides are Guanine, Adenine, Thymine, and Cytosine, but that can get a bit wordy so we usually just abbreviate them as G, A, T, and C. The rungs of the DNA ladder are formed when matching bases on each strand stick together. Guanine binds to Cytosine and Adenine binds to Thymine. Each of these ladder rungs is called a "base pair" because it is made of a pair of nucleotide bases.
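Because G always pairs with C and A always pairs with T, knowing one side of the ladder tells you the other side. Here is a minimal Python sketch of that rule. The names `PAIRS` and `partner_strand` are made up for illustration, and the function simply reads off the partner base rung by rung.

```python
# Pairing rule from the text: G binds to C, and A binds to T.
PAIRS = {"G": "C", "C": "G", "A": "T", "T": "A"}


def partner_strand(strand):
    """Return the base sitting across each rung, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


print(partner_strand("GATTACA"))  # CTAATGT
```

This is also why sequencing one side of each rung is enough: the other side carries no extra information.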
There are lots and lots of these base pairs in a chromosome. In the largest human chromosome, chromosome 1, there are around 250 million of these base pairs. If you combine the base pairs found in the human genome (one copy of each autosome and one copy of each sex chromosome), you would have around 3.2 billion base pairs.
To put that in perspective, the longest Harry Potter book, Order of the Phoenix, has about 1.6 million letters in it, so you would need about 2,000 copies of that book to fit the entire human genome sequence inside. That would be a stack of Harry Potter books roughly 30 stories tall. That's a lot of Harry Potter.
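The book comparison is a single division, assuming one letter per base pair. Here is a quick Python check using the two figures quoted above. The variable names are made up for illustration.

```python
GENOME_BASE_PAIRS = 3_200_000_000  # about 3.2 billion base pairs, from the text
LETTERS_PER_BOOK = 1_600_000       # about 1.6 million letters, from the text

# One letter (G, A, T, or C) per base pair.
copies_needed = GENOME_BASE_PAIRS // LETTERS_PER_BOOK
print(copies_needed)  # 2000
```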
The World's Largest Jigsaw Puzzle
Unfortunately, the technology to extract and read these long strands of DNA in their entirety doesn't exist yet. Instead, scientists have to take the chromosomes and chop them up. Then they read each piece and try to put the pieces back together again in the right order. The process for doing this is a bit complicated, but in the end we have the entire genetic sequence. Unfortunately, the genetic sequence on its own doesn't really tell us anything. Eric Lander, one of the scientists involved in the Human Genome Project, said it best:
"If you take an airplane, a Boeing 777, I think it has 100,000 parts. If I gave you a parts list for the Boeing 777, in one sense you'd know a lot. You'd know 100,000 components that have got to be there, screws and wires and the rudders and things like that. On the other hand, I bet you wouldn't know how to put it together. And I bet you wouldn't know why it flies."
So now you know a little more about just what is meant by sequencing a genome. So what good is this if the sequence doesn't tell us anything? By studying the sequence and the proteins that come from it, we've already learned a lot, such as which parts of the genome are involved in certain diseases. The next step, the part that scientists all over the world are currently working on, is to determine what every part of the genome does, a process called "annotation." And once they figure that out, they’ll have a better understanding of how different diseases work and the types of treatments and medicines needed to treat them.
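To make the jigsaw-puzzle step a little more concrete, here is a toy Python sketch of putting chopped-up pieces back together by looking for overlapping ends. It illustrates the general idea only and is not the method the Human Genome Project actually used. The function names and the three short "reads" are invented, and the sketch assumes the pieces overlap, contain no reading errors, and all come from the same strand.

```python
def overlap(a, b):
    """Length of the longest ending of `a` that is also a beginning of `b`."""
    for size in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def assemble(fragments):
    """Repeatedly merge the two pieces with the largest overlap."""
    pieces = list(fragments)
    while len(pieces) > 1:
        # (overlap length, index of left piece, index of right piece)
        best = (0, 0, 1)
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j and overlap(a, b) > best[0]:
                    best = (overlap(a, b), i, j)
        size, i, j = best
        # Glue the right piece onto the left one, skipping the shared part.
        merged = pieces[i] + pieces[j][size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (i, j)] + [merged]
    return pieces[0]


reads = ["GATTACAGG", "ACAGGCTTA", "CTTAGCAT"]  # invented example pieces
print(assemble(reads))  # GATTACAGGCTTAGCAT
```

Real genomes are much harder than this toy: they contain long repeated stretches, and the reads contain mistakes, which is part of why the real process is a bit complicated.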
If you liked today’s episode, you can become a fan of Everyday Einstein on Facebook or follow me on Twitter. If you have a question that you’d like to see on a future episode, send me an email at firstname.lastname@example.org.