The human body is formed in complex and profound ways. Its basic composition is the same in everyone, yet each person is unique. Let us explore the world of genomics together and learn more about how intricate the human body is.
What is DNA?
Deoxyribonucleic acid, or DNA for short, is a substance found in our cells. Its structure consists of two long spiral chains (as shown below) built from four types of bases: adenine (A), thymine (T), cytosine (C) and guanine (G). These bases are the basic units of DNA and can be thought of as letters of an alphabet. Our body relies on the order, or sequence, of these bases along the chain to carry genetic information, much as English uses the 26 letters of its alphabet to form words and sentences that convey different meanings.
What is a gene?
A gene is a segment of DNA that serves as the basic unit of instruction for building our body’s observable traits and functions.
The DNA bases are linked together in a chain to form a DNA strand. Only certain sections of the chain contain genetic instructions to build our body’s observable traits and functions. By analogy, a gene is like a complete sentence or paragraph, which conveys a particular meaning.
What is a genome?
A “genome” refers to the entire set of genetic material in a living thing. For every human being, the genome includes over 20,000 genes. Each of us has our own unique genome, which can be thought of as an operation manual for our body. It provides instructions that help to determine physical characteristics such as skin colour and height, as well as our potential risk of developing different diseases.
If we think of the DNA bases as letters of the English alphabet, a gene is like a complete sentence or paragraph, and the genome is like the whole instruction manual – including all the chapters, paragraphs, sentences, graphics and even typesetting.
If we were to compare the genomes of two people who are not related by blood, 99.9% of their genomes would be the same and only 0.1% would differ. However small this 0.1% might look, it represents millions of differences between the two DNA sequences, and these differences account for each person’s unique physical characteristics.
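To see why 0.1% still means millions of differences, a rough calculation helps. The sketch below assumes a genome of about 3 billion DNA bases, which is an approximate figure, and the variable and function names are purely illustrative.

```python
# Back-of-the-envelope estimate only: the genome size is an approximation.
GENOME_SIZE_BASES = 3_000_000_000   # roughly 3 billion DNA bases
DIFFERENCE_PERCENT = 0.1            # share of the genome that differs between two unrelated people


def estimated_differences(genome_size: int, percent: float) -> int:
    """Return the approximate number of bases covered by the given percentage."""
    return round(genome_size * percent / 100)


differing_bases = estimated_differences(GENOME_SIZE_BASES, DIFFERENCE_PERCENT)
print(f"About {differing_bases:,} bases may differ between two unrelated people")
```

Under these assumptions, the estimate comes to about 3 million bases, which is why such a small percentage can still explain many individual differences.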
Genome | English |
---|---|
DNA bases: A, T, C and G | English alphabet |
The four bases are connected in a long DNA chain. Genes are segments of the DNA chain that carry specific genetic instructions for the body. | Letters can be combined to form complete sentences or paragraphs, to convey a specific message. |
All DNA in the human body, including every gene and DNA base, are collectively called the genome. | All the essays, paragraphs and sentences come together to form a whole book. |
“Whole Genome Sequencing” (WGS) technology enables us to understand more about the human genome. It includes three main steps:
1. Sequencing
As the name suggests, sequencing is the process of reading the roughly 3 billion DNA bases of the human genome one by one, in order.
2. Data processing
After obtaining a large amount of genomic data from sequencing, bioinformaticians have to transform it into interpretable information. They compare the sequenced DNA with the “Human Reference Genome” and detect the differences between the two. These differences are called genetic variants. There are millions of genetic variants in our body, most of which are not disease-causing. On the contrary, they are what make us unique.
3. Data analysis
Data analysis is the most complex and time-consuming step. It involves identifying disease-causing variants, also called pathogenic or likely pathogenic variants, among millions of genetic variants. It requires not only knowledge of and advances in genomic science and genomic medicine, but also a large number of cases for long-term analysis and research.
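The comparison described in the data processing step can be pictured with a toy example. The sketch below is an illustration only: it assumes two short, made-up sequences of equal length and reports single-letter differences. Real analyses work on billions of bases and rely on specialised alignment and variant-calling software, and the function name `find_variants` is hypothetical.

```python
def find_variants(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference base, sample base) for every mismatch.

    Toy simplification: both sequences must have the same length, so only
    single-base substitutions are detected. Positions are counted from 1.
    """
    if len(reference) != len(sample):
        raise ValueError("This toy comparison needs sequences of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Made-up sequences, for illustration only.
reference = "ATCGGATTACA"
sample = "ATCGCATTACG"

variants = find_variants(reference, sample)
for position, ref_base, sample_base in variants:
    print(f"Position {position}: reference {ref_base}, sample {sample_base}")
```

Here the sample differs from the reference at positions 5 and 11. Finding such differences is only the data processing step. Deciding which of them, if any, matter for health is the separate and much harder data analysis step.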
Genomics — beginning with the “Human Genome Project”
In 1990, scientists from different countries began to study all the DNA in the human body, and launched a large-scale scientific exploration project – the “Human Genome Project”. The entire study took more than a decade and the first human genome blueprint was completed in 2003.
Scientists combined the anonymised data from 13 volunteers to form a “human reference genome”. Researchers around the world can now compare the data from sequencing an individual’s DNA with this “human reference genome” to identify differences, i.e. genetic variants.
The completion of the “Human Genome Project” opened the door to a new frontier in medical and scientific fields, one that could be used to advance genomics as a science.
From “Genetics” to “Genomics”
In the past, people studying genetics would investigate the structure and function of genes one by one. Now that the “Human Genome Project” has been completed and Whole Genome Sequencing technology has matured, genomics has become a frontier in scientific research. Instead of probing genes one by one, scientists can examine the entire genome and unravel the links, interactions and roles of its different regions.
The importance of genomic data
Since the completion of the first blueprint of the human genome, genomics has quickly blossomed as a field, but scientists have yet to fully resolve the complex relationships between the genome and diseases. More patient case studies and more genomic data are therefore vital for research development.
To date, most genomic research data has come from people in Western countries with European ancestry, which may not be the most useful reference for patients in Hong Kong. To better benefit patients and their families in Hong Kong, HKGP hopes to establish a genome database of the local population. This effort will also promote local genomics development as scientists and clinicians uncover and understand more links between the genome and diseases.