Human Genome

Organisation of the Genome

The human genome is contained on 22 pairs of autosomes (non-sex chromosomes) and a pair of sex chromosomes (XX in women, XY in men). Each chromosome is a single double-stranded DNA molecule, and the autosomes are numbered approximately from largest to smallest. A gene is a stretch of DNA that codes for a specific protein or RNA molecule.
The DNA consists of:

  • Exons, which code for proteins or RNA molecules
  • Introns, which are transcribed from DNA into pre-mRNA when the gene is read, but subsequently spliced out during mRNA processing
  • Regulatory sequences, which enhance or reduce expression of their related genes
  • Pseudogenes and retropseudogenes, which are non-functional copies of genes and are generally not expressed. Pseudogenes contain introns whereas retropseudogenes, which derive from reverse-transcribed mRNA, do not.
  • Repeats of varying sizes, which are found throughout the genome and have no known function
  • Transposons, which are segments of DNA that can move from one position in the genome to another, but serve no known function
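The relationship between exons and introns described above can be sketched in a few lines of Python. This is a toy illustration only: the DNA sequence and the exon coordinates are invented, and the function names transcribe and splice are hypothetical helpers, not part of any library.

```python
# Toy illustration of transcription followed by splicing.
# The sequence and the exon coordinates below are invented for illustration.

def transcribe(coding_strand: str) -> str:
    """Return the pre-mRNA for a coding (sense) DNA strand: T becomes U."""
    return coding_strand.upper().replace("T", "U")

def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments (0-based, end-exclusive) and discard the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# exon 1 + intron + exon 2 (hypothetical sequence)
gene = "ATGGCC" + "GTAAGTTTTCAG" + "GATTGA"
exons = [(0, 6), (18, 24)]

pre_mrna = transcribe(gene)            # introns are still present here
mature_mrna = splice(pre_mrna, exons)  # introns have been removed

print(pre_mrna)
print(mature_mrna)
```

The intron is present in the pre-mRNA but absent from the mature mRNA, which contains only the two exons joined together.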

Regulation of Gene Expression

All cells in the body share the same genome, but are able to differentiate into vastly different forms. This is due to control of gene expression.

Basic regulation

Every gene has a regulatory region, which may or may not be near the gene itself, usually consisting of a specific DNA sequence that is recognised by a regulator protein. Regulator proteins may promote gene transcription (activators) or inhibit it (repressors), often by altering the structure of the DNA molecule. Regulatory regions may be longer than the gene itself.

When an activating regulator protein binds to a regulatory region, it often acts as an anchor for RNA polymerase, the enzyme which can 'read' the genetic code and transfer it to an RNA molecule for further modification. These proteins also cause alteration of the histones around which the DNA is wound, so that the desired sequence can be read. Many regulatory regions can bind numerous regulator proteins, allowing for fine control of gene activation.
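The idea that several regulator proteins acting on one regulatory region allow fine control can be sketched with a deliberately simplified model. Everything here is an assumption made for illustration: the regulator names, the numeric weights, and the rule that a gene is transcribed when the summed effect is positive. Real regulation is far more complex.

```python
# Toy model: a gene is transcribed when the combined effect of the regulator
# proteins bound to its regulatory region is positive.
# All names and weights are hypothetical.

REGULATORY_REGION = {
    "activator_A": +2,   # promotes transcription strongly
    "activator_B": +1,   # promotes transcription weakly
    "repressor_X": -2,   # inhibits transcription
}

def net_regulation(bound: set[str]) -> int:
    """Sum the effects of the regulators currently bound to the region."""
    return sum(REGULATORY_REGION[name] for name in bound)

def is_transcribed(bound: set[str]) -> bool:
    """The gene is 'on' only when promotion outweighs inhibition."""
    return net_regulation(bound) > 0

print(is_transcribed({"activator_A"}))
print(is_transcribed({"activator_A", "repressor_X"}))
print(is_transcribed({"activator_A", "activator_B", "repressor_X"}))
```

In this sketch the same gene is switched on or off depending on which combination of regulators is bound, which is the sense in which multiple binding sites give finer control than a single switch.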

Regulation of Regulators

The regulatory proteins must be active and located in the nucleus to function. The levels of these proteins are often under the control of external environmental signals. For example, MYC is a regulatory protein that promotes progression through the cell cycle. MYC may be activated through stimulation of members of the ERBB family of cell membrane receptors (including ERBB1 (EGFR) and ERBB2 (HER2)), which initiate a cascade of interactions through the RAF signal transduction pathway. These mechanisms are often exploited by malignant cells to control which genes are expressed.

Post-transcriptional Regulation

Aside from the complex interactions of gene regulatory regions and proteins, it is possible for a cell to alter the expression of genes through post-transcriptional modification. This can include premature termination of transcription (attenuation), alteration or destruction of mRNA before it can be translated, prevention of mRNA from leaving the nucleus, or binding of small microRNA molecules to the mRNA. Eukaryotic cells rapidly destroy double-stranded RNA (with good reason, as it is often a sign of viral infection), making this another method of regulating expression of genes.
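Binding of a microRNA to an mRNA depends on base pairing between the two molecules. The sketch below searches an mRNA for sites complementary to the microRNA "seed", taken here as nucleotides 2 to 8 from the 5' end. Both sequences are made up, the function names are hypothetical, and matching the seed alone is a simplification: it does not by itself establish that a microRNA regulates a given mRNA.

```python
# Toy search for microRNA seed-match sites in an mRNA.
# Both sequences are invented; seed matching alone is a simplification.

COMPLEMENT = str.maketrans("AUGC", "UACG")

def reverse_complement(rna: str) -> str:
    """Return the sequence that would base-pair with `rna`, read 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]

def seed_sites(mirna: str, mrna: str) -> list[int]:
    """Return 0-based positions in `mrna` that pair with the miRNA seed."""
    seed = mirna[1:8]                    # nucleotides 2-8 of the microRNA
    target = reverse_complement(seed)    # what the mRNA must contain
    return [i for i in range(len(mrna) - len(target) + 1)
            if mrna.startswith(target, i)]

mirna = "AUGGCAUUCGAUCCGUAAGCUA"   # hypothetical microRNA
mrna = "GGCUAAUGCCAUUCG"           # hypothetical mRNA fragment

print(seed_sites(mirna, mrna))
```

A non-empty result means the mRNA fragment contains a stretch that could pair with the seed, producing a short double-stranded region of the kind described above.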


There are many ways in which the cell can control gene expression. The most studied aspect of gene regulation is the interaction between regulatory regions of the DNA and the proteins which bind to them. The presence of these proteins in the cell is controlled by external and internal cellular signals. The cell can also modify the expression of genes by altering mRNA after it has been transcribed from the genome.

The Human Genome Project