AGP
The Autism Genome Project
"Investigating the genetic basis of autism"

AGP Research: Research Techniques

On this page:

  1. Genome sequencing
  2. Complete genome scans
  3. Linkage analyses
  4. Association analyses
  5. CNV analyses

 

1. Genome sequencing

The DNA in each of our cells is a double-stranded molecule arranged in a "double helix" structure. It is made up of an estimated 3.2 billion base pairs.

DNA is usually obtained from an individual by drawing a blood sample or by rubbing a cotton swab along the inside of the mouth to harvest cells.

Genome sequencing is the process whereby researchers "read" the specific order of DNA bases (nucleotides) that make up the individual's DNA. (More here: on genome sequencing.)

An entire human genome was sequenced for the first time in 2003. (More here: on the Human Genome Project.)

However, sequencing an individual's entire genome is extremely costly and time-consuming.

More often, researchers use genome sequencing to "read" smaller sections of an individual's DNA, such as individual genes or parts of a gene (also referred to as gene mapping).


2. Complete genome scans

For complex genetic disorders such as autism, complete genome scans are an invaluable tool for researchers seeking out susceptibility genes – namely, genes that increase the likelihood of an individual being affected by that disorder.

It is now estimated that, for two unrelated, healthy individuals, about 99.8% of their DNA will be the same.

Of the remaining genetic differences between individuals:

Single base-pair changes in the DNA, also known as single nucleotide polymorphisms (SNPs), are estimated to contribute ~84%.

Structural variations of the genome, including CNVs (see section 5 below), are estimated to contribute ~16%.
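As a rough back-of-envelope illustration, the figures quoted above can be turned into approximate counts. This sketch assumes that the 99.8% figure and the ~84% / ~16% split can both be applied directly to the 3.2 billion base-pair estimate from section 1, which the text does not state explicitly, so the results are only indicative. The function name is invented for this example.

```python
GENOME_BASE_PAIRS = 3_200_000_000  # estimate quoted in section 1
SHARED_FRACTION = 0.998            # estimate quoted above


def differing_base_pairs(genome_size, shared_fraction):
    """Approximate number of base pairs that differ between two people."""
    return round(genome_size * (1 - shared_fraction))


differing = differing_base_pairs(GENOME_BASE_PAIRS, SHARED_FRACTION)

# Assumption: the ~84% / ~16% split is applied to base-pair counts.
from_snps = round(0.84 * differing)
from_structural = round(0.16 * differing)

print(f"Differing base pairs (approx.): {differing:,}")
print(f"  attributed to SNPs:           {from_snps:,}")
print(f"  attributed to structural:     {from_structural:,}")
```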

There are an estimated 10 million SNPs that occur commonly in the human genome. (More here: on SNPs.) The International HapMap Project aims to identify and catalogue most of these SNPs.

SNPs can be used as genetic markers, to indicate which version of a gene (allele) an individual carries on his or her DNA, at a given location along a chromosome.

These genetic markers can be thought of as signposts, each 'marking' a particular section of a chromosome, and providing clues to the section of chromosome where a susceptibility gene may possibly reside.

The testing of which specific alleles have been inherited by an individual is called genotyping.

Rather than sequencing entire sections of a genome, genotyping can be carried out by testing selected SNPs at points along the genome.
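To make the idea of genotyping concrete, here is a minimal sketch of how genotypes at a few SNPs might be recorded. The SNP names, alleles and the 0/1/2 coding (number of copies of a chosen allele) are illustrative assumptions, not the format of any particular genotyping platform.

```python
def genotype_code(allele_pair, counted_allele):
    """Return how many copies (0, 1 or 2) of counted_allele a person carries."""
    return sum(1 for allele in allele_pair if allele == counted_allele)


# Hypothetical sample: the two alleles observed at each SNP
# (one inherited from each parent).
sample = {
    "snp_1": ("A", "A"),
    "snp_2": ("A", "G"),
    "snp_3": ("G", "G"),
}

# Hypothetical choice of which allele to count at each SNP.
counted = {"snp_1": "G", "snp_2": "G", "snp_3": "G"}

codes = {snp: genotype_code(sample[snp], counted[snp]) for snp in sample}
print(codes)
```

Coding each genotype as a small number in this way makes it straightforward to compare allele frequencies between groups in later analyses.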

This is because groups of SNPs that are located near to each other on a chromosome are inherited in blocks (haplotypes). (More here: on tag SNPs.)
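One common way to quantify how strongly two nearby SNPs are inherited together is the r-squared statistic, computed from haplotype frequencies. The sketch below assumes the haplotypes are already known (phased) and uses made-up data; it is an illustration of the general idea rather than the procedure used to select tag SNPs in any specific project.

```python
def r_squared(haplotypes):
    """r^2 between two SNPs, given two-letter haplotypes such as "AG"
    (allele at the first SNP followed by allele at the second SNP)."""
    n = len(haplotypes)
    first, second = haplotypes[0][0], haplotypes[0][1]  # alleles to count
    p_a = sum(h[0] == first for h in haplotypes) / n
    p_b = sum(h[1] == second for h in haplotypes) / n
    p_ab = sum(h == first + second for h in haplotypes) / n
    d = p_ab - p_a * p_b  # departure from independent inheritance
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must vary within the sample")
    return d * d / denominator


# Made-up data: the two SNPs always travel together...
linked = ["AG", "AG", "CT", "CT"]
# ...or are inherited independently of each other.
unlinked = ["AG", "AT", "CG", "CT"]

print(r_squared(linked), r_squared(unlinked))
```

When r-squared is close to 1, genotyping one of the two SNPs says almost everything about the other, which is why a well-chosen subset of SNPs can stand in for a whole block.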

In a given sample of human DNA, it is now feasible for researchers to genotype many thousands of SNPs across all 23 pairs of chromosomes, a process referred to as a complete genome scan (or whole-genome scan).

In recent years, significant technological advances have allowed researchers to carry out genotyping ever more rapidly and cheaply. (More here: on Microarray Technology).

For example, the AGP is currently using the Illumina 1M beadchip to genotype over 1 million SNPs in samples of DNA from children affected with autism.

Complete-genome scans are used by researchers to track down the specific locations (loci) on the chromosome where there are differences between individuals affected by a certain disease or disorder and those who are unaffected.

DNA samples are collected from two groups of participants: people affected by the disorder being studied and similar people who are unaffected. The samples are then genotyped for selected genetic markers (SNPs).

Genome-wide association analyses and/or linkage analyses can then be carried out (see below).

3. Linkage analyses

Linkage can be defined as the tendency for genes or sections of DNA positioned near to each other on a chromosome to be inherited together.

Linkage analyses aim to discover the regions in which the DNA from the relatives of people affected by a particular disease or disorder is more similar than would be expected by chance.

The idea behind this is that, if affected individuals from the same family share identical versions of genes leading to the disorder, then these genes are located in regions of increased similarity – in other words, they are linked to the disorder.

By testing a set of genetic markers (or haplotype), researchers can infer the gene(s) that may be linked to the disorder. (More here: on linkage analyses.)

Researchers will typically use a large number of genetic markers to test a sample of families consisting of two or more siblings affected with the disorder (sib pairs), plus both parents.
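As an illustration of what "more similar than would be expected by chance" can mean for affected sib pairs, the sketch below implements a simplified "mean test". It assumes that, at a given marker, the number of alleles each sib pair shares identical by descent (0, 1 or 2) is known exactly; in practice this has to be estimated from the genotypes. By chance alone, sibs share half their alleles on average. The data and function name are made up for this example.

```python
import math


def sib_pair_mean_test(shared_allele_counts):
    """Test whether affected sib pairs share more alleles than expected.

    shared_allele_counts: one value per sib pair (0, 1 or 2 alleles shared
    identical by descent at the marker).
    Returns (mean proportion shared, z statistic, one-sided p-value).
    """
    n = len(shared_allele_counts)
    mean_proportion = sum(shared_allele_counts) / (2 * n)
    # Under chance sharing, the proportion shared per pair has mean 0.5
    # and variance 1/8 (sharing 0, 1, 2 with probability 1/4, 1/2, 1/4).
    z = (mean_proportion - 0.5) / math.sqrt(0.125 / n)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # normal approximation
    return mean_proportion, z, p_value


# Made-up example: 100 sib pairs with clear excess sharing at a marker.
excess = [2] * 50 + [1] * 50
print(sib_pair_mean_test(excess))
```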

Statistical analysis will be used to identify haplotypes (or markers) of interest. If particular haplotypes are found to be inherited more often than would be expected by chance, then these loci are said to be in linkage disequilibrium.

Usually, a calculated LOD score (LOD = log of the odds, to base 10) is used to evaluate whether a particular chromosomal region is linked with the disorder. Typically, a LOD score of more than 3 (odds of 1000 to 1 in favour of linkage) will be taken as strong evidence of linkage.
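The LOD score can be sketched for the simplest textbook case, in which each inheritance event (meiosis) can be scored unambiguously as recombinant or non-recombinant between the marker and the disease locus. Real analyses of complex disorders use considerably more elaborate likelihood models, so the function below is a simplified illustration, and its name and inputs are chosen for this example only.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score for a recombination fraction theta (0 < theta <= 0.5).

    Compares the likelihood of the data if the loci are linked with
    recombination fraction theta against the likelihood with no linkage
    (theta = 0.5), on a log10 scale.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    non_recombinants = meioses - recombinants
    log_likelihood_linked = (recombinants * math.log10(theta)
                             + non_recombinants * math.log10(1 - theta))
    log_likelihood_unlinked = meioses * math.log10(0.5)
    return log_likelihood_linked - log_likelihood_unlinked


# Made-up example: 1 recombinant observed in 10 informative meioses.
print(round(lod_score(1, 10, 0.1), 2))
```

Because LOD scores are logarithms, scores from independent families can be added together, which is one reason this measure is convenient for studies that pool many families.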

Researchers often conduct a whole-genome linkage scan as a first step to identifying susceptibility genes for complex diseases.

Once linkage regions have been identified, a fine-mapping can then be carried out to narrow down the search and identify potential susceptibility genes.

Finally, candidate-gene evaluation can be used to identify the specific gene(s) associated with the disorder. (A candidate gene is a gene encoding a protein that is thought to be possibly responsible for the disorder.)

There are two main kinds of linkage analysis:

Parametric linkage analysis is used to investigate single-gene disorders, where the parameters of the analysis (for example, mode of inheritance, penetrance) are clear.

Non-parametric linkage analysis is used for complex genetic disorders, where a number of different genes are implicated and the parameters of the analysis are therefore much less clear. Quantitative trait locus (QTL) mapping is one type of non-parametric linkage analysis.

4. Association analyses

If certain genetic variations are found to be significantly more frequent in people affected by a certain disease or disorder than in those who are unaffected (controls), the variations are said to be associated with the disorder.

For complex genetic disorders, such as autism, where a large number of different genes confer risk, association analyses have more power than linkage analyses. 

Association analyses are now typically performed using a large sample of affected individuals and controls, and testing with a significant number of genetic markers (SNPs).
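For a single SNP, one simple way to test whether an allele is more frequent in affected individuals than in controls is a chi-square test on allele counts. The sketch below uses made-up counts and is only one of several possible tests. A genome-wide analysis repeats such a test at very many SNPs and must therefore apply a much stricter significance threshold than a single test would need.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi-square statistic, p-value).
    """
    observed = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up allele counts: the allele is seen 60 times among 100 case
# chromosomes but only 40 times among 100 control chromosomes.
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
print(chi2, p_value)
```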

Association studies may compare either affected and non-affected members within a family (family-based association studies), or affected individuals with unaffected individuals who are not family members (case-control association studies).

An advantage of association analyses over linkage analyses is that they can be used for families where only one child is affected with the disorder in question.
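One widely used family-based approach that needs only one affected child per family is the transmission disequilibrium test (TDT). It looks at parents who carry two different alleles at a marker and asks whether one allele is passed on to the affected child more often than the 50% expected by chance. The sketch below uses made-up counts and is given as an example of a family-based test in general, not as a description of the AGP's own analysis.

```python
import math


def transmission_disequilibrium_test(transmitted, not_transmitted):
    """TDT for one allele, counted over heterozygous parents.

    transmitted: times the allele was passed to the affected child.
    not_transmitted: times the other allele was passed on instead.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    informative = transmitted + not_transmitted
    if informative == 0:
        raise ValueError("at least one heterozygous parent is required")
    chi2 = (transmitted - not_transmitted) ** 2 / informative
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up example: of 60 heterozygous parents, 40 passed on the allele.
print(transmission_disequilibrium_test(40, 20))
```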

Once identified, associated genetic variations can serve as powerful pointers to the region of the human genome where the disease-causing problem resides.

However, the associated variants themselves may not directly cause the disease. They may just be 'tagging along' with the actual causal variants.

For this reason, researchers often need to take additional steps to identify the exact genetic change involved in the disease, such as genotyping that particular region of the genome with additional markers, or carrying out sequencing (see section 1 above).

5. CNV analyses

In addition to single base-pair changes in the DNA (see section 2 above), structural genetic variations are also important for understanding complex disorders.

One source of structural genetic variation that researchers are increasingly focusing on is Copy Number Variations (CNVs).

CNVs involve deletions or duplications of large chunks of DNA: from thousands to millions of DNA bases. CNVs in a DNA sample can be calculated from the results of complete-genome scans (see above) using microarray technology.
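A much-simplified sketch of how CNVs might be inferred from microarray data is shown below. It assumes that each probe along a chromosome yields a signal intensity that can be compared with a reference intensity for two copies, and that a run of consecutive probes with a clearly lower or higher ratio suggests a deletion or duplication. The thresholds, minimum run length and function name are illustrative assumptions; real CNV-calling software uses more sophisticated statistical models.

```python
import math


def call_cnv_segments(sample, reference, loss_threshold=-0.5,
                      gain_threshold=0.3, min_probes=3):
    """Return (first_probe, last_probe, state) for runs of altered probes.

    sample, reference: per-probe intensities in chromosome order.
    A single deleted copy gives log2(1/2) = -1; one extra copy gives
    log2(3/2), about 0.58. The thresholds sit between those values and 0.
    """
    states = []
    for sample_value, reference_value in zip(sample, reference):
        ratio = math.log2(sample_value / reference_value)
        if ratio <= loss_threshold:
            states.append("deletion")
        elif ratio >= gain_threshold:
            states.append("duplication")
        else:
            states.append("normal")

    segments = []
    start = 0
    for i in range(1, len(states) + 1):
        # Close the current run when the state changes or the data ends.
        if i == len(states) or states[i] != states[start]:
            if states[start] != "normal" and i - start >= min_probes:
                segments.append((start, i - 1, states[start]))
            start = i
    return segments


# Made-up intensities for 10 probes against a flat reference.
sample = [1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 1.5, 1.5, 1.5, 1.5]
reference = [1.0] * 10
print(call_cnv_segments(sample, reference))
```

Requiring several consecutive probes to agree reduces the chance that noise at a single probe is mistaken for a CNV.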

The Wellcome Trust Sanger Institute has recently established a CNV Project database, with the aim of gaining a better understanding of the role of CNVs in genetic disorders.
