At the Seventh International Family Tree DNA Conference for Group Administrators,
David Pike, PhD presented a talk entitled “Phasing & Other Analysis of Family Finder Results”.
SOURCE: David Pike (Houston, Harris County, Texas); photographed by Stephen J. Danko on 05 November 2011.
David has written a number of utilities for processing unzipped autosomal DNA files from either Family Tree DNA or 23andMe. The utilities may be found at http://www.math.mun.ca/~dapike/FF23utils/.
Phasing entails separating the alleles a child inherited from the father from those inherited from the mother. It is one of the more informative uses of DNA data, and it enables the reconstruction of ancestral chromosomes.
Example: At a particular locus, the mother’s DNA shows that she has an A and a T, the father’s DNA shows that he has a T and a T, and their child’s DNA shows that (s)he has an A and a T. The child must have inherited the A from the mother, since the father carries no A at this locus. Therefore, the child must have inherited the T from the father.
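The reasoning in this example can be sketched in a few lines of Python. This is only an illustration of the logic, not the code used in David's utilities; the function name `phase_locus` and the two-letter genotype strings are assumptions made for the sketch.

```python
from typing import Optional, Tuple

def phase_locus(child: str, mother: str, father: str) -> Optional[Tuple[str, str]]:
    """Return (allele_from_mother, allele_from_father) for one locus.

    Returns None when the locus is ambiguous (more than one assignment
    fits) or inconsistent with the parents (no assignment fits).
    Genotypes are hypothetical two-letter strings such as "AT".
    """
    a, b = child[0], child[1]
    candidates = set()
    # Try both ways of handing the child's two alleles to the parents.
    if a in mother and b in father:
        candidates.add((a, b))
    if b in mother and a in father:
        candidates.add((b, a))
    # Exactly one consistent assignment means the locus is phased.
    if len(candidates) == 1:
        return candidates.pop()
    return None

print(phase_locus("AT", "AT", "TT"))  # ('A', 'T'): the example above
print(phase_locus("AT", "AT", "AT"))  # None: all three are heterozygous
print(phase_locus("TT", "AT", "TT"))  # ('T', 'T')
```

The second call shows the case that cannot be resolved from the trio alone: when the child and both parents are heterozygous for the same two bases, either assignment is possible.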
If the child has runs of homozygosity (stretches in which the bases are the same on both copies of a chromosome), such as the third through seventh positions of:
GAGAGCAC
GGGAGCAG
(where A = adenine, C = cytosine, G = guanine, and T = thymine) there may be evidence that the parents are related to each other.
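A minimal sketch of how such runs might be located, using the two eight-base rows shown above. The function name `homozygous_runs` and the `min_len` threshold are assumptions for illustration; a practical analysis would look for far longer runs than this toy example.

```python
from typing import List, Tuple

def homozygous_runs(row1: str, row2: str, min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run of at
    least min_len consecutive positions where both rows have the same base.
    """
    runs = []
    start = None  # index where the current run began, if any
    n = min(len(row1), len(row2))
    for i in range(n):
        if row1[i] == row2[i]:
            if start is None:
                start = i
        else:
            # A heterozygous position ends the current run.
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that reaches the end of the data.
    if start is not None and n - start >= min_len:
        runs.append((start, n))
    return runs

print(homozygous_runs("GAGAGCAC", "GGGAGCAG"))  # [(2, 7)]
```

Indices here are zero-based, so `(2, 7)` corresponds to the third through seventh bases, GAGCA, which are identical in both rows.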
In analyzing the DNA, one may find a discordance: a locus at which the child’s reported alleles cannot be accounted for by the parents’ results. Microdeletions and copy number variations may leave some SNPs with only one allele, in which case that allele is reported twice.
DNA results are reported with 99.99% accuracy. Still, this means that about 1 in 10,000 SNPs is miscalled, resulting in genotyping errors.
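To put the 1-in-10,000 figure in perspective, the arithmetic can be worked through as follows. The SNP count used here is a hypothetical round number, not a figure from the talk, and the trio calculation assumes that miscalls occur independently in each person's results.

```python
from fractions import Fraction

ERROR_RATE = Fraction(1, 10_000)  # 99.99% accuracy, as quoted above

def expected_miscalls(n_snps: int, error_rate: Fraction = ERROR_RATE) -> float:
    """Expected number of miscalled SNPs in one person's results."""
    return float(n_snps * error_rate)

def expected_trio_loci_with_error(n_snps: int, error_rate: Fraction = ERROR_RATE) -> float:
    """Expected number of loci where at least one of three people
    (child, mother, father) is miscalled, assuming independent errors."""
    p_all_correct = (1 - error_rate) ** 3
    return float(n_snps * (1 - p_all_correct))

N_SNPS = 700_000  # hypothetical number of SNPs tested per person

print(expected_miscalls(N_SNPS))                     # 70.0
print(round(expected_trio_loci_with_error(N_SNPS)))  # 210
```

Under these assumptions, even a very accurate test would be expected to produce a few hundred loci in a child-and-parents comparison where at least one of the three results is wrong.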
One may also discover actual mutations, that is, de novo mutations in the child.
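Whatever the cause (a deletion, a miscall, or a genuine mutation), a discordance appears in the data as a child's genotype that cannot be assembled from one allele of each parent. The sketch below flags such loci; the SNP identifiers, the `"--"` no-call marker, and the function names are assumptions for illustration and are not taken from David's utilities.

```python
from typing import Dict, List, Tuple

NO_CALL = "--"  # assumed marker for a SNP that could not be read

def is_discordant(child: str, mother: str, father: str) -> bool:
    """True if the child's two alleles cannot be split so that one comes
    from the mother's genotype and the other from the father's."""
    a, b = child[0], child[1]
    consistent = (a in mother and b in father) or (b in mother and a in father)
    return not consistent

def find_discordances(calls: Dict[str, Tuple[str, str, str]]) -> List[str]:
    """Return the IDs of loci whose (child, mother, father) calls are discordant."""
    flagged = []
    for snp_id, (child, mother, father) in calls.items():
        # Skip loci where any of the three results is a no-call.
        if NO_CALL in (child, mother, father):
            continue
        if is_discordant(child, mother, father):
            flagged.append(snp_id)
    return flagged

# Hypothetical data: SNP ID -> (child, mother, father)
calls = {
    "snp1": ("AT", "AT", "TT"),  # consistent
    "snp2": ("AA", "AA", "TT"),  # discordant: the father shows no A
    "snp3": ("CT", "--", "CC"),  # skipped: mother is a no-call
    "snp4": ("GG", "AG", "AG"),  # consistent
}
print(find_discordances(calls))  # ['snp2']
```

The check cannot tell which of the three causes is responsible; it only identifies loci that deserve a closer look.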
A child may show a longer match with a cousin than the parent does when part of the child’s match is false, arising from alleles that only coincidentally match the cousin.
Phasing a child is straightforward when raw data files are available for the child and both parents. Runs of homozygosity (ROHs) and matching blocks of DNA between relatives can be used to partially phase DNA. One parent and several siblings can also be phased, as can siblings alone, although phasing works better with data from one or both parents. Results may be enhanced if one sibling has already been phased.
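When only one parent has tested, some loci can still be phased. The following is a rough sketch of that partial case, not a description of how David's utilities work; the function name `phase_with_one_parent` and the genotype strings are assumptions.

```python
from typing import Optional, Tuple

def phase_with_one_parent(child: str, parent: str) -> Optional[Tuple[str, str]]:
    """Return (allele_from_tested_parent, allele_from_other_parent).

    Returns None when the locus is ambiguous or when the child shares
    no allele with the tested parent.
    """
    a, b = child[0], child[1]
    candidates = set()
    # Either of the child's alleles may have come from the tested parent.
    if a in parent:
        candidates.add((a, b))
    if b in parent:
        candidates.add((b, a))
    return candidates.pop() if len(candidates) == 1 else None

print(phase_with_one_parent("AT", "TT"))  # ('T', 'A')
print(phase_with_one_parent("AA", "AT"))  # ('A', 'A')
print(phase_with_one_parent("AT", "AT"))  # None: ambiguous
```

Loci where the child and the tested parent are heterozygous for the same two bases stay unresolved in this sketch; these are the positions where data from the other parent, from siblings, or from matching relatives would be needed.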
Copyright © 2011 by Stephen J. Danko