Working with raw DNA

Screenshot of a raw DNA spreadsheet

For those of you unfamiliar with raw DNA data, here is a screenshot of the spreadsheet I got when I downloaded my raw DNA file and opened it in Microsoft Excel. The testing company explains: "Each line corresponds to one single nucleotide polymorphism (SNP) ... A SNP is a single site in the genome that is known to vary across individuals ... The possible observations are A for adenine, C for cytosine, G for guanine, T for thymine, I for insertion and D for deletion, or 0 for missing data. Column one provides the identifier (including the #rsID where possible). Columns two and three contain the chromosome and basepair position of the variant using human reference 37.1 coordinates. Columns four and five contain the two alleles observed at this variant (genotype). The specific letters present are called alleles and the pair of alleles observed at a variant is called the genotype."
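
If you would rather read the file with a short script than in a spreadsheet, the five-column layout described above could be handled along these lines. This is only a minimal Python sketch under some assumptions of mine: that the file is tab-delimited text, that comment lines start with "#", and that there may be a header row. The names SnpRecord and parse_raw_dna are made up for illustration, and the sample rows are invented, not taken from my file.

```python
import csv
import io
from typing import NamedTuple

# Allele codes listed in the description: A, C, G, T, I, D, and 0 for missing.
VALID_CALLS = set("ACGTID0")


class SnpRecord(NamedTuple):
    """One row of the file (hypothetical helper type)."""
    identifier: str   # column one: identifier, an rsID where possible
    chromosome: str   # column two: kept as text, since it may be "X" or "Y"
    position: int     # column three: basepair position
    allele1: str      # column four
    allele2: str      # column five

    @property
    def genotype(self) -> str:
        # The genotype is the pair of alleles observed at the variant.
        return self.allele1 + self.allele2


def parse_raw_dna(text: str, delimiter: str = "\t") -> list[SnpRecord]:
    """Parse raw DNA text into records. The tab delimiter is an assumption."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row or row[0].startswith("#"):
            continue  # blank line or comment line
        if len(row) != 5 or not row[2].isdigit():
            continue  # header row or a row that is not in the five-column shape
        for allele in (row[3], row[4]):
            if allele not in VALID_CALLS:
                raise ValueError(f"unexpected allele {allele!r} in row {row}")
        records.append(SnpRecord(row[0], row[1], int(row[2]), row[3], row[4]))
    return records


# Invented rows, for illustration only.
SAMPLE = "\n".join([
    "# made-up rows for illustration only",
    "rsid\tchromosome\tposition\tallele1\tallele2",
    "rs1000001\t1\t12345\tA\tG",
    "rs1000002\tX\t67890\t0\t0",
    "rs1000003\t2\t222\tI\tD",
])

records = parse_raw_dna(SAMPLE)
for record in records:
    print(record.identifier, record.chromosome, record.position, record.genotype)
```

If your file turns out to be comma-separated, passing delimiter="," to parse_raw_dna should be the only change needed.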

Courtney Barr