Erin Pleasance, Wellcome Trust Sanger Institute - “Whole Cancer Genome Sequencing and Identification of Somatic Mutations”
Goals of cancer genome sequencing: WGSS read sequencing. Detection of substitutions, indels, rearrangements, CNV. Detection in both coding and non-coding regions of the genome. Catalog of somatic mutations, functional impact and mutational patterns. Drivers vs. passengers.
Talk tonight about one cancer and one matched normal genome.
NCI-H209 small cell lung cancer cell line.
Cancer cell line and non-cancer cell line derived from the same individual.
Prior sequencing by PCR and capillary.
Somatic mutations: 6/Mb, or 18,000 in genome.
Other data also available: Affy SNP6 and expression arrays.
Show karyotype. Kinda funky, but mostly sane. (-:
Used AB SOLiD machine... strategy is pretty obvious: sequence cancer and matched normal. All PET, aligned with MAQ, with Corona for substitutions.
How much sequence do you need to do? Turns out, you need equal amounts of both – and it's about 30X coverage. There is a GC effect on coverage.
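To get a feel for what 30X means in raw reads, here is a minimal back-of-the-envelope sketch. The 3 Gb genome size and the 25 bp read length are my assumptions, not numbers from the talk, and it pretends every read maps uniquely (which it won't, and the GC effect makes real coverage uneven).

```python
import math

# Assumed human genome size in bp (not stated in the talk).
GENOME_SIZE_BP = 3_000_000_000

def reads_needed(target_coverage, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Reads required so that reads * read_length / genome_size >= target."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)

def mean_coverage(n_reads, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Average depth if all reads map and are spread evenly."""
    return n_reads * read_length_bp / genome_size_bp

# Hypothetical 25 bp reads; the same amount is needed for tumour and normal.
n = reads_needed(30, 25)
print(n)                     # 3600000000
print(mean_coverage(n, 25))  # 30.0
```

Since the talk says you need equal amounts for both samples, the total sequencing is twice this figure.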
Compare with dbSNP. About 80% are there.
Look at tumour only, with simple filtered reads: about 50% are not in dbSNP. Many are probably germline variants. Mutations vs SNP rate: need to call SNPs and mutations using the control at the same time to get best results. As well, if chromosomes are present at greater than diploid copy number, you need to account for that too.
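A rough sketch of why the matched normal matters: a tumour variant is only called somatic if the normal is covered well enough at that position to show it isn't there. This is my own toy illustration, not the caller used in the talk; the function name, the variant tuples and the depth threshold of 10 are all hypothetical.

```python
def classify_tumour_variants(tumour_calls, normal_calls, normal_depth,
                             min_normal_depth=10):
    """Split tumour variant calls into somatic, germline and uncertain.

    tumour_calls, normal_calls: sets of (chrom, pos, ref, alt) tuples.
    normal_depth: dict mapping (chrom, pos) to read depth in the normal.
    min_normal_depth: hypothetical cutoff for trusting absence in the normal.
    """
    result = {"somatic": set(), "germline": set(), "uncertain": set()}
    for variant in tumour_calls:
        chrom, pos, _ref, _alt = variant
        if variant in normal_calls:
            # Seen in the matched normal, so inherited rather than somatic.
            result["germline"].add(variant)
        elif normal_depth.get((chrom, pos), 0) >= min_normal_depth:
            # Normal is well covered here and does not carry the variant.
            result["somatic"].add(variant)
        else:
            # Too little normal coverage to rule out a germline variant.
            result["uncertain"].add(variant)
    return result

# Toy data, made up for illustration.
tumour = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr3", 300, "C", "A")}
normal = {("chr1", 100, "A", "T")}
depth = {("chr1", 100): 32, ("chr2", 200): 28, ("chr3", 300): 3}

calls = classify_tumour_variants(tumour, normal, depth)
print(sorted(calls["somatic"]))  # [('chr2', 200, 'G', 'C')]
```

The "uncertain" bucket is the reason coverage has to be matched between the two samples: with a thin normal, germline variants leak into the somatic list.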
Also: CNV changes and ploidy, normal cell contamination, base qualities, and it's important to do indel detection first.
CNV: easy to obtain, and cleaner than array data.
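For the curious, the usual way to get copy number out of sequence data is a read-depth ratio between tumour and normal in fixed bins. A minimal sketch, assuming per-bin read counts are already available; the median normalisation is my choice for illustration and the talk did not say how theirs was done.

```python
import math
from statistics import median

def log2_copy_ratios(tumour_counts, normal_counts):
    """Per-bin log2(tumour/normal) read-depth ratio, centred on the median bin.

    Bins with zero reads in either sample are returned as None.
    """
    raw = [t / n if n > 0 and t > 0 else None
           for t, n in zip(tumour_counts, normal_counts)]
    usable = [r for r in raw if r is not None]
    if not usable:
        return raw
    centre = median(usable)  # assume the typical bin is copy-neutral
    return [None if r is None else math.log2(r / centre) for r in raw]

# Made-up counts: third bin looks gained, fifth bin looks lost.
tumour_bins = [100, 100, 200, 100, 50]
normal_bins = [100, 100, 100, 100, 100]
print(log2_copy_ratios(tumour_bins, normal_bins))  # [0.0, 0.0, 1.0, 0.0, -1.0]
```

Dividing by the matched normal bin by bin is also what cancels much of the position-specific bias that both samples share.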
Structural variants from paired reads, done genome wide. 50 of the variants interrupt genes (of 125 found in tumour only).
Rearrangements: can also look at that. (Saw many rearrangement events).
Structural variants at basepair resolution. (Using Velvet... good job Daniel).
Last thing of interest: small indels (less than 10 bp). Paired-end reads, anchored with one end.
Medium indels can be found by identifying deviation in insert size (Heather Peckham). You can see a shift in size... not an actual significant change. [interesting method] Can be seen in comparison between normal and tumour.
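The insert-size idea can be sketched like this: take the pairs spanning a locus in each sample and ask whether the mean mapped distance has shifted. The Welch-style statistic and the numbers below are my own assumptions for illustration, not the method presented. A deletion in the tumour makes pairs map further apart on the reference (positive shift); an insertion makes them map closer together (negative shift).

```python
import math
from statistics import mean, variance

def insert_size_shift(tumour_sizes, normal_sizes):
    """Return (shift, z) for insert sizes of pairs spanning one locus.

    shift: mean tumour insert size minus mean normal insert size, in bp.
    z: shift divided by its standard error (Welch-style statistic).
    Each list needs at least two observations.
    """
    shift = mean(tumour_sizes) - mean(normal_sizes)
    se = math.sqrt(variance(tumour_sizes) / len(tumour_sizes)
                   + variance(normal_sizes) / len(normal_sizes))
    if se == 0:
        # No spread at all: any non-zero shift is unbounded in z.
        z = 0.0 if shift == 0 else math.copysign(math.inf, shift)
    else:
        z = shift / se
    return shift, z

# Hypothetical insert sizes (bp) for pairs spanning the same locus.
normal_sizes = [1500, 1510, 1490, 1505, 1495]
tumour_sizes = [1520, 1530, 1510, 1525, 1515]

shift, z = insert_size_shift(tumour_sizes, normal_sizes)
print(shift, round(z, 2))  # 20 4.0
```

No single pair is an outlier here; the 20 bp shift only shows up because many pairs agree, which is why the comparison against the normal at the same locus is what makes it visible.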
To summarize: somatic variants throughout the genome. Circos plots (=
Somatic mutations, functional impact? Recurrence? Pathways?
Labels: AGBT 2009