AGBT 2010 - Elliot Margulies - NHGRI/NIH
Sequencing and analysis of matched tumor and normal genomes from a melanoma patient
Experimental Design:
* melanoma tumor sample - sequence it
* matched normal blood sample - sequence it
* seems simple, but takes new tools.
* unique advertisement strategies. (-;
Saved 10 runs of Images alone - more than 100 Tb of storage
Compare Illumina 1.6 v 1.4
* Uniquely aligning read and next_phred
* Didn't explain the results of the graphs shown... missed the point.
Used Eland, partition into bins
* realign with xmatch. (well characterized and scales well.)
In the end, 2 whole genome datasets
* 2 x 100 bp read
* 33 tumour and 24 normal (lanes)
* total runs (5 and 3)
* total alignable reads: 1 billion / 1.2 billion
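To get a feel for what those read counts mean, here is a back-of-the-envelope depth calculation. It is only a sketch: it assumes the counts are individual 100 bp reads (not pairs) and a genome size of roughly 3.1 Gb, neither of which was stated in the talk. The function name is mine.

```python
def expected_depth(n_reads, read_length, genome_size):
    """Average depth = total aligned bases / genome size."""
    return n_reads * read_length / genome_size

# Assumptions (not from the talk): ~3.1 Gb genome, counts are single reads.
GENOME_SIZE = 3_100_000_000
READ_LENGTH = 100

depth_1b = expected_depth(1_000_000_000, READ_LENGTH, GENOME_SIZE)
depth_1_2b = expected_depth(1_200_000_000, READ_LENGTH, GENOME_SIZE)
print(f"{depth_1b:.1f}x and {depth_1_2b:.1f}x")
```

Under those assumptions both genomes land somewhere in the 30x-40x range; if the counts were read pairs, the numbers would double.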
Coverage statistics:
* Greater than 99% covered 1x
* 94-95% covered in the 5x-10x range used for variant calling
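For reference, "percent covered at Nx" is just the fraction of positions with depth at or above a threshold. A minimal sketch, with made-up toy depths (not data from the talk) and a function name of my own:

```python
def breadth_of_coverage(depths, min_depth):
    """Fraction of positions whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must be non-empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)

# Toy per-base depths, invented for illustration only.
toy_depths = [0, 3, 7, 12, 25, 31, 40, 8, 5, 1]

summary = {t: breadth_of_coverage(toy_depths, t) for t in (1, 5, 10)}
for threshold, fraction in summary.items():
    print(f">= {threshold}x: {fraction:.0%}")
```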
Method for variant detection
* Most Probable Genotype
* Bayesian statistical approach, with a prior probability of observing a non-ref allele (expected mutation rate)
* Equation given - not going to copy that for html.
* Confidence is the difference between the best call and the next most probable call.
[This looks VERY much like SNVMix2...]
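Since I didn't copy the equation, here is a rough sketch of what a caller of this general shape could look like. This is NOT the formula from the slide. Everything specific is my assumption: a flat per-base error rate, ten unordered diploid genotypes, a prior that puts `1 - prior_nonref` on homozygous reference and splits the rest evenly, and scores in log10 units.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"

def genotype_log_posteriors(pileup, ref, error=0.01, prior_nonref=0.001):
    """Unnormalised log10 posterior for each diploid genotype.

    Assumed model (not from the talk): each read base comes from one of
    the two alleles with probability 1/2 and is miscalled with
    probability `error`, spread evenly over the other three bases.
    """
    genotypes = list(combinations_with_replacement(BASES, 2))
    hom_ref = (ref, ref)
    log_post = {}
    for genotype in genotypes:
        if genotype == hom_ref:
            prior = 1 - prior_nonref
        else:
            prior = prior_nonref / (len(genotypes) - 1)
        total = math.log10(prior)
        for base in pileup:
            p = sum((1 - error) if base == allele else error / 3
                    for allele in genotype) / 2
            total += math.log10(p)
        log_post[genotype] = total
    return log_post

def most_probable_genotype(pileup, ref, **model):
    """Return (best genotype, score), where score is best minus runner-up."""
    ranked = sorted(genotype_log_posteriors(pileup, ref, **model).items(),
                    key=lambda item: item[1], reverse=True)
    (best, best_lp), (_, second_lp) = ranked[0], ranked[1]
    return best, best_lp - second_lp

het_call, het_score = most_probable_genotype("AAAAAGGGGG", ref="A")
ref_call, ref_score = most_probable_genotype("AAAAAAAAAA", ref="A")
print(het_call, round(het_score, 2))
print(ref_call, round(ref_score, 2))
```

The point is only the scoring idea: the confidence is the gap between the top genotype and the runner-up, so a site where two genotypes are nearly tied scores low even if both are non-reference.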
Graph of concordance versus percentage called. With a cutoff of 10, you get 95% in the normal genome and 90% in the tumor.
Moved from MPG to Most Probable Variant (MPV)
* Compares the best call against the probability of the reference call.
* improves the quality of the call.
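My reading of the difference between the two scores, as a small sketch. The genotype log-probabilities below are invented, and the function names are mine; the talk only described the comparison in words.

```python
def mpg_score(log_post):
    """Gap between the best and second-best genotype (MPG-style score)."""
    ranked = sorted(log_post.values(), reverse=True)
    return ranked[0] - ranked[1]

def mpv_score(log_post, ref):
    """Gap between the best genotype and homozygous reference (MPV-style score).

    Returns (best genotype, score); the score is 0 when the best call is
    itself homozygous reference.
    """
    best = max(log_post, key=log_post.get)
    return best, log_post[best] - log_post[(ref, ref)]

# Invented example: clearly not reference, but het vs. hom-alt is ambiguous.
example = {("A", "A"): -30.0, ("A", "G"): -8.0, ("G", "G"): -9.5}

best_call, variant_score = mpv_score(example, ref="A")
genotype_score = mpg_score(example)
print(best_call, genotype_score, variant_score)
```

In this toy case the genotype score is low (1.5) because heterozygous and homozygous-variant are close, while the variant score is high (22.0) because both are far from reference. That would explain why switching scores rescues real variants that a genotype cutoff of 10 would have dropped.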
Settings:
* Using MPV greater than 10 (4 million variants)
* Subtract out evidence for germ line or low coverage
** take out high confidence germline variants
** subtract sites where the MPG is less than 10 but the site still looks like a variant.
** throw out low confidence somatic variants.
* leaves 189,000 somatic variants (tumour variants)
* also filtering against dbSNP
* break into coding/non-coding
* synonymous/non-synonymous
* verify SNVs by Sanger sequencing. (75/84 verify, about 89%.) It may be that some of the rest are real but not detectable by Sanger.
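The subtraction steps above can be sketched as a simple per-site filter. This is my interpretation of the slide, not the speaker's pipeline: the `Site` record, its field names, and the exact way the cutoff of 10 is applied to the normal sample are all assumptions, and the toy sites are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Site:
    """Hypothetical per-site record; field names are illustrative."""
    chrom: str
    pos: int
    tumor_mpv: float   # variant score in the tumor
    normal_mpv: float  # variant score in the matched normal
    normal_mpg: float  # genotype score in the matched normal
    in_dbsnp: bool = False

def is_somatic_candidate(site, cutoff=10.0):
    """Apply the subtraction steps in order; True if the site survives."""
    if site.tumor_mpv <= cutoff:
        return False  # low confidence in the tumor
    if site.normal_mpv > cutoff:
        return False  # high confidence germline variant
    if site.normal_mpg < cutoff:
        return False  # normal call too weak to rule out a germline variant
    if site.in_dbsnp:
        return False  # known polymorphism
    return True

toy_sites = [
    Site("chr1", 100, tumor_mpv=35, normal_mpv=0, normal_mpg=40),
    Site("chr1", 200, tumor_mpv=40, normal_mpv=38, normal_mpg=30),
    Site("chr2", 300, tumor_mpv=25, normal_mpv=2, normal_mpg=4),
    Site("chr2", 400, tumor_mpv=6, normal_mpv=0, normal_mpg=50),
    Site("chr3", 500, tumor_mpv=30, normal_mpv=0, normal_mpg=45, in_dbsnp=True),
]

somatic = [s for s in toy_sites if is_somatic_candidate(s)]
print([(s.chrom, s.pos) for s in somatic])
```

Only the first toy site survives: confident variant in the tumor, confident non-variant in the normal, and not in dbSNP.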
Summary table of SNV pipeline.
* 174,000 non coding variants.
Paper: Local DNA Topography correlates with functional noncoding regions of the human genome.
Impact of SNPs on Local DNA Structure - sometimes this can change the structure a lot.
Use "Chai" to do structure informed evolutionary information
* only about 10,000 overlap "chai" regions
* 2,176 appear to dramatically change DNA shape.
"Chai" spots are "mutation cold spots"
Future plans, look at more tumor normal pairs, and investigate it further.