Saturday, February 7, 2009

Peter Park, Harvard Medical School - “Statistical Issues in ChIP-Seq and its Application to Dosage Compensation in Drosophila”

(brief overview of ChIP-Seq, epigenomics again)

ChIP-Seq is not always cost-competitive yet (can't do it at the same cost as ChIP-chip).

Issues in analysis: generate tags, align, remove anomalous tags, assemble, subtract background, determine binding position, check sequencing depth.

Map tags in a strand-specific manner (like the directional flag in FindPeaks). Score tags accounting for that profile. This can be incorporated into the peak caller.

They do something called cross-correlation analysis (look at peaks in both directions) and use it to rescue more tags. Peaks get better if you add good data, and worse if you add bad data. They also use it to learn something about histone modification marks. (Tolstorukov et al., Genome Research.)
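To make the cross-correlation idea concrete, here is a minimal Python sketch. It is not taken from the talk or from any particular peak caller. It assumes tags are reduced to their 5' start positions on a single chromosome, and the function names (`strand_cross_correlation`, `best_shift`) and the toy data are made up for illustration. The shift at which plus- and minus-strand tag counts correlate best approximates the distance between the two strand-specific peaks that flank a binding site.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def strand_cross_correlation(plus_starts, minus_starts, length, max_shift):
    """Correlate plus-strand counts at x with minus-strand counts at x + shift.

    Returns a dict mapping each shift in 0..max_shift to its correlation.
    """
    plus = [0] * length
    minus = [0] * length
    for pos in plus_starts:
        plus[pos] += 1
    for pos in minus_starts:
        minus[pos] += 1
    return {
        shift: pearson(plus[: length - shift], minus[shift:])
        for shift in range(max_shift + 1)
    }


def best_shift(profile):
    """Shift with the highest correlation in a cross-correlation profile."""
    return max(profile, key=profile.get)


if __name__ == "__main__":
    # Hypothetical toy data: plus tags pile up ~50 bp upstream of each site,
    # minus tags ~50 bp downstream, with a little jitter.
    rng = random.Random(1)
    sites = [1000, 3000, 5000, 7000]
    plus_tags = [int(s - 50 + rng.gauss(0, 5)) for s in sites for _ in range(200)]
    minus_tags = [int(s + 50 + rng.gauss(0, 5)) for s in sites for _ in range(200)]
    profile = strand_cross_correlation(plus_tags, minus_tags, 8000, 200)
    print("estimated strand shift:", best_shift(profile))
```

A data set with a sharp maximum in this profile has well-separated strand peaks; a flat profile suggests mostly background, which is one way adding bad data can make peaks worse.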

How deep to sequence? 10-12M reads is current. That's one lane on Illumina, but is it enough? What quality metric is important? Clearly this depends on the marks you're seeing (narrow vs. broad, noise, etc.). That brings you to saturation analysis. They show no saturation for STAT1, CTCF, NRSF. [Not a surprise, we knew that a year ago... We're already using this analysis method. However, as you add new reads, you add new sites, so you have to threshold to make sure you don't keep adding new peaks that are insignificant. Oh, he just said that. Ok, then.]

Talking about using “fold enrichment” to show saturation. This allows you to estimate how many tags you need to get a certain tag enrichment ratio.
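A rough Python sketch of what a saturation check with a fold-enrichment threshold could look like follows. This is a simplification for illustration, not the method from the talk or the paper: "peaks" are just fixed-size bins whose tag count is at least `min_fold` times the uniform expectation, and all names and parameters (`call_enriched_bins`, `saturation_curve`, `bin_size`, `min_fold`) are hypothetical. The idea is to subsample the tags at several depths and ask what fraction of the full-depth enriched bins is already recovered.

```python
import random
from collections import Counter


def call_enriched_bins(tags, genome_length, bin_size, min_fold):
    """Return the set of bins with at least min_fold times the expected tag count.

    The expectation assumes tags are spread uniformly over the genome, which is
    a crude stand-in for a real background model.
    """
    if not tags:
        return set()
    n_bins = -(-genome_length // bin_size)  # ceiling division
    expected = len(tags) / n_bins
    counts = Counter(pos // bin_size for pos in tags)
    return {b for b, c in counts.items() if c >= min_fold * expected}


def saturation_curve(tags, genome_length, bin_size, min_fold, fractions, seed=0):
    """For each sampling fraction, report the share of full-depth bins recovered."""
    rng = random.Random(seed)
    full = call_enriched_bins(tags, genome_length, bin_size, min_fold)
    curve = []
    for fraction in fractions:
        subsample = rng.sample(tags, int(len(tags) * fraction))
        called = call_enriched_bins(subsample, genome_length, bin_size, min_fold)
        recovered = len(called & full) / len(full) if full else 0.0
        curve.append((fraction, recovered))
    return curve


if __name__ == "__main__":
    # Hypothetical toy data: uniform background plus a few enriched sites.
    rng = random.Random(42)
    genome_length = 100_000
    background = [rng.randrange(genome_length) for _ in range(5000)]
    sites = [10_050, 40_050, 70_050]
    signal = [s + rng.randrange(-40, 40) for s in sites for _ in range(150)]
    for fraction, recovered in saturation_curve(
        background + signal, genome_length, 100, 5.0, [0.1, 0.25, 0.5, 1.0]
    ):
        print(f"{fraction:.2f} of tags -> {recovered:.2f} of enriched bins recovered")
```

Raising `min_fold` restricts the curve to stronger sites, which tend to saturate at lower depth; that is the sense in which a fold-enrichment cutoff lets you estimate how many tags a given enrichment ratio needs.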

See paper they published last year.

Next topic: Dosage compensation.

(Background on what dosage compensation is.)

In Drosophila, the X chromosome is up-regulated in XY, unlike in humans, where the second copy of the X is quashed in the XX genotype. Several models are available. There is some evidence that something specific and sequence-related is involved. It can't be found easily with ChIP-based methods: there is just too much information. With ChIP-Seq you get sharp enrichment, whereas with ChIP-chip you don't see it. This seems to be a saturation (dynamic range) issue on ChIP-chip, and the sharp enrichments are important. From them, you get specific motifs.

Deletion and mutation analysis. The motif is necessary and sufficient.

Some issues: the motif is enriched on X, but only by 2-fold. Why is X so much upregulated, then? It seems histone H3 signal is depleted over the entry sites on the X chromosome. There may also be other things going on that aren't known.

Refs: Alekseyenko et al., Cell, 2008 and Sural et al., Nat Struct Mol Biol, 2008

3 Comments:

Anonymous said...

Actually, ChIP-seq is cheaper than ChIP-chip. One lane on an Illumina box costs about $600, which is enough to detect ChIP-seq stuff. Assuming that a protein binds to 10,000 loci on a 3Gb genome and each locus is, say, 100 bp, then the total bp covered by the 10,000 loci would be 1M bp. One lane of the GA gives about 1Gb, so you would sequence the 10,000 loci at 1000x depth. Makes sense to multiplex samples for even lower cost! Other big benefit: ChIP-seq requires 1/10th or less of the DNA amount that ChIP-chip does.

February 8, 2009 5:48:00 PM PST  
Anonymous said...

Hi Anthony,

Nice blog! I was just looking for the Cell reference you mentioned, and you actually made a spelling mistake: the author name is Artyom A. Alekseyenko.

Best,

N.

February 9, 2009 2:45:00 AM PST  
Anthony said...

I tend to agree with the first comment: there are lots of reasons why it's worth doing ChIP-Seq over ChIP-chip. I think this is an example of doing the numbers without paying attention to the results. You might get more ChIP-chip results done for the same number of dollars, but I guarantee your results won't be nearly as impressive. On the other hand, I think the numbers depend heavily on who's prepping and running the samples.

To the second poster, thanks - I couldn't read what was on the slide very clearly, so thank you for the correction. If you catch other errors, please let me know.

February 9, 2009 11:31:00 AM PST  
