Thanks for visiting my blog - I have now moved to a new location at Nature Networks. Url: http://blogs.nature.com/fejes - Please come visit my blog there.

Wednesday, December 16, 2009

one lane is not enough....

Without giving too much away about the stuff I'm working on - trying to study anything of interest in one lane of RNA-Seq data is futile. Do not try this at home, kids.

Now, the question is, how valid is it to compare 36bp reads to 72bp reads? Ah, the joys of research.


4 Comments:

Blogger graveley said...

Agreed...You can figure out if actin and tubulin are expressed, but I guess that doesn't really fall in the category of "interesting". We shoot for 20 million 76 bp reads or 40 million 36 bp reads per sample at a minimum.

But amazingly, people publish papers with only a single lane of data and draw conclusions from it....
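As a rough back-of-the-envelope check, the two per-sample targets above can be compared by total sequenced bases. This is only a sketch: the function name is made up for illustration, and total bases is just one way to compare the two, since read count and read length each matter in their own right.

```python
def total_bases(n_reads: int, read_length_bp: int) -> int:
    """Total sequenced bases for a given read count and read length."""
    return n_reads * read_length_bp


# Per-sample minimums quoted in the comment above.
targets = {
    "20M x 76 bp": total_bases(20_000_000, 76),
    "40M x 36 bp": total_bases(40_000_000, 36),
}

for label, bases in targets.items():
    # Report in gigabases for readability.
    print(f"{label}: {bases / 1e9:.2f} Gbp")
```

Both targets land at roughly 1.5 Gbp per sample, so the shorter reads are offset by doubling the read count.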

December 16, 2009 3:43:00 PM PST  
Blogger Nat said...

Without giving away what you are working on, can you elaborate on this point at all?

We recently came to the conclusion (based on some data collection and recent publications) that one lane should give us data that is way better than a microarray and approaching the quality of RT-PCR but not enough for good alternative splicing detection. So, are we completely wrong? We're about to spend a lot of money, so we need all the advice we can get if we are wrong in our assumptions!

December 17, 2009 12:16:00 AM PST  
Blogger sandmann said...

I guess you are all referring to the human genome. Looking at a smaller genome, e.g. that of the fruit fly, may require fewer reads, as there are fewer genes in the genome. Even more important, though, may be the question of how large the dynamic range of RNA expression is. Do actin and tubulin, for example, make up 25% of all transcripts, or more like 10%? Finally, the origin of the RNA may be worth considering. A homogeneous population of cells, e.g. from tissue culture, will most likely require fewer reads to characterize than, say, a tumor biopsy. You are raising a very interesting and important question, but I would presume that the answer will be very different in different experiments, and will vary much more than it does for genomic DNA.

December 17, 2009 12:49:00 AM PST  
Blogger graveley said...

How much data you need depends entirely on what information you want to get out of it - and the more important things are the number of reads, the length of the reads, and whether they are single-end or paired-end. If you want SNPs, one lane is not enough. If you want splicing information for most transcripts, one lane is not enough. If you simply want to determine expression levels, one lane may get you there, but you won't get accurate info on low-abundance transcripts. Also, with regard to human vs. fly, you actually need about the same amount of sequence for both, as the size of the transcriptome is about the same in each, as it is in most metazoans. Genome sequence is a whole different ball game.

December 20, 2009 4:06:00 PM PST  
