Eric Lander, Broad Institute of Harvard and MIT – The New World of Genome Sequencing
Dramatic increases in data production – we now expect exponential increases in productivity, and exponential decreases in cost. Next Gen sequencing is where all of this comes from – thanks to the major players in the field. 2Gb a day is where we're going now – but by the next meeting it will have changed well beyond that.
We should now consider sequencing a general-purpose tool, in the same way that we used to consider computers a tool for solving specific problems, but now consider them broadly applicable. We should now be using general-purpose sequencing devices.
Overview of talks: Epigenomics, variations in known genomes, mutations in cancer, transcriptional profiling, microbes, de novo sequencing.
Alex Meissner gave a good review of Epigenomics yesterday. Two major components: histones & DNA methylation. Yesterday's experiment was ChIP-on-chip; however, we now do ChIP-Seq. Compared to ChIP-chip, ChIP-Seq is easier and vastly cheaper... and reproducible. It's now taking over the whole field. Chromatin state maps will shortly be the standard, giving us a complete catalog of the epigenomics of all signals in all cell states.
Variations in known genomes: Our old challenge was to go beyond single gene diseases, but now we're in a position to take a comprehensive look at the genome. The number of SNPs has “skyrocketed” in the past decade, which allows us to now do 10^6 SNPs on a chip at one time (2007). As of the past year, there are now 200+ genes associated with diseases. (He no longer makes the slides that show them.)
“We have barely scratched the surface of genes”.
We do well with high frequency SNPs. In contrast, variants at 0.5-5% frequency are poorly studied. We need to do more sequencing to get at these patients. So far we're only seeing a few individuals being resequenced, but we'll start seeing much, much more than that in the future: 100's and 1000's of people being resequenced. (Woo.. Neanderthal and woolly mammoth shout out.)
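A quick back-of-envelope of my own (not from the talk) on why the 0.5-5% frequency range needs hundreds or thousands of resequenced people: the chance of seeing an allele at least once in a cohort depends on how many chromosomes you sample. This is only a rough sketch, assuming diploid individuals, independent sampling of chromosomes, and no sequencing errors; the function name `detection_probability` is just something I made up for illustration.

```python
def detection_probability(allele_freq: float, n_individuals: int) -> float:
    """Chance that at least one of the 2*n sampled chromosomes carries the allele.

    Assumes diploid individuals, independent chromosomes, and error-free
    sequencing (all simplifications, not claims from the talk).
    """
    n_chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


if __name__ == "__main__":
    # The two ends of the poorly studied frequency range, at a few cohort sizes.
    for freq in (0.005, 0.05):
        for n in (10, 100, 1000):
            p = detection_probability(freq, n)
            print(f"freq={freq:.3f}  n={n:5d}  P(seen at least once)={p:.3f}")
```

Seeing an allele once is of course much weaker than having enough carriers to associate it with a disease, so this is a lower bound on the cohort sizes needed, not an estimate of them.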
A neat example of this is the sequencing of people who interact differently with TB and TB drugs. Apparently there are only about 40 differences that seem to be involved. Another example is stickleback sequencing, where a single lane of Illumina is enough to genotype an individual fish.
Cancer genomes: The Cancer Genome Atlas project was formed, and much handwringing has followed about how much we do know, and how much we don't know. As sequencing begins to shed light on it, however, very clear signals are appearing that show what is involved. New genes, clear breakpoints... and all of this is leading to pathways in cancer. Cool. We now phrase our work in terms of which pathways are hit in which cancer, not single genes being mutated. “Dissecting cancer will require sequencing of 1000's of individuals.” However, we still need to worry about error rates, which haven't yet come down to the level that the sensitivity of these tests requires.
WTSS: microRNA, ab initio construction of transcriptomes.... not much said here.
Microbes: we're now sequencing microbiomes for use in energy harvest... again just a quick acknowledgement of the field.
De Novo Genome Assembly: we're still working on it.
This was a quick “whirlwind” tour of what's going on in the field. In this new world of sequencing, will we find completely new phenomena?
Long intergenic non-coding RNAs (lincRNAs) - the paper just came out this weekend. Extensive transcription in mammals. We now have a better idea of what's being transcribed using the new technology. “Are most non-coding RNA transcripts functional?” (Reviewing various perspectives on it.) Apparently, only about a dozen functional large ncRNAs are known. They are now using epigenomics to figure out what's going on: use the ones that have the expressed gene marks... there are now 1600 novel signatures that were not known as protein-coding genes. (Characterizing intergenic K4-K36 domains.) So what do these things do biologically? They catalog expression patterns, and can associate them with pathway profiles... etc. Profiling and correlation are the key to solving this mystery, and they all clearly suggest a biological role in the cell.
E.g., some of them are clearly regulated by p53. They seem to be potential repressors of other genes, for when p53 wants to down-regulate some genes, or up-regulate with others. How? Possibly through the Polycomb repressor complex? They're anti-transcription factors! Nifty.
“> 50% of lincRNAs expressed in various cell types bind Polycomb or other factors.” “Suggests whole world of gene regulators!”
My Comments: Wow, that was a pretty decent opening talk. The overview was well done, focusing on the challenges, but without dwelling on the problems. The final part of the talk was focussed on the recent paper on lincRNA, which sounds really interesting. I'm quite interested in following up on that paper. Good timing on having it out before AGBT. (-;
The first question is about Eric Lander's selection to be part of the Obama "team" on science. Spiffy! (Quick pump for the hope that Obama stays in power for 8 years... hehe.) Apparently, his first question with the science group is "what has happened since the sequencing of the human genome project, and is progress going as fast as expected"? (paraphrased, of course.)
Labels: AGBT 2009