I've spent the last week madly putting together a poster for the "Reasons for Hope 2008" conference this past weekend, which focuses on breast cancer science, treatment and quality-of-life research. So you'll (shortly) notice a new poster in my poster section. It was an educational experience, and I must admit I learned a lot: not so much in the areas I need to learn for my own research, but about physiology, psychology and general health research. And that's even considering how few talks I went to!
Still, I highly recommend dropping into talks that aren't in your field on occasion. I try to make a habit of it: last year that included a pathology lecture just before Christmas, and this time I learned a lot about mammography and the new mammography techniques that are up and coming. Neither is really a practical skill for a bioinformatician, but both give me a good idea of where the samples I'll be dealing with come from. Nifty.
Anyhow, I had a few minutes to revisit my ChIP-Seq code, FindPeaks, and do a few things I'd been hoping to do for a while. I got around to reducing the memory requirement, going from about 4 GB of RAM for a 12M+ read run down to under 1 GB. (I'd discussed this before in another posting.) The other thing I did was to rewrite the core peak-finding algorithm. I'd known for a while that it wasn't optimal, but re-implementing a core routine isn't something you do without a lot of thought. The good news: it runs about 2x as fast, scales better on multiple cores and is guaranteed not to produce the type of bugs that have been relatively common in early versions of FindPeaks.
Having invested the 2 hours to do it, I'm very glad to see it provide some return. Since my next project is to clean up the Transcripter code (for whole transcriptome shotgun sequencing), this was a nice lesson in coding: if you find a problem, don't patch it; solve it. I think I have a lot of "solving" to do. (-;
For those of you who are interested, the next version of FindPeaks will be released once I can include support for SRF files, hopefully by the end of the week.
Labels: breast cancer, Chip-Seq, conferences, transcriptome