
Monday, July 27, 2009

How recently was your sample sequenced?

One more blog for the day. I was postponing writing this one because it's been driving me nuts, and I thought I might be able to work around it... but clearly I can't.

With all the work I've put into the controls and comparisons in FindPeaks, I thought I was finally clear of the bugs and pains of working on the software itself - and I think I am. Unfortunately, what I didn't count on was that the data sets themselves might not be amenable to this analysis.

My control finally came off the sequencer a couple of weeks ago, and I've been working with it for various analyses (SNPs and the like - it's a WTSS data set)... and I finally plugged it into my FindPeaks/FindFeatures pipeline. Unfortunately, while the analysis is sound, the sample itself is looking pretty bad. In looking at the data sets, the only thing I can figure is that a year and a half of sequencing chemistry changes has made a big impact on the number of aligning reads and the quality of the reads obtained. I no longer get a linear correlation between the two libraries - it looks partly sigmoidal.
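For anyone wanting to run the same sanity check, here's a minimal sketch of the comparison I'm describing. The file names and the per-gene count format are hypothetical, and this isn't the FindPeaks code itself - just the general idea:

```python
# Sanity check: do per-gene read counts from two libraries still correlate
# linearly? File names and format here are hypothetical (gene <tab> count).
import numpy as np
from scipy.stats import pearsonr, spearmanr

def load_counts(path):
    counts = {}
    with open(path) as handle:
        for line in handle:
            gene, count = line.rstrip("\n").split("\t")
            counts[gene] = float(count)
    return counts

sample = load_counts("sample_counts.txt")
control = load_counts("control_counts.txt")

# Compare only genes seen in both libraries, log-transformed so a handful
# of highly expressed genes doesn't dominate the statistic.
shared = sorted(set(sample) & set(control))
x = np.log10(np.array([sample[g] for g in shared]) + 1.0)
y = np.log10(np.array([control[g] for g in shared]) + 1.0)

r, _ = pearsonr(x, y)     # sensitive to the shape of the relationship
rho, _ = spearmanr(x, y)  # rank-based, so robust to monotonic distortion
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
# A Spearman rho noticeably higher than Pearson r suggests the relationship
# is monotonic but not linear - e.g. the sigmoidal pattern described above.
```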

Unfortunately, there's nothing to do except re-sequence the sample. But really, I guess that makes sense: if you're doing a comparison between two data sets, you need them to have as few differences as possible.

I just never realized that the time between samples also needed to be controlled for. Now I have a new question to ask when I review papers: how much time elapsed between the sequencing of your sample and its control?


Thursday, April 9, 2009

BC Genome Forum 2009

I had a lot of stuff to blog about, but just haven't had the time to write any of it down. I haven't forgotten about any of it, but it's just not going to happen before this weekend. I'm currently bogged down in debugging something that I REALLY want to get working (and it mostly is, but there's still something slightly fishy going on...), and there's just too much happening outside of work to get it done otherwise.

Still, I figured I should mention a few things of interest before I forget to discuss them at all.

I attended some of the BC Genome Forum lectures on Friday. I skipped the morning, since the talks seemed mainly irrelevant to anything I do - which was later confirmed - but caught the session on personal medicine. For the most part, it focused on the ethics of personal medicine. I was considering blogging those talks, but individually they just weren't interesting enough.

For the most part, the speakers were caught in a pre-2006 time warp. Everything was about microarrays. One of the speakers even said something to the effect of "maybe one day we'll be able to sequence the whole human genome for patients, but we're nowhere near that yet." Needless to say, my colleagues and I all exchanged startled glances.

Still, there were a few things of interest. There was a lot of discussion about which conditions found in genomic screens you should feel obligated to discuss with the DNA donor. They gave the example of one volunteer who tested positive for a condition that could be life-threatening if they were to undergo surgery for any reason. It's easily treated and easily managed - if you're aware of it. Since the donor was in the healthy control group, they were clearly unaware that they had the condition. In this case - where the donor was clearly at risk, the penetrance of the gene is 100%, and the patient could clearly do something about the problem - it was "a no-brainer" that the donor should be notified.

However, for most of the information people are pulling from arrays, it's not always clear whether the ethics tilt so heavily towards breaking confidentiality and reporting information to the patient. Several of the speakers touched on how this type of situation should be managed. The best solution we heard during the forum came from one group that had set up an advisory board, which sits down every six to twelve months to determine which - if any - conditions should be returned to the donors.

Unfortunately, no one described the criteria used to make that decision, but the concept is pretty solid.

The surprising thing for me was that after several years of using mechanisms like this, only 12-20 conditions were being returned. In the world of genomics, that's a VERY small number, but it probably just reflects the fact that they're using arrays to do their genome screens.

And that is one of the reasons why it felt like 2006 all over again. All the mechanisms they've put in place are fine when you're talking about a couple of new conditions being screened each year. Within two years we'll be routinely doing whole genome sequencing with Pacific Biosciences SMRT (or equivalent) systems, and whole genome association studies will become vastly more plentiful and powerful. Thus, when your independent board gets 1200 candidate diagnostic genes with actionable outcomes per year, that mechanism won't fly.

What's really needed (in my humble opinion) is for a national board to be created in each country to determine what gene information should be disseminated as useful and actionable - possibly as part of the FDA in the States. That would also be very useful for reining in companies like 23andMe and the like... but that's another story altogether.

Moving along, there were a few other interesting things at the event. My personal favorite was from the Smit lab in the Microbiology & Immunology department at UBC, presented by Dr. John Nomelini, whom I know from my days in the same department. They have a pretty cool system, based on the Caulobacter bacterial system, in which they can pull down antibodies (much like streptavidin beads) using a much cheaper and easier approach. While I don't know the legal issues around the university's licensing of the technology, Dr. Nomelini is trying to find people interested in using it for ChIP experiments. I've proposed the idea to a few people here to test it out on ChIP-Seq, which would help bring the cost down by a few hundred dollars. We'll see if it gets off the ground.

So, if you've made it this far, hopefully you've gleaned something useful from this rambling post. I have some coding to do before my parents arrive for the Easter weekend. Time to get back to debugging...


Thursday, March 12, 2009

Personal Medicine... is it worthwhile?

After the symposium yesterday, and several more insightful comments, I thought I should write up a couple of quick points.

One of the main issues is penetrance: how often the disease occurs when you have a given genomic profile. For some diseases, like Huntington's disease, having the particular mutation translates directly into a certainty that you will develop the disease. There really isn't much of a chance that you'll somehow avoid it. For other diseases, a gene may change your likelihood of developing the disease slightly, or in an almost unnoticeable way. In fact, sometimes you may have offsetting changes that negate what would be a risk factor in another person. Genomes are wild and complex data structures, and are definitely not digital in the sense that seeing a particular variation will always give you a certain result.
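To put toy numbers on that, here's a quick sketch - every figure below is invented, not real disease statistics:

```python
# Toy penetrance arithmetic - every number here is invented.
# Penetrance is just P(disease | genotype).

# Fully penetrant variant (Huntington's-like): carrying it means you
# will develop the disease.
p_full = 1.00

# Low-penetrance risk variant: it only nudges a small baseline risk.
baseline_risk = 0.02   # population risk without the variant (invented)
relative_risk = 1.3    # modest bump conferred by the variant (invented)
risk_with_variant = baseline_risk * relative_risk

print(f"fully penetrant variant: {p_full:.0%} risk")
print(f"low-penetrance variant:  {baseline_risk:.1%} -> {risk_with_variant:.1%}")
# 2.0% -> 2.6%: a shift you'd barely notice for any one person, which is
# why reporting a single variation rarely tells you anything definitive.
```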

Mainly, that has to do with the biology of the cell. There are often redundant pathways to accomplish a given task, or several levels of regulation that can be called on to turn genes on or off. Off the top of my head, I can think of several levels of regulation (DNA methylation, histone post-translational modifications, enhancers, promoters, microRNAs, ubiquitination leading to increased degradation, splicing, mis-folding through chaperonin regulation, etc.) that can be used to fine-tune or throttle the systems in any given cell. At that rate, looking at a single variation seems like it might be an entirely useless venture.

And, in fact, that was the general consensus of the panelists last night: the companies that currently run a microarray on your DNA and then report back some slight changes in risk factors are really a waste of time - they don't begin to account for the complexity of what's actually going on.

However, my contention isn't that we should be doing personal medicine over the whole genome, but that as we move forward, personal medicine will have a large and growing impact on how healthcare is practiced. I've heard several people use warfarin as an example of this. Warfarin is an anticoagulant, used to prevent blood clots, and is quite effective in most people. However, each person has different dosage requirements - not because they need more to activate the pathway, but because we all degrade it at different rates, depending on which P450 enzymes we have to break it down.



[Figure: simulated distribution of required dosages across all patients, appearing as a single broad normal curve]

In the above graph, you can see that all patients appear to conform to a single "normal" distribution, but they're really made up of two subpopulations - one set of fast metabolizers and one set of slow metabolizers, as judged by their metabolism of other drugs. (Yes, I'm way oversimplifying how this works - this is not real warfarin data!) When you look at the spectrum of patients who come in, you see a continuum of dosages, but you'd never understand why.
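Here's a quick simulation of that effect - the dose numbers and group sizes are entirely made up for illustration:

```python
# Simulate a patient population whose required dose is really a mixture of
# two subpopulations - all numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)

# Fast metabolizers clear the drug quickly and need higher doses; slow
# metabolizers need lower ones. Each group on its own is roughly normal.
fast = rng.normal(loc=8.0, scale=1.5, size=700)   # hypothetical mg/day
slow = rng.normal(loc=5.0, scale=1.5, size=300)

doses = np.concatenate([fast, slow])

# Pooled together, the mixture still looks like one broad, unimodal
# distribution - nothing about it hints at two groups underneath.
print(f"pooled: mean = {doses.mean():.2f}, sd = {doses.std():.2f}")
print(f"fast:   mean = {fast.mean():.2f}")
print(f"slow:   mean = {slow.mean():.2f}")
```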

Instead, you could look for markers. In the case of drug metabolism, a single P450 enzyme may be responsible for the speed at which the drug is processed, so looking at the same group of patients stratified by that particular trait will give you a completely different graph:



[Figure: the same patients separated by P450 genotype into two distinct distributions - fast and slow metabolizers]

Which means you can start to figure out what initial dose will be required, and tweak from there.

(If you're wondering why the fast metabolizers and slow metabolizers of the same drug have some overlap in my example, it's just so I'd have an excuse to say there are probably other factors involved: environment, other things interfering with the metabolism, the rate at which the kidneys clear the drugs... and probably many other things I've never considered.)
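Continuing the same toy simulation, here's roughly what stratifying by a hypothetical single P450 marker buys you - again, invented numbers, not a real dosing algorithm:

```python
# Continuing the toy simulation: stratify the same patients by a
# hypothetical single P450 marker and seed a starting dose per genotype.
import numpy as np

rng = np.random.default_rng(42)
fast = rng.normal(loc=8.0, scale=1.5, size=700)   # hypothetical mg/day
slow = rng.normal(loc=5.0, scale=1.5, size=300)

# Pretend each patient's marker genotype is known. The dose distributions
# still overlap a little - environment, kidney clearance, and drug
# interactions all blur the separation, as noted above.
cohort = [("fast_allele", d) for d in fast] + [("slow_allele", d) for d in slow]

def starting_dose(genotype, patients):
    """Median observed dose among prior patients with this genotype - a
    crude but illustrative way to pick a first dose before titrating."""
    return float(np.median([d for g, d in patients if g == genotype]))

for allele in ("fast_allele", "slow_allele"):
    print(f"{allele}: start at ~{starting_dose(allele, cohort):.1f} mg/day")
```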

So what's my point? It's easy. Personal medicine isn't about whole genomics, but rather about finding out what conditions underlie the complex behaviours of the body - and then applying that knowledge as best we can to treat people. (Whole genome studies will be important for learning how these things work, though - without the ability to do whole genome sequencing, we wouldn't have a chance of really making personal medicine effective.) I'll be the first to admit we don't know enough to do this for all diseases, but we certainly know enough to begin applying it to a few. I've argued that within 5 years, we'll start to really see the effects. It won't be a radical change to all medical care at once, but a slow progression into the clinics.

To narrow my prediction down further: at some point in the next 5 years, it will become routine (~10-20% of patients?) for doctors to order genomic tests (not full genome sequencing!) to apply this type of knowledge when treating their patients with new drugs. (Not every illness will require genomic information, so clearly we'll never reach a 100% requirement for it - having a splinter removed in the E.R. won't require the doc to check your genome...) I give it another 10 years before full genome sequencing begins hitting clinics... and even that will be a gradual change.

Now I've really wandered far outside of my field. I'll let the physicians handle it from here, and try to restrict my comments to the more scientific aspects of it.
