Thanks for visiting my blog - I have now moved to a new location at Nature Networks: http://blogs.nature.com/fejes. Please come visit my blog there.

Wednesday, February 25, 2009

Microsoft Sues TomTom over patents

I saw a link to Microsoft suing a Linux-based GPS maker, TomTom, which made me wonder what Microsoft is up to. Some people were saying that this is Microsoft's way of attacking Linux, but I thought not - I figured Microsoft probably had something more sly up its sleeve.

Actually, I was disappointed.

I went into the legal document (the complaint) to find out what patents Microsoft is suing over... and was astounded by how bad the patents are. Given the recent Bilski ruling, I think this is really Microsoft looking for a soft target against which it can test the waters and see how valid its patents are in the post-Bilski court environment... Of course, I think these are probably some of Microsoft's softest patents. I have a hard time seeing how any of them will stand up in court. (That is, pass the obviousness test and, simultaneously, the transformation test proposed in Bilski.)

If Microsoft wins this case, it'll be back to claiming Linux violates 200+ patents. If it loses the case, I'm willing to bet we won't hear that particular line of FUD again. I can't imagine any of the 200+ patents it says Linux violates are any better than the crap it's enforcing here.

Anyhow, for your perusal, if you'd like to see what Microsoft engineers have been patenting in the last decade, here are the 8 that Microsoft is trying to enforce. Happy reading:

6,175,789

Summary: Attaching any form of a computer to a car.

7,045,745

Summary: Giving driving instructions from the perspective of the driver.

6,704,032

Summary: Having an interface that lets you scroll and pan around, changing the focus of the scroll.

7,117,286

Summary: A computer that interacts or docks with a car stereo.

6,202,008

Summary: A computer in your car... with Internet access!

5,579,517

Summary: File names that aren't all the same length - in one operating system.

5,578,352

Summary: File names that aren't all the same length - in one operating system... again.

6,256,642

Summary: A file system for flash-erasable, programmable, read-only memory (FEProm).

Overwhelmed by the brilliance at Microsoft yet? (-;


Friday, February 20, 2009

Gairdner Symposium

Somehow, this year's Gairdner symposium completely managed to escape my notice until today, when a co-worker forwarded it along to me. For those of you who don't know the Gairdner awards, I believe they're roughly the Canadian equivalent of the Swedish Nobel Prize, although only for medicine and medical sciences. Since I wasn't aware of them until just a few years ago, I don't think they have quite the same level of recognition, but the more I look into it, the more I discover that the award carries just about as much weight: 78 of the 298 award winners have gone on to win Nobel Prizes.

At any rate, this year is the 50th anniversary of the foundation (although there have only been 48 years of prizes, apparently), so they're putting on one heck of a show. The Vancouver symposium will be a three-part event at the Chan auditorium at UBC, in which 4 Nobel Laureates will take part in discussions ranging from personal medicine to the future of the field of health care.

Anyhow, if you're in Vancouver on March 11th, I highly recommend you get yourself a set of tickets. They're available for free from Ticketmaster. (Yes, they will charge you for free tickets that you still have to print yourself. Ticketmaster blows chickens.) Here are the links:

Session 1: Gairdner Symposium - The Future of Medicine (morning session)
Session 2: Gairdner Symposium - The Future of Medicine (afternoon session)
Session 3: 2009 Michael Smith Forum - Personal Medicine (evening session)

I'm already excited about it!


Wednesday, February 18, 2009

Three lines of Java code you probably don't want to write

Ah, debugging. Ironically, it's one of the programming "skills" I'm good at. In fact, I'm usually able to debug code faster than I can write it - which leads to some interesting workflows for me. Anyhow, today I managed to really mess myself up, and it took several hours of rewriting code and debugging to figure out why. In the end, it all came down to three lines, which I hadn't looked at carefully enough - in any of the 8-10 times I went over that subroutine.

The point was to transfer all of the reads in the local buffer back into the buffer_ahead, preserving the order - they need to be at the front of the queue. The key word here was "all".
In any case, I thought I'd share it as an example of what you shouldn't do. (Does anyone else remember the Berenstain bears books? "Son, this is what you should not do, now let this be a lesson to you.")

Enjoy:
for (int r = 0; r < buffer_ahead.size(); r++) {
    buffer_ahead.add(r, local_buffer.remove(0));
}
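
For what it's worth, here's roughly what the loop was supposed to do. (This is my guess at the fix, assuming both buffers are java.util.List objects holding reads; it's a sketch of the intent, not the code that actually went into FindPeaks.)

// Intended behaviour: move every read from local_buffer to the front of
// buffer_ahead, preserving the order. The original loop bounds itself on
// buffer_ahead.size(), which grows by one on every insertion, so it either
// does nothing (if buffer_ahead starts out empty) or keeps going until
// local_buffer runs dry and remove(0) throws.
buffer_ahead.addAll(0, local_buffer); // insert all reads at the head, in order
local_buffer.clear();                 // and empty the local buffer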


Tuesday, February 17, 2009

FindPeaks 3.3

I have to admit, I'm feeling a little shy about writing on my blog, since the readership jumped in the wake of AGBT. It's one thing to write when you've got 30 people reading your blog... and yet another thing when there are 300 people reading it. I suppose if I can't keep up the high standards I've been trying to maintain, people will stop reading it, and then I won't have anything to feel shy about... Either way, I'll keep doing the blog because I'm enjoying the opportunity to write in a less-than-formal format. I do have a few other projects on the go as well, which include a few more essays on personal health and next-gen sequencing... I think I'll aim for one "well-thought-through essay" a week, possibly on Fridays. We'll see if I can manage to squeeze that in as a regular feature from now on.

In addition to blogging, the other thing I'm enjoying these days is the programming I'm doing in Java for FindPeaks 3.3 (which is the unstable version of FindPeaks 4.0). It's taking a lot longer to get going than I thought it would, but the effort is starting to pay off. At this point, a full ChIP-seq experiment (4 lanes of Illumina data + 3 lanes of control data) can be fully processed in about 4-5 minutes. That's a huge difference from the 40 minutes it would have taken with previous versions - and those handled the sample data only.

Of course, the ChIP-seq field hasn't stood still, so a lot of this is "catch-up" to the other applications in the field, but I think I've finally gotten it right with the stats. With some luck, this will be much more than just a catch-up release, though. It will probably be a few more days before I produce a 4.0 alpha, but it shouldn't be long, now. Just a couple more bugs to squash. (-;

At any rate, in addition to the above subjects, there are certainly some interesting things going on in the lab, so I'll need to put more time into those projects as well. As a colleague of mine said to me recently, you know you're doing good work when you feel like you're always in panic mode. I guess this is infinitely better than being underworked! In case anyone is looking for me, I'm the guy with his nose pressed to the monitor, fingers flying on the keyboard and the hunched shoulders. (That might not narrow it down that much, I suppose, but it's a start...)


Monday, February 16, 2009

EMBL Advanced Course in Analysis of Short Read Sequencing Data

This email showed up in my mailbox today, and I figured I could pass it along. I don't know anything about it other than what was shown below, but I thought people who read my blog for ChIP-Seq information might find it... well, informative.

I'm not sure where they got my name from. Hopefully it wasn't someone who reads my blog and thought I needed a 1.5 day long course in ChIP-Seq! (-;

At any rate, even if I were interested, putting the workshop in Heidelberg is definitely enough to keep me from going. The flight alone would be about as long as the workshop. Anyhow, here's the email:




Dear colleagues,

We would like to announce a new course that we will be having in June 2009 addressed to bioinformaticians with basic understanding of next generation sequencing data.

The course will cover all processing steps in the analysis of ChIP-Seq and RNA-Seq experiments: base calling, alignment, region identification, integrative bioinformatic and statistical analysis.

It will be a mix of lectures and practical computer exercises (ca. 1.5 days and 1 day, respectively).

Course name: EMBL Advanced Course in Analysis of Short Read Sequencing Data
Location: Heidelberg, Germany
Period: 08 - 10 June 2009
Website: link
Registration link: link
Registration deadline: 31 March 2009

Best wishes.

Looking forward for your participation to the course,
Adela Valceanu
Conference Officer
European Molecular Biology Laboratory
Meyerhofstr. 1
D-69117 Heidelberg


Saturday, February 14, 2009

Art for sale... totally un-related to science

When I started this blog, I had planned to use it to post my photographs and other art projects. Clearly that didn't work out how I expected it to. Anyhow, I thought I'd jump back into that mode for a brief (Weekend) post, and put up a picture of a painting I did a few years ago, and am now trying to sell.


I'm reasonably sure that no one will want it, but I think it'll be an interesting experiment to post it on craigslist and see if anyone is willing to pay for it. Hey, if Genome Canada doesn't get funding for a few more years, I might have another career to fall back on. (-;


Collection of all of the AGBT 2009 notes

I've had several requests for a link to all of my notes from AGBT 2009, so - after some tweaking and relabeling - I've managed to come up with a single link to all of the AGBT postings. (There are a few very sparse postings from AGBT 2008, but they don't contain much information that's really useful.)

Anyhow, if you'd like the link to all of my notes, you can find them here: http://www.fejes.ca/labels/AGBT%202009.html


Friday, February 13, 2009

Time for a new look

If you read my blog on my web page, rather than through feeds, you might notice that the page looks different today. I had some feedback from other bloggers at AGBT (in particular, Daniel MacArthur of Genetic Future), who made some suggestions that should have been simple to implement. Unfortunately, my template is so customized that it's nearly impossible to make the changes without breaking the layout.

So, I figured the best thing to do was to clean up and start from scratch. If you notice odd changes in the template, that would be why. (=

In the meantime, there's lots of good FindPeaks news - including controls, which are starting to work, new file formats, and a HUGE speed increase. (Whole genome runs in 6 minutes with a control... wow.)

Anyhow, I've got lots to do - and don't mind the blog template tinkering, 6 minutes at a time.


Wednesday, February 11, 2009

Epidemiology and next-generation(s) sequencing.

I had a very interesting conversation this morning with a co-worker, which turned into a full-fledged discussion of how next-generation sequencing will end up spilling out of the research labs and into the physician's office. My co-worker originally suggested it will take 20 years or so for that to happen, which seems off to me. While most inventions take a long time to get going, I think next-gen sequencing will cascade over into general use a lot more quickly than people appreciate. Let me explain why.

The first thing we have to acknowledge is that pharmaceutical companies have a HUGE interest in making next-gen sequencing work for them. In the past, pharma companies might spend millions of dollars getting a drug candidate to phase 2 trials, and it's in their best interest to push every drug as far as they can. Thus, any drug that can be "rescued" from failing at this stage will decrease the cost of getting drugs to market and increase revenues significantly for the company. With the price of genome sequencing falling to $5000/person, it wouldn't be unreasonable for a company to sequence 5,000-10,000 genomes from the phase 3 trial candidates, as insurance. If the drug seems to work well for a population associated with a particular set of traits, and not well for another group, it's a huge bonus for the company in getting the drug approved. If the drug causes adverse reactions in a small population of people associated with a second set of traits, even better - they'll be able to screen out adverse responders.

When it comes to getting FDA approval, any company that can clearly specify who the drug will work for - who it won't work for - and who shouldn't take it will be miles ahead of the game, and able to fast track its application through the approval process. That's another major savings for the company.

(If you're paying attention, you'll also notice at least one new business model here: retesting old drugs that failed trials to see if you can find responsive sub-populations. Someone is going to make a fortune on this.)

Where does this meet epidemiology? Give it 5-7 years, and you'll start to see drugs appear on the shelf with warnings like "This drug is contraindicated for patients with CYP450 variant XXXX." Once that starts to happen, physicians will have very little choice but to start sending their patients for routine genetic testing. We already have PCR screens in the labs for some diseases and tests, but it won't be long before a whole series of drugs appear with labels like this, and insurance companies will start insisting that patients have their genomes sequenced once for $5000, rather than pay for 40-50 individual test kits that each cost $100.

Really, though, what choice will physicians have? When drugs begin to show up that will help 99% of the patients for whom they should be prescribed, but are contraindicated for certain genomic variants, no physician will be willing to accept the risk of prescribing without the accompanying test. (Malpractice insurance is good... but it only gets you so far!) And as the tests get more complex, and our understanding of the underlying cause and effect of various SNPs increases, this is going to quickly go beyond the treatment of single conditions.

I can only see one conclusion: every physician will have to start working closely with a genetic counselor of some sort, who can advise on the relative risks and rewards of various drugs and treatment regimens. To do otherwise would be utterly reckless.

So, how long will it be until we see the effects of this transformation on our medical system? Well, give it 5 years to see the first genetic contraindications, but it won't take long after that for our medical systems (on both sides of the border in North America) to feel the full effects of the revolution. Just wait till we start sequencing the genomes of the flu bugs we've caught to figure out which antiviral to use.

Gone are the days when the physician will be able to eye up his or her patient and prescribe whatever drug comes to mind off the top of their head. Of course, the hospitals aren't yet aware of this tsunami of information and change that's coming at them. Somehow, we need to get the message to them that they'll have to start re-thinking the way they treat individual people, instead of populations of people.


Monday, February 9, 2009

AGBT 2009 – Thoughts and reflections

Note: I wrote this last night on the flight home, and wasn't able to post it till now. Since then, I've gotten some corrections and feedback, which I'll go through, making corrections to my blog posts as needed. In the meantime, here's what I wrote last night.

****

This was my second year at AGBT, and I have to admit that I enjoyed this year a little more than the last. Admittedly, it's probably because I knew more people and was more familiar with the topics being presented than I was last year. Of course, comprehensive exams and last year's AGBT meeting were very good motivators to come up to speed on those topics.

Still, there were many things this year that made the meeting stand out, for which the organizing committee deserves a round of applause.

One of the things that worked really well this year was the mix of people. There were a lot of industry people there, but they didn't take over or monopolize the meeting. The industry people did a good job of using their numbers to host open houses, parties and sessions without seeming "short-staffed". Indeed, there were enough of them that it was fairly easy to find them to ask questions and learn more about the “tools of the trade.”

On the other hand, the seminars were mainly hosted by academics – so it didn't feel like you were sitting through half-hour infomercials. In fact, the sessions that I attended were all pretty decent, with a high level of novelty and entertainment. The speakers were nearly all excellent, with only a few that felt "average" in presentation quality. (I managed to take notes all the way through, so clearly I didn't fall asleep during anyone's talk, even if I had the occasional momentary zone-out caused by the relentless 9am-9pm talk schedule.)

At the end of last year's conference, I returned to Vancouver – and all I could talk about was Pacific Biosciences' SMRT technology, which dominated the "major announcement" factor for me for the past year. At this year's conference, there were several major announcements that really caught my attention. I'm not sure if it's because I have a better grasp of the field, or if there really was more in the "big announcement" category this year, but either way, it's worth doing a quick review of some of the major highlights.

Having flown in late on the first day, I missed the Illumina workshop, where they announced the extension of their read length to 250 bp, which brings them up to the same range as the 454 technology platform. Of course, technology doesn't stand still, so I'm sure 454 will have a few tricks up its sleeve as well. At any rate, when I started talking with people on Thursday morning, it was definitely the hot topic of debate.

The second topic that was getting a lot of discussion was the presentation by Complete Genomics, which I've blogged about – and I'm sure several of the other bloggers will be doing in the next few days. I'm still not sure if their business model is viable, or if the technology is ideal... or even if they'll find a willing audience, but it sure is an interesting concept. The era of the $5000 genome is clearly here, and as long as you only want to study human beings, they might be a good partner for your research. (Several groups announced they'll do pilot studies, and I'll be in touch with at least one of them to find out how it goes.)

And then, of course, there was the talk by Marco Marra. I'm still in awe about what they've accomplished – having been involved in the project officially (in a small way) and through many many games of ping-pong with some of the grad students involved in the project more heavily, it was amazing to watch it all unfold, and now equally amazing to find out that they had achieved success in treating a cancer of indeterminate origin. I'm eagerly awaiting the publication of this research.

In addition to the breaking news, there were other highlights for me at the conference. The first of many was talking to the other bloggers who were in attendance. I've added all of their blogs to the links on my page, and I highly suggest giving their blogs a look. I was impressed with their focus and professionalism, and learned a lot from them. (Tracking statistics, web layout, ethics, and content were among a few of the topics upon which I received excellent advice.) I would really suggest that this be made an unofficial session in the future. (you can find the links to their blogs as the top three in my "blogs I follow" category.)

The Thursday night parties were also a lot of fun – and a great chance to meet people. I had long talks with people from all over the industry, whom I might not otherwise have had a chance to ask questions. (Not that I talked science all evening, although I did apologize several times to the kind Pacific Biosciences guy I cornered for an hour and grilled with questions about the company and the technology.) And, of course, the ABI party where Olena got the picture in which Richard Gibbs has his arm around me is definitely another highlight. (Maybe next year I'll introduce myself before I get the hug, so he knows who I am...)

One last highlight was the panel session sponsored by Pacific Biosciences, in which Charlie Rose (I hope I got his name right) moderated a discussion on a range of topics. I've asked a guest writer to contribute a piece based on that session, so I won't talk too much about it. (I also don't have any notes, so I probably shouldn't talk about it too much anyhow.) It was very well done, with several controversial topics being raised and lots of good stones turned over. One point is worth mentioning, however: one of the panel guests was Eric Lander, who has recently come to fame in the public's eye for co-chairing a science advisory committee requested by the new U.S. President, Obama. This was really the first time I'd seen him in a public setting, and I have to admit I was impressed. He was able to clearly articulate his points, draw people into the discussion and dominate the discussion while he had the floor, but without stifling anyone else's point of view. It's a rare scientist who can accomplish all of that - I am now truly a fan.

To sum up, I'm very happy I had the opportunity to attend this conference, and I'm looking forward to seeing what the next few years bring. I'm going back to Vancouver with an added passion to get my work finished and published, to get my code into shape, and to keep blogging about a field going through so many changes.

And finally, thanks to all of you who read my blog and said hi. I'm amazed there are so many of you, and thrilled that you take the time to stop by my humble little corner of the web.


Saturday, February 7, 2009

Stephan Schuster, Penn State University - “Genomics of Extinct and Endangered Species”

Last year, he introduced nanosequencing of complete extinct species. What are the implications of extinct genomes for endangered species?

Mammoth: went extinct 3 times... 45,000 ya, 10,000 ya, and 3,500 ya. Woolly rhino: 10,000 years ago. Moa: 500 years ago (they were eaten). Thylacine: 73 years ago. And Tasmanian devils, which are expected to last only another 10 years.

Makes you wonder about dinosaurs.. maybe dinosaurs just tasted like chicken.

Looking at population structure and biological diversity from a genomic perspective. (Review of genotyping biodiversity.) The mitochondrial genome is generally higher copy number, and thus was traditionally the one used, but now with better sequencing, we can target nuclear DNA.

The mammoth mitochondrial genome has been done: ~16,500 bp, including ribosomal, coding and noncoding regions. In 2008, you can get 1000x coverage on the mitochondrial genome. You need the extra coverage to correct for damaged DNA.

This has now allowed 18 mammoth mitochondrial genome sequences: 20-30 SNPs between members of the same group, and 200-300 between groups. WAY more sequencing than is available for African elephants!

Have now switched to using hair instead of bone, and can use the hair shaft (not just the follicle).

Ancient DNA = highly fragmented. 300,000 sequences, 45% was nuclear DNA.

Now: Sequenced bases: 4.17Gb. Genome size is 4.7Gb. 77 Runs, got 32.6 million bases.

Can visit mammoth.psu.edu for more info.

Sequenced mammoth orthologs of human genes. Compared to Watson/Venter... rate of predicted genes per chromosome ("No inferences here"). Complete representation of the genome available. SAP = Single Amino acid Polymorphism.

(Discussion of divergence for mammoth.) Coalescence time for human and Neanderthal: 600,000 years. The same thing happens for mammoth, but it's not really well accepted because the biological evidence doesn't show it.

Did the same thing for the Tasmanian Tiger. Two complete genomes – only 5 differences between them.

Hair for one sample was taken from what fell off when preserved in a jar of ethanol!

Moa: did it from egg shell!

Woolly rhino: did the woolly rhino from hair – and did other rhinos too. (The woolly is the only extinct one.) Rhinos radiated only a million years ago, so they couldn't resolve the phylogenetic tree. Tried hair, horn, hoof, and bone... bone was by far the worst.

Now, to jump to the living: the Tasmanian devil. Highly endangered. In 1996 an infectious cancer was discovered (not figured out till 2004). Devils have been protected since 1941. Isolation with fences, islands, the mainland, an insurance population. Culling and vaccination are also possible.

Genome markers will be very useful. The problem is probably that there is nearly no diversity in the population. Sequenced museum sample devils, and showed mitochondrial DNA had more diversity in the non-living population.

The project for a full genome is now underway – two animals. (More information on plans for what to do with this data and how to save them.) SNP info for genotyping to direct the captive breeding program ("Project Arc"). Trying to breed resistant animals.


Len Pennacchio, Lawrence Berkeley National Laboratory - “ChIP-Seq Accurately Predicts Tissue Specific Enhancers in Vivo”

The noncoding challenge: 50% of GWAS hits are falling into non-coding regions. CAD and diabetes hits fall in gene deserts, so how do they work? Regulatory regions. Build a catalogue of distal enhancers.

Talk about Sonic Hedgehog, involved in limb formation. Regulation of expression is a million bases away from the gene. There are very few examples. We don't know if we'll find lots, or if this is just the tip of the iceberg. How do we find more?

First part: work going on in the lab for the past 3 years. Using conservation to identify regions that are likely involved. Using ChIP-Seq to do this.

Extreme conservation: either things conserved over huge spans (human to fish) or within a smaller group (human, mouse, chimp).

Clone the regions into vectors, put them in mouse eggs, and then stain for Beta-galactosidase. Tested 1000 constructs, 250,000 eggs, 6000 blue mice. About 50% of them work as reproducible enhancers. Do everything at whole mouse level. Each one has a specific pattern. [Hey, I've seen this project before a year or two ago... nifty! I love to see updates a few years later.]

Bin by anatomical pattern. Forebrain enhancers is one of the big “bins”. Working on forebrain atlas.

All data is at enhancer.lbl.gov. Also in the Genome Browser. There is also an Enhancer Browser. Comparative genomics works great at finding enhancers in vivo. No shortage of candidates to test.

While this works, it's not a perfect system. Half of the things don't express, and the system is slow and expensive. The comparative genomics also tells you nothing about where it expresses, so this is ok for wide scans, but not great if you want something random.

Enter ChIP-Seq. (Brief model of how expression works.) Collaboration with Bing Ren. (Brief explanation of ChIP-Seq.) Using Illumina to sequence. Looking at bits of mouse embryo. Did ChIP-Seq, got peaks. What's the accuracy?

Took 90 of the predictions and used the same assay. When p300 was used, now up to 9/10 of conserved sequences work as enhancers. Also tissue specific.

Summarize: using comparative gives you 5-16% active things in one tissue. Using ChIP-Seq, you get 75-80%.

How good or bad is comparative genomics at ranking targets? 5% of exons are constrained; almost all the rest are moderately constrained. [I don't follow this slide. Showing better conservation in forebrain and other tissues.]

P300 peaks are enriched near genes that are expressed in the same tissues.

Conclusion: p300 is a better way of predicting enhancers.
P300 occupancy circumvents the DNA-conservation-only approach.

What about negatives? For the ones that don't work, it's even better - the mouse orthologs bind, while the human sequence no longer binds in mice.

Conclusion II: Identified 500 more enhancers with the first method, and now a few runs done 9 months ago have yielded 5000 new elements using ChIP-Seq.

Many new things can be done with this system, including integrating it with GWAS.


Bruce Budowle, Federal Bureau of Investigation - “Detection by SOLiD Short-Read Sequencing of Bacillus Anthracis and Yersinia Pestis SNPs for Strain Id

We live in a world with a “heightened sense of concern.” The ultimate goal is to reduce risk, whether it's helping people with flooding, or otherwise. Mainly, they work on stopping crime and identifying threat.

Why do we do this? We've only had one anthrax incident since 2001... but bioterrorism has been used for 2000 years. (Several examples given.)

Microbial forensics. We don't just want knee-jerk responses. Essentially the same as any other forensic discipline - again, to reduce risk. This is a very difficult problem. Over 1000 agents are known to infect humans: 217 viruses, 538 bacterial species, 307 fungi, 66 parasitic protozoa. Not all are effective, but there are different diagnostic challenges. Laid out on the tree of life... it's pretty much the whole thing.

Biosynthetic technology. New risks are accruing due to advances in DNA synthesis. The risks are vastly outweighed by the benefits of synthesis... bioengineering also plays a role.

Forensic genetic questions:
What is the source?
Is it endemic?
What is the significance?
How confident can you be in the results?
Are there alternative explanations?

So, a bit of history on the “Amerithrax” case. A VERY complex case - it changed the way the government works on this type of case. Different preparations were in different envelopes.

Goals and Objectives:
could they reverse engineer the process? To figure out how it was done? No, too complex, didn't happen.

First sequencing – did a 15-locus, 4-colour genotyping system. It was not a validated process – but it helped identify the strain, which helped narrow down its origin. Some came from Texas, but it was more likely to have come from a lab than from the woods.

Identifying informative SNPs. You don't need to know the evolution – just the signature. That can then be used for diagnostics. Whole genome sequencing for genotyping was a great use, but back in 2001, most of this WGS wasn't possible. They got a great deal from TIGR – only $125,000 to sequence the genome from the Florida isolate: it took 2 weeks, and they found out interesting details about copy number of the plasmids. The major cost was then to validate and understand what was happening.

Florida was compared to Ames and to one from the UK, which gave only 11 SNPs. Many evolutionary challenges came up. The strain they used was “cured” of its plasmid, so it had evolved to have other SNPs... a very poor reference genome.

The key to identification: one of the microbiologists discovered that some cultures had different morphology. That was then used as another signature for identifying the source.

Limited Strategy: it didn't give the whole answer – only allows them to rule out some colonies. It would be more useful to sequence full genomes... so entered into deal with ABI SOLiD for genome sequencing.

Some features were very appealing. One of them is the emulsion PCR, which helped to improve the quality and reliability of the assay. The beads were useful too.

Multiplex value was very useful. Could test 8 samples simultaneously using barcoding, including the reference Ames strain. Coverage was 16x-80x, depending on DNA concentration. Multiple starting points gives more confidence, and to find better SNPs.

Compare to the reference: found 12 SNPs in the resequenced reference. When you look at the SNP data, you see that there's a lot of confidence if a variant shows up in both directions... whereas a false positive tends to turn up on only one strand. That becomes a major way to remove false positive results, and it was really only possible by using higher coverage.

Not going to talk about pestis... (almost out of time). Similar points: 130-180X coverage. Found a multidrug transporter in the strain, which has been a lab strain for 50 years. Plasmids were also at higher coverage. There were fewer SNPs in the North American strains, etc.
An interesting point: if you go to the reference in GenBank, there are known errors in the sequence. Several have been corrected, and the higher coverage was helpful in figuring out the real sequence past the errors.

$1000 /strain using multiplex, using equipment that is not yet available. This type of data really changes the game, and can now screen samples VERY quickly (a week).

Conclusions:
Every project is a large scale sequencing project.
Depth is good.
Multiplexing is good.
Keep moving to higher accuracy sequencing.


Andy Fire, Stanford University - “Understanding and Clinical Monitoring of Immune-Related Illnesses Using Massively-Parallel IgH and TcR Sequencing”

The story starts off with a lab that works on small RNAs, which they believe form a small layer of immunity. [Did I get that right?] They work in response to foreign DNA.

Joke Slide: by 2018, we'll have an iSequencer.

Question: can you sequence the immunome? [New word for me.] Showing a picture of lymphoma cells, which to me looks like a test to see if you're colour blind. There are patches of slightly different shades...

Brief intro to immunology. “I got an F in immunology as a grad student.” [There's hope for me, then!]
Overview of VDJ recombination, controlled by B-cell differentiation. This is really critical – responsible for our health. One model: if something recognizes both a virus and self, then you can end up with an autoimmune response.

There is a continuum based on this. It's not necessarily an either/or relationship.

There is a PCR/454 test for VDJ distribution. In some cases, you get a single dominating size class, and that is usually a sign of disease, such as lymphoma. You can also use 454 for this, since you need longer reads, and read the V, D and J units in the amplified fragment. Similar to email, you can get “spam”, and you can use similar technologies to drop the “spam” out of the run.

To show the results of the tests for B-cell recombination species, you put V on one axis, J on the other. D is dropped to make it more viewable. In lymphoma, a single species dominates the chart.

An interesting experiment – dilute with regular blood to see detection limit – it's about 1:100. For some lymphomas, you can't use these primers, and they don't show up. There are other primers for the other diseases.

So what happens in normal distributions? Did the same thing with VDJ (D included, so there are way more spots). Neat image... Do this experiment with two aliquots of blood from the same person and look for concordance. You find lots of spots fail to correspond well at the different time points, but many do.

On another project, Bone Marrow transplant. Recipient has a funny pattern, mostly caused by “spam” because the recipient really has very little immune system left. The patient eventually gets the donor VDJ types, which is a completely donor response. You can also do something like this for autoimmune disorders.

Malfunctioning Lymphoid cells cause many human diseases and medical side-effects. (several examples given.)


Keynote Speaker: Rick Wilson, Washington University School of Medicine - “Sequencing the Cancer Genome”

Interested in:
1. Somatic mutations in protein coding genes, including indels.
2. Also like to find: non-coding mutations, miRNAs and lincRNAs.
3. Like to learn about germ line variations.
4. Differential transcription and splicing.
5. CNV.
6. Structural variation.
7. Epigenetic changes.
8. Big problem: integrate all of this data... and make sense of it.

Paradigm for years: exon focus for large collections of samples. Example: EGFR mutations in lung cancer. A large number of patients (in some sample) had EGFR mutations. Further studies carry on this legacy in lung cancer using new technology. However, when you look at pathways, you find that the pathways are more important than individual genes.

Description of “The Cancer Genome Atlas”

Initial lists of genes mutated in cancer. Mutations were found, many of which were new. (TCGA Research Network, Nature, 2008)

Treatment-related hypermutation. Another example of TCGA's work: glioblastoma. Although they didn't want treated samples, in the end they took a look and saw that treated samples have interesting changes in methylation sites when MMR genes and MGMT were mutated. If you know the status of the patient's genome, you can better select the drug (e.g., not use an alkylation-based drug).

Pathways analysis can be done... looking for interesting accumulations of mutations. Network view of the Genome... (just looks like a mess, but a complex problem we need to work on.)

What are we missing? What are we missing by focusing on exons? There should be mutations in cancer cells that are outside exons.

Revisit the first slide: now we do “Everything” from the sample of patients, not just the list given earlier.

(Discussion of AML cancer example.) (Ley et al., Nature 2008)
Found 8 heterozygous somatic mutations and 2 somatic insertion mutations. Are they cause or effect?
The verdict is not yet in. Ultimately, functional experiments will be required.

There are things we're not doing with the technology: Digital gene expression counts. Can PCR up gene of interest from tumour, sequence and do a count: how many cells have the genotype of interest?
Did the same thing for several genes, and generally got a ratio around 50%.

Started looking at GBM1: 3,520,407 tumour variants passing the SNP filter. Broke down into coding non-synonymous/splice sites, coding silent, conserved regions, regulatory regions including miRNA targets, non-repetitive regions, and everything else (~15,000). Many of the first class were validated.

CNV analysis also done. Add coverage to sequence variants, and the information becomes more interesting. Can then use read pairs to find breakpoints/deletions/insertions.

What's next for cancer genomics? More AML (Doing more structural variations, non-coding information, more genomes), more WGS for other tumours, and more lung cancer, neuroblastoma... etc.

“If the goal is to understand the pathogenesis of cancer, there will never be a substitute for understanding the sequence of the entire cancer genome” – Renato Dulbecco, 1986

Need ~25X coverage of WGS tumour and normal – also transcriptome and other data. Fortunately, costs are dropping rapidly.


Peter Park, Harvard Medical School - “Statistical Issues in ChIP-Seq and its Application to Dosage Compensation in Drosophila”

(brief overview of ChIP-Seq, epigenomics again)

ChIP-Seq not always cost-competitive yet. (can't do it at the same cost as chip-chip)

Issues in analysis: generate tags, align, remove anomalous tags, assemble, subtract background, determine binding position, check sequencing depth.

Map tags in a strand-specific manner (like the directional flag in FindPeaks). Score tags accounting for that profile; this can be incorporated into the peak caller.

Do something called cross-correlation analysis (look at the peaks on both strands); use this to rescue more tags. Peaks get better if you add good data, and worse if you add bad data. Use it to learn something about histone modification marks. (Tolstorukov et al., Genome Research.)
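
For readers who haven't seen strand cross-correlation before, here's a rough sketch of the idea in Java. (This is my own toy illustration, not Peter Park's code; the variable names and the use of a simple unnormalized dot product are my assumptions.)

// Toy sketch: slide the minus-strand tag-start profile against the plus-strand
// profile and report the shift with the highest (unnormalized) cross-correlation.
// In ChIP-Seq, that shift roughly corresponds to the average fragment length.
public class StrandCrossCorrelation {
    // plusCounts[i] / minusCounts[i] = number of tags starting at position i on each strand
    public static int bestShift(int[] plusCounts, int[] minusCounts, int maxShift) {
        double bestScore = Double.NEGATIVE_INFINITY;
        int best = 0;
        for (int shift = 0; shift <= maxShift; shift++) {
            double score = 0;
            for (int i = 0; i < plusCounts.length && i + shift < minusCounts.length; i++) {
                score += plusCounts[i] * minusCounts[i + shift];
            }
            if (score > bestScore) {
                bestScore = score;
                best = shift;
            }
        }
        return best; // the estimated offset between plus- and minus-strand peaks
    }

    public static void main(String[] args) {
        int[] plus  = {0, 5, 9, 4, 0, 0, 0, 0, 0, 0};
        int[] minus = {0, 0, 0, 0, 0, 0, 4, 9, 5, 0};
        System.out.println("Best shift: " + bestShift(plus, minus, 8)); // prints 5 for this toy data
    }
}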

How deep to sequence? 10-12M reads is current - that's one lane on Illumina, but is it enough? What quality metric is important? Clearly this depends on the marks you're seeing (narrow vs. broad, noise, etc.), which brings you to saturation analysis. Shows no saturation for STAT1, CTCF, NRSF. [Not a surprise - we knew that a year ago... We're already using this analysis method. However, as you add new reads, you add new sites, so you have to threshold to make sure you don't keep adding new peaks that are insignificant. Oh, he just said that. Ok, then.]

Talking about using “fold enrichment” to show saturation. This allows you to estimate how many tags you need to get a certain tag enrichment ratio.

See paper they published last year.

Next topic: Dosage compensation.

(Background on what dosage compensation is.)

In Drosophila, the X chromosome is up-regulated in XY, unlike in humans, where the 2nd copy of the X is quashed in the XX genotype. Several models are available. Some evidence that there's something specific and sequence related. Can't find anything easily with ChIP-based methods – just too much information. Comparing the two, with ChIP-Seq you get sharp enrichment, whereas on ChIP-chip you don't see it. Seems to be a saturation issue (dynamic range) on ChIP-chip, and the sharp enrichments are important. You get specific motifs.

Deletion and mutation analysis. The motif is necessary and sufficient.

Some issues: the motif on X is enriched, but only by 2-fold. Why is X so much upregulated, then? Histone H3 signals seem depleted over the entry sites on the X chromosome. There may also be other things going on which aren't known.

Refs: Alekseyenko et al., Cell, 2008 and Sural et al., Nat Struct Mol Biol, 2008


Alex Meissner, Harvard University - “From reference genome to reference epigenome(s)”

Background on ChIP-Seq.

High-throughput Bisulfite Sequencing. At 72 bp, you can still map these regions back without much loss of mapping ability. You get 10% loss at 36bp, 4% at 47bp and less at 72bp.

This was done with a m-CpG cutting enzyme, so you know all fragments come with at least a single Methylation. Some update on technology recently, including drops in cost and longer reads, and lower amounts of starting material.

About half of the CpG is found outside of CpG islands.

“Epigenomic space”: look at all marks you can find, and then external differences. Again, many are in gene deserts, but appear to be important in disease association. Also remarkable is the degree of conservation of epigenetic patterns as well as genes.

Questions:
Where are the functional elements?
When are they active?
When are they available?

Also interested in Epigenetic Reprogramming (Stem cell to somatic cell).

Recap: Takahashi and Yamanaka: induced pluripotent stem cells with 4 transcription factors: Oct4, Sox2, c-Myc and Klf4. The general efficiency is VERY low (0.0001% - 5%). Why don't all cells reprogram?

To address this: ChIP-Seq before and after induction with the 4 transcription factors. Strong correlation between chromatin state and iPS. You clearly see that genes in open chromatin are responsive. Chromatin state in MEFs correlates with reactivation.

Is loss of DNA methylation at pluripotency genes the critical step to fully reprogram? Test the hypothesis that by demethylation, you could cause more cells to become pluripotent. Loss of DNA methylation was indeed shown to allow the transition to pluripotency. [Lots of figures, which I can't copy down.]

Finally: loss of differentiation potential in culture. Embryonic stem cell to neural progenitor, but eventually can not differentiate to neurons, just astrocytes. (Figure from Jaenisch and Young, Cell 2008)

Human ES cell differentiation: often fine in morphology, correct markers... etc etc, but specific markers are not consistent. Lose methylation and histone marks, which cause significant changes in pluripotency.

Can't yet make predictions, but we're on the way towards a future where you can assess cell type quality using this information.


Marco Marra's Talk

That was clearly the coolest thing we've seen so far. From genome to cancer treatment, which seems to have worked to reduce the tumour size... Wow. I was aware of the work, having been involved in a small way, but I wasn't aware of the outcome until just last night.

Mind Blowing. Personalized medicine is here.

**BREAKING NEWS** Marco Marra, BC Cancer Agency - “Sequencing Cancer Genomes and Transcriptomes: From New Pathology to Cancer Treatment.”

Why sequence the cancer-ome? Most important: treatment-response differences - to match treatments to patients. Going to focus on that last one.

Two anecdotes: neuroblastoma (Olena Morozova and David Kaplan), and papillary adenocarcinoma of the tongue, primary unknown. There's a 70-year difference in age. They have nothing in common except for the question: “can sequence analysis present new treatment options?”

Background on neuroblastoma. The most common cancer in infants, but not very common: 75 cases per year in Canada. Patients often have relapse and remission cycles after chemotherapy. There was little improvement until recently, when Kaplan was able to show the ability to enrich for tumour initiating cells (TICs). This gave a great model for more work.

Decided to have a look at Kaplan's cells, and did transcriptome libraries (RNA-Seq) using PET, and sequenced a flow cell's worth: giving 5 Gb of raw sequence from one sample, 4 from the other. Aligned to the reference genome using a custom database. (Probably Ryan Morin's?) Junctions, etc.

Variants found that are B-cell related. Olena found markers, worked out the lineage, and showed it was closer to a B-cell malignancy than to a brain cancer sample. These cells also form neuroblastomas when reintroduced into mice. So, is neuroblastoma like B-cell in expression? Yes, they seem to have a lot of traits in common. It appears as though the neuroblastoma is expressing early markers.

Thus, if you can target B-Cell markers, you'd have a clue.

David Kaplan verified and made sure that this was not contamination (several markers). He showed that yes, the neuroblastoma cells are expressing B-cell markers, and that these are not B-cells. Thus, it seems that a drug that targets B-cell markers could be used (Rituximab and Milatuzumab). So we now have an insight that we wouldn't have had before. (Very small sample, but lots of promise.)

Anecdote 2: an 80-year-old male with adenoma of the tongue. Possibly salivary gland origin? He has had surgery and radiation, and a CAT scan revealed lung nodules (no local recurrence). There is no known standard chemotherapy... so several guesses were made, and an EGFR inhibitor was tried... Nothing changed. Thus, the BC Cancer Agency was approached: what can genome studies do? They didn't know, but were willing to try. Genome from a formalin-fixed sample (which is normally not done), and a handful of WTSS libraries from fine-needle aspirates (nanograms, which required amplification). 134 Gb of aligned sequence across all libraries – about 110 Gb to the genome. (22X genome, 4X transcriptome.)

Data analysis, compared across many other in-house tumours, and looked for evidence of mutation. CNV was done from Genome. Integration with drug bank, to then provide appropriate candidates for treatment.

Comment on CNV: histograms shown. Showed that as many bases are found in single-allele regions as diploid, and then again just as many in triploid, and then some places at 4 and 5 copies. Was selective pressure involved in picking some places for gain, whereas much of the genome was involved in loss?

Investigated a few interesting high-CNV regions, one of which contains RET. Some amplifications are highly specific, containing only a single gene, even though they are surrounded by regions of copy number loss.

Looking at Expression level, you see a few interesting things. There is a lack of absolute correlation between changes in CNV and the expression of the gene.

When looking for intersection, ended up with some interesting features:
30 amplified genes in cancer pathways (kegg)
76 deleted genes in cancer pathways
~400 upregulated, ~400 downregulated genes
303 candidate non-synonymous SNPs
233 candidate novel coding SNPs
... more.

Went back to drugbank.ca (Yvonne and Jianghong?) When you merge that with target genes, you can find drugs specific to those targets. One of the key items on the list was RET.

Back to the patient: the patient was using an EGFR-targeting drug. Why weren't they responsive? It turns out that PTEN and RB1 are lost in this patient... (see literature... didn't catch the paper).

Pathway diagram made by Yvonne Li. It shows where mutations occur in pathways; gains and losses of expression are shown as well. Notice lots of expression from RET, and no expression from PTEN. PTEN negatively regulates the RET pathway. Also increases in Mek and Ras. This suggests that in this tumour, activation of RET could be driving things.

Thus, they came up with a short list of drugs. The favourite was Sunitinib. It's fairly non-specific, used for renal cell carcinoma, currently in clinical trials and being tested for other cancers. There are implications that RET is involved in some of those diseases (MEN2A, MEN2B and thyroid cancers). The RET sequence itself was not likely to be mutated in this patient.

CAT scans: response to Sunitinib and Erlotinib. When on the EGFR-targeting drug, the nodule grew. On Sunitinib, the cancer retreated!

Lots of Unanswered questions: Is RET really driving this tumour? Is drug really acting on RET? Is PTEN/RB1 loss responsible for erlotinib resistance in this tumour?

We don't think we know everything, but can we use genome analysis to suggest treatment: YES!

First question: how did this work with the ethics boards? How did they let you pass that test? Answer: this is not a path to treatment; it is a path towards making a suggestion. In some cancers there is something called hair analysis, which can be considered or ignored. Same thing here: we didn't administer anything... we just proposed a treatment.


Keynote Speaker: Rick Myers, Hudson-Alpha Institute - “Global Analysis of Transcriptional Control in Human Cells”

Talking about gene regulation – it has been well studied for a long time, but only recently on a genomic scale. The field still wants comprehensive, accurate, unbiased, quantitative measurements (DNA methylation, DNA binding proteins, mRNA), and they want them cheap, fast and easy to get.

Next gen has revolutionized the field: ChIP-Seq, mRNA-Seq and Methyl-Seq are just three of them. Also need to integrate them with genome-wide genetic analysis.

There are many versions of each of those technologies.

RNA-Seq: 20M reads give 40 reads per 1kb-long mRNA present at as low as 1-2 mRNA copies per cell. Thus, 2-4 lanes are needed for deep transcriptome measurement. PET + long reads is excellent for phasing and junctions.

ChIP-Seq: transcription factors and histones... but it should also be used for any DNA binding protein. (Explanation of how ChIP-Seq works.) Using a no-antibody control generally gives you no background [?]. ChIP without a control gets you into trouble.

Methylation: Methyl-seq. Cutting at unmethylated sites, then ligate to adaptors and fragment. Size select and run. (Many examples of how it works.)

Studying human embryonic stem cells. (The cell lines are old and very different... hopefully there will be new ones available soon.) Using it for gene expression versus methylation status: when you cluster by gene expression, they cluster by pathways. The DNA methylation patterns did not correlate well - clustering more along the lines of individual cell lines than pathways. Thus, they believe methylation isn't controlling the pathways... but that could be an artifact of the cell lines.

26,956 methylation sites. Many of them (7,572) are in non CpG regions.

Another study: cortisol, a steroid hormone made by the adrenal gland. It controls 2/3rds of all biology, helps restore homeostasis and affects a LOT of pathways: blood pressure, blood sugar, immune suppression, etc. It fluctuates throughout the day, and levels are also tied to mood. Pharma is very interested in this.

The glucocorticoid receptor binds the hormone in the cytoplasm, then translocates to the nucleus. It activates and represses transcription of thousands of genes.

ChIP-Seq in A549: GR (- hormone): 579 peaks. GR (+ hormone): 3,608 peaks. Low levels of endogenous cortisol in the cell probably account for the background. (Of the peaks, ~60% are repressive, ~40% are inducing.) When investigating the motifs, the top 500 hits really change the binding site motif! It's no longer as fixed as originally thought – and this led to the discovery of new genes controlled by the GRE. Also showed that there's co-occupancy with AP1.

[Method for expression quantization: Use windows over exons.]

Finally: a few more little stories. Mono-allelic transcription factor binding turns out to occur frequently, where only one allele is bound in the ChIP, and the other is not bound at all. (In the case shown, it turns out the SNP creates a methylation site, which changes binding.) The same type of event also happens to methylation sites.

Still has time: just raise the point of Copy Number Variation. Interpretation is very important, and can be skewed by CNVs. Cell lines are particularly bad for this. If you don't model this, it will be a significant problem. Just on the verge of incorporating this.

They are going to 40-80M reads for RNA-Seq. Their version of RNA-Seq is good, and doesn't give background. The deeper you go, the more you learn. Not so much with ChIP-Seq, where you saturate sooner.


Friday, February 6, 2009

Kevin McKernan, Applied Biosystems - "The whole Methylome: Sequencing a Bisulfite Converted Genome using SOLiD"

Background on methylation. It's not rare, but it is clustered. This is begging for enrichment. You can use restriction enzymes; uses mate pairs to set this up. People can also use MeDIP, and there's a new third method: a methyl binding protein from Invitrogen. (Seems to be more sensitive.)

MeDIP doesn't grab CpG, though... it just leaves single-stranded DNA, which is a pain for making libraries. Using only 90 ng. There is a slight bias on the adaptors, though - not yet optimized. If they're bisulfite converting, it has issues (protecting adaptors, requires densely methylated DNA, etc.). They get poor alignment because methylated areas tend to be repetitive. Stay tuned, though, for more developments.

MethylMiner workflow: shear genomic DNA, put adaptors on it, and then bind the methylated fragments with biotin? You can titrate methyl fractions off the solid support, so you can then sequence and know how many methyls you'll have. Thus, mapping becomes much easier, and sensitivity is better.

When you start getting up to 10-methyls in a 50mer, bisulfite treating + mapping is a problem. It's also worth mentioning that methylation is not binary when you have a large cell population.

The MethylMiner system was tested on A. thaliana; SOLiD fragments were generated... good results obtained, the salt titration seems to have worked well, and mapping the reads shows that you get (approximately) the right number of methyl Cs - and mapping is easy, since you don't need to bisulfite convert.

Showed examples where some of genes are missed by MeDIP, but found by MethylMiner.

(Interesting note, even though they only have 3 bases after conversion (generally), it's still 4 colour.)

Do you still get the same number of reads on both strands? Yes...

Apparently methylation is easier to align in colourspace. [Not sure I caught why.] Doing 50mers with 2 mismatches. (Seems to keep the percentage align-able in colourspace, but bisulfite-treated base-space libraries can only be aligned about 2/3rds as well.)

When bisulfite converted, 5mC will appear as a SNP in the alignment. To approach that, you can do fractionation with the MethylMiner kit, which gives you a more rational approach to alignments.

You can also make an LMP library, and then treat with 5mCTP when extending, so you get two tags, then separate the tags (they keep a barcode), and then pass it over the MethylMiner kit... etc.... barcoded mapping to detect methyl Cs better.

Also have a method in which you do something the same way, but ligate hairpins on the ends... then put on adaptors, and then sequence the ends, to get mirror imaged mate pairs. (Stay tuned for this too.)

There are many tools to do Methylation mapping: colourspace, lab kits and techniques.


Stephen Kingsmore, National Center for Genome Resources - “Digital Gene Expression (DGE) and Measurement of Alternative Splice Isoforms, eQTLs and cSN

[Starts with apologizing for chewing out a guy from Duke... I have no idea what the back story is on that.]

They developed their own pipeline, with a web interface called Alpheus, which is remotely accessible. They have an ag biotech focus, which is their niche. They would like to get into personal genome sequencing.

Application 1: Schizophrenia DGE.
Pipeline: ends with ANOVA analysis. Alignment to several references: transcripts and genome. 7% of reads span exon junctions. mRNA-Seq coverage. Read-count-based gene expression analysis is as good as or better than arrays or similar tech. Using principal component analysis. Using mRNA-Seq, you can clearly separate their controls and cases, which they couldn't do with arrays. It improves the diagnosis component of variance.
Showing “volcano plots”.
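
As an aside for readers new to digital gene expression: the read-count idea is simply to count aligned reads per gene and normalize by gene length and library size. Here's a minimal sketch in Java (my own illustration, not the Alpheus pipeline; the Gene class, the single-chromosome assumption and the RPKM-style normalization are all my simplifications).

import java.util.*;

// Toy digital gene expression: count aligned read starts inside each gene and
// report reads per kilobase of gene per million mapped reads (an RPKM-like value).
public class ReadCountExpression {
    static class Gene {
        final String name;
        final int start, end; // 0-based, half-open coordinates on a single chromosome
        Gene(String name, int start, int end) { this.name = name; this.start = start; this.end = end; }
    }

    static Map<String, Double> expression(List<Gene> genes, int[] readStarts) {
        Map<String, Double> result = new LinkedHashMap<>();
        double millionsOfReads = readStarts.length / 1e6;
        for (Gene g : genes) {
            long count = 0;
            for (int pos : readStarts) {
                if (pos >= g.start && pos < g.end) count++; // read falls inside the gene
            }
            double kilobases = (g.end - g.start) / 1000.0;
            result.put(g.name, count / (kilobases * millionsOfReads));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Gene> genes = Arrays.asList(new Gene("geneA", 0, 2000), new Gene("geneB", 5000, 6000));
        int[] readStarts = {10, 500, 1500, 5200, 5400, 5999};
        System.out.println(expression(genes, readStarts));
    }
}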

Many of genes found for schizophrenia converged on a single pathway, known to be involved in neurochemistry.

They have a visualization tool, and showed that you can see junctions and retained introns; then they wanted to do it in a more high-throughput way. Started a collaboration to focus on junctions, to quantify alternative transcript isoforms. Working on the first map of splicing of the transcriptome in human tissues. 94% of human genes have multiple exons. Every one had alternative splicing in at least one of the tissues examined.

92% have biochemically relevant splicing. (minimum 15%?)

8 types of alternative splicing... 63% of alternative splicing is tissue regulated. 30% of splicing occurs between individuals. (So tissue splicing trumps individuals)

[Brief discussion of 454 based experiment... similar results, I think.]

Thus:
1.cost effective,
2.timely
3.biologically relevant
4.identified stuff missed by genome sequencing

Finally, also compared genotypes from individuals looking at cSNPs. Cis-acting SNPs causing allelic imbalance. Used it to find eSNPs (171 found). Finally, you can also fine-map the eQTN within an eQTL.

Labels:

Jesse Gray, Harvard Medical School - “Neuronal Activity-Induced Changes in Gene Expression as Detected by ChIP-Seq and RNA-Seq”

Now “widespread overlapping sense/antisense transcription surrounding mRNA transcriptional start sites.”

Thousands of promoters exhibit divergent transcriptional initiation. Annotated TSS come from NCBI. There are 25,000 genes. There is an additional anti-sense TSS (TSSa) 200 bp upstream. [Nifty, I hadn't heard about that.]

Do RNA-Seq and ChIP-Seq. Using SOLiD. SOLiD or Ambion [not sure which] plans to sell the method as a kit for WTSS/WT-Seq.

Using RNA Pol II ChIP-Seq.

Anti-sense transcription peaks about ~400 bases upstream of the TSS. When looking at the genome browser, you see overlapping TSS-associated transcription. (You see negative-strand reads in the other direction, upstream from the TSS, and forward-strand reads at the TSS, with a small overlap in the middle.)

It is a small amount of RNA being produced.

Did a binomial statistical test, fit to 4 models:
1.sense only initiation
2.divergent initiation (overlap)
3.anti-sense only initiation
4.divergent (no overlap)

The vast majority are TSSs with divergent overlap; 380 are divergent (no overlap), 900 sense only, 140 anti-sense only. Many other sites were discarded because it was unclear what was happening. This is apparently a widespread phenomenon.
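
I didn't catch the details of the test, so here's just my own rough reconstruction of how you might sort a TSS into those four bins: compare the plus- and minus-strand read counts against a 50/50 null with a binomial test, and if both strands are real, check whether the sense and antisense reads physically overlap. The intervals below are invented.

```python
# A rough sketch (my reconstruction, not the speaker's code) of classifying a
# TSS from the reads around it: sense-only, antisense-only, or divergent, and
# if divergent, whether the sense and antisense reads overlap.

from math import comb

def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def classify_tss(sense_reads, antisense_reads, alpha=0.05):
    """sense_reads / antisense_reads: (start, end) intervals of reads on each
    strand near one annotated TSS."""
    n_s, n_a = len(sense_reads), len(antisense_reads)
    n = n_s + n_a
    if n == 0:
        return "no initiation"
    # If the weaker strand's count is improbably low under a 50/50 null,
    # call the site single-stranded.
    if binom_cdf(min(n_s, n_a), n) < alpha:
        return "sense only" if n_s > n_a else "antisense only"
    # Otherwise it's divergent: do the sense and antisense reads overlap?
    sense_min = min(start for start, _ in sense_reads)
    antisense_max = max(end for _, end in antisense_reads)
    return "divergent (overlap)" if antisense_max > sense_min else "divergent (no overlap)"

# Hypothetical intervals around one TSS at ~position 100:
sense = [(100, 135), (104, 139), (110, 145)]
antisense = [(55, 90), (60, 95)]
print(classify_tss(sense, antisense))
```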

Might this be important? Went back to ChIP-Seq to classify the peaks into these categories from RNA Pol II expt. (Same categories.) Is this a meaningful way to classify sites, and what does it tell us?

How many of those peaks have a solid PhastCons score? That should tell you something about conservation. No initiation has the lowest scores... the ones with the antisense models have the highest conservation at the location of antisense initiation.

Where do the peaks fall when they have anti-sense? Anti-sense peaks are bimodal; sense-only and bi-directional peaks sit just before the TSS, and are not bimodal.

Tentatively, yes, it seems like this anti-sense is functionally important.

Does TSSo change efficiency of initiation?

Break into two categories: non-overlap TSSs and overlap TSSs. It appears that overlap TSSs produce more than twice the RNA of non-overlap. This could be a bias... could be selecting for highly expressed genes. Plot the RNA Pol II occupancy at the start sites: there is a big difference at the overlapping TSSs. Non-overlap TSSs have higher occupancy right at the TSS, but lower occupancy up- or downstream than overlap TSSs. Thus the transition to elongation may be less efficient.

Does TSSo change efficiency of initiation? Tentatively, yes.

Comment from audience: this was discovered a year ago in a paper by Kaplan (Kaparov?). Apparently it was recently described that these are cleaved into 31nt capped reads. Thus, the fate of the small RNA should be of interest. 50% of genes had this phenomenon.

Question from audience: what aligner was used, and how were repetitive sections handled? Only uniquely mapping reads, using the SOLiD pipeline. (Audience member thinks that you can't do this analysis with that data set.) Apparently, someone else claims it doesn't matter.

My Comment: This is pretty cool. I wasn't aware of the anti-sense transcription in the reverse direction from the TSS. It will be interesting to see where this goes.

Labels:

Terrence Furey, Duke University - “A Genome-Wide Open Chromatin Map in Human Cell Types in the ENCODE Project”

2003: initial focus on 1% of the genome. Where are all the DNA elements?
2007: Scale up from 1% to 100%.

Where are all of the regulatory elements in the genome: a parts list of all functional elements.

We now know: 53% unique, 45% repetitive, 2% are genes. Somehow, the 98% controls the other 2%.

Focussed on regions of open chromatin. Open chromatin is not bound to nucleosomes.

Contains:
1.promoters
2.enhancers
3.silencers
4.insulators
5.locus control regions
6.meiotic recombination hotspots.

Use two assays: DNase hypersensitivity, used at single sites in the past and now used for high-throughput, genome-wide assays. The second method is FAIRE: formaldehyde-assisted identification of regulatory elements. It's a ChIP-Seq. [I don't know why they call it FAIRE... it's exactly a ChIP experiment – I must be missing something.]

Also explaining what ChIP-Seq/ChIP-chip is. They now do ChIP-Seq. Align sequences with MAQ. Filter on the number of aligned locations (keep up to 4 alignments). Use F-Seq. Then call peaks with a threshold. Uses a continuous-value signal.

The program is F-Seq, created by Alan Boyle. Outputs in Bed and Wig format. Also deals with alignability “ploidy”. (Boyle et al, Bioinformatics 2008). They use Mappability to calculate smoothing.
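
I haven't read the F-Seq paper yet, so the snippet below is only my cartoon of the general idea, not F-Seq itself: smooth the read start positions into a continuous per-base signal with a Gaussian kernel, then call peaks as runs of positions above a threshold. All numbers are made up.

```python
# Not F-Seq itself -- just a sketch of the idea behind it: turn aligned read
# start positions into a continuous signal with a Gaussian kernel density,
# then call "peaks" as runs of positions above a threshold.

import math

def kernel_density(read_starts, chrom_length, bandwidth=200):
    """Continuous per-base signal from read start positions (Gaussian kernel)."""
    signal = [0.0] * chrom_length
    for start in read_starts:
        lo = max(0, start - 4 * bandwidth)
        hi = min(chrom_length, start + 4 * bandwidth)
        for pos in range(lo, hi):
            z = (pos - start) / bandwidth
            signal[pos] += math.exp(-0.5 * z * z)
    return signal

def call_peaks(signal, threshold):
    """Merge consecutive above-threshold positions into (start, end) regions."""
    peaks, start = [], None
    for pos, value in enumerate(signal):
        if value >= threshold and start is None:
            start = pos
        elif value < threshold and start is not None:
            peaks.append((start, pos))
            start = None
    if start is not None:
        peaks.append((start, len(signal)))
    return peaks

reads = [1000, 1020, 1050, 1055, 1100, 5000]   # made-up read starts
sig = kernel_density(reads, chrom_length=6000)
print(call_peaks(sig, threshold=2.0))          # the clustered reads form one peak
```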

[This all sounds familiar, somehow... yet I've never heard of F-Seq. I'm going to have to look this up!]

Claim you need normalization to do proper calling. Normalization can also be applied if you know regions of duplications.

[as I think about it, continuous read signals must create MASSIVE wig files. I would think that would be an issue.]

Peak calling validation: ROC analysis. False positives along the bottom axis, true positives on the vertical axis. Shows ChIP-seq and ChIP-array have very high concordance.

DNase I HS – 72 million sequences, 149,000 regions, 58.5Mb – 2.0%
FAIRE – 70 million sequences, 147,000 regions, 53Mb – 1.8%

Compare them – and you see the peaks in one correspond with the peaks in the other. Not exact, but similar. Very good coverage by FAIRE of the DNase peaks. Not as good the other way, but close.

The goal is for the project to be done on a huge list of cells (92 types?? - 20 cell lines now, adding 50 to 60 more, including different locations in the body, disease states, cells exposed to different agents... etc. etc.) RNA is tissue specific, so that changes what you'll see.

Summary:
Using DNase and FAIRE assays to define an open chromatin map
exploring many cell types,
discovery of ubiquitous and cell-specific elements.

Note: Data is available as quickly as possible - next month or two, but may not be used for publication for the first 9 months.

Labels:

Kai Lao, Applied Biosystems - “Deep Sequencing-Based Whole Transcriptome Analysis of Single Early Embryos”

I think all sequencing was done with ABI SOLiD.

To get answers about early life stages, you need to do single cells – early life is in single cells, or close to it. When you separate a two-cell embryo, miRNAs are symmetrically distributed (measured by array). T1 and T2 have similar profiles. When you separate at the 4-cell stage – it's still the same....

Can you do the same thing with next gen sequencing to do whole transcriptome? (Yes, apparently, but the slide is too dark to see what the method is.) Quantified cDNA libraries on gel, then started looking at results.

If you do everything perfectly, concordance between the forward and reverse strands should be the same. However, if you do the concordance between two blastomeres, you see different results. [not sure what the difference is, but things aren't concordant between the two samples....]

First, showed that libraries have very high concordance – the same oocyte gives excellent concordance. However, between the dicer knock-out and wt, you get several genes that do not have the same expression in both. Many genes are co-up-regulated or co-down-regulated.
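
For my own benefit, here's roughly what that kind of concordance check looks like computationally (my sketch, not the speaker's analysis, and the read counts are invented): rank-correlate per-gene counts between two libraries, and flag the genes whose fold change stands out, the way Dppa5 did between wt and Dicer-KO.

```python
# A back-of-the-envelope version of checking concordance between two
# expression libraries: Spearman-correlate the per-gene read counts, and flag
# genes with an outlying fold change. Counts below are hypothetical.

import math

def spearman(x, y):
    """Spearman rank correlation for two equal-length lists (no tie handling)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def outliers(counts_a, counts_b, genes, min_log2_fc=2.0):
    """Genes whose (pseudocounted) log2 fold change exceeds the cutoff."""
    hits = []
    for gene, a, b in zip(genes, counts_a, counts_b):
        fc = math.log2((a + 1) / (b + 1))
        if abs(fc) >= min_log2_fc:
            hits.append((gene, round(fc, 2)))
    return hits

genes = ["Dppa5", "GeneB", "GeneC", "GeneD", "GeneE"]
wt = [12, 500, 40, 3000, 75]
dicer_ko = [310, 520, 35, 2800, 90]
print(spearman(wt, dicer_ko), outliers(wt, dicer_ko, genes))
```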

One gene was Dppa5. In wt, it had low expression, in Dicer-KO and ago2-KO, they were upregulated.

After Dicer genes were knocked out at day 5, only 2% of maternal miRNAs survived in a mature Dicer-KO oocyte (30 days). Dicer-KO embryos cannot form viable organisms (beyond the first few cell stages.)

Deeper sequencing is better. With 20M reads, you get array level data. You get excellent data beyond 100M reads.

No one ever proved that multiple isoforms are expressed at the same time in a cell – used this data to map junctions, and showed they do exist. 15% of genes expressed in a single cell as different isoforms.

Labels:

Matthew Bainbridge, Baylor College of Medicine - “Human Variant Discovery Using DNA Capture Sequencing”

overview: technology + pipeline, then genome pilot 3, snp calling, verification.

Use solid-phase capture – Nimblegen array + 454 sequencing.
map with BLAT and cross_match. SNP calling (ATLAS-SNP).

All manner of snp filtering.
1.Remove duplicates with same location
2.Then filter on p value.
3.More.. [missed it]

226 samples of 400.

Rebalanced arrays... Some exons pull down too much, and others grab less. You can change the concentrations, then, and use the rebalanced array.

Average coverage came down, but overall coverage went up... Much less skew with the rebalanced array. 3% of the target region just can't get sequence. 90% of the sequence ends up covered 10x or better.

Started looking at SNPs – frequency across individuals.

Interested in ataxia, a hereditary neurological disorder. Did 2 runs in the first pilot test on 2 patients. Now do 4. Found 18,000 variants. Found one in the gene named for that disease – it turned out to be novel, and non-synonymous. Followed up on it, and it looked good: sequenced it in the rest of the family, but it didn't actually exist outside that patient.

So that brings us to validation: concordance to HapMap, etc. etc., but those only tell you about false negatives, not false positives. You have to go learn about false positives with other methods, but the traditional ones can't do high throughput. So, to verify, they suggest using other platforms: 454 + SOLiD.

When they're done, you get good concordance, and the false positives drop out. The interesting thing is “do you need high quality in both techniques?” The answer seems to be no. You just need high quality in one... but do you need even that? Apparently not: you can do this with two low-quality runs from different platforms. Call everything a SNP (errors, whatever... call it all a SNP). When you do that and then build your concordance, you can do a very good job of SNP calling! (60% are found in dbSNP.)
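
The "two sloppy call sets beat one careful one" logic is simple enough to sketch (this is just my illustration of the idea, not their pipeline, and the sites are invented): call permissively on both platforms and keep only the positions where both agree on the same allele, since the errors are mostly platform-specific.

```python
# A sketch of intersecting permissive call sets from two platforms: errors
# tend to be platform-specific, so they mostly fall out of the intersection.
# Call sets are dicts of (chrom, pos) -> alt allele; all sites are made up.

def intersect_calls(calls_454, calls_solid):
    """Keep positions called on both platforms with the same alternate allele."""
    shared = {}
    for site, alt in calls_454.items():
        if calls_solid.get(site) == alt:
            shared[site] = alt
    return shared

calls_454 = {
    ("chr12", 1_004_512): "A",   # real SNP
    ("chr12", 1_007_890): "T",   # 454 homopolymer artifact
    ("chr3", 22_118_004): "G",   # real SNP
}
calls_solid = {
    ("chr12", 1_004_512): "A",
    ("chr3", 22_118_004): "G",
    ("chr7", 5_660_131): "C",    # SOLiD colourspace artifact
}

print(intersect_calls(calls_454, calls_solid))
# -> only the two concordant sites survive
```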

My Comments: Nifty.

Labels:

Complete Genomics, part 2

Ok, I couldn't resist - I visited the Complete genomics "open house" today... twice. As a big fan of start up companies, and an avid follower of the 2nd gen (and possibly now 3rd gen) sequencing, it's not every day that I get the chance to talk to the people who are working on the bleeding edge of the field.

After yesterday's talk, where I missed the first half of the technology that Complete Genomics is working on, I had a LOT of questions, and a significant amount of doubt about how things would play out with their business model. In fact, I would say I didn't understand either particularly well.

The technology itself is interesting, mainly because of the completely different approach to generating long reads... which also explains the business model, in some respects. Instead of developing a better way to "skin the cat", as they say, they went with a strategy where the idea is to tag and assemble short reads. That is to say, their read size for an individual read is in the range of a 36-mer, but it's really irrelevant, because they can figure out which sequences are contiguous. (At least, as I understood the technology.) Ok, so high reliability short reads with an ability to align using various clues is a neat concept.

If you're wondering why that explains their business model, it's because I think that the technique is a much more difficult pipeline to implement than any of the other sequencing suppliers demand. Of course, I'm sure that's not the only reason - the reason why they'll be competitive is the low cost of the technology, which only happens when they do all the sequencing for you. If they had to box reagents and ship them out, I can't imagine that it would be significantly cheaper than any of the other setups, and it would probably be much more difficult to work with.

That said, I imagine that in their hands, the technology can do some pretty amazing things. I'm very impressed with the concept of phasing whole chromosomes (they're not there yet, but eventually they will be, I'm sure), and the nifty way they're using a hybridization based technique to do their sequencing. Unlike the SOLiD, it's based on longer fragments, which answers some of the (really geeky, but probably uninformed) thermal questions that I had always wondered about with the SOLiD platform. (Have you ever calculated the binding energy of a 2-mer? It's less than room temperature). Of course the cell manages to incorporate single bases (as does Pacific Biosciences), but that uses a different mechanism.

Just to wrap up the technology, someone left an anonymous comment the other day that they need a good ligase, and I checked into that. Actually, they really don't need one. They don't use an extension based method, which is really the advantage (and Achilles heel) of the method, which means they get highly reliable reads (and VERY short fragments, which they have to then process back to their 36- to 40-ish-mers).

Alright, so just to touch on the last point of their business model, I was extremely skeptical when I heard they were going to only sequence human genomes, which is a byproduct of their scale/cost model approach. To me, this meant that any of the large sequencing centres would probably not become customers - they'll be forced to do their own sequencing anyhow for other species, so why would they treat humans any differently? What about cell lines, are they human enough?...

Which left, in my mind, hospitals. Hospitals, I could see buying into this - whoever supplies the best and least expensive medical diagnostics kit will obviously win this game and get their services, but that wouldn't be enough to make this a google-sized or even Microsoft-sized company. But, it would probably be enough to make them a respected company like MDS metro or other medical service providers. Will their investors be happy with that... I have no idea.

On the other hand, I forgot pharma. If drug companies start moving this way, it could be a very large segment of their business. (Again, if it's inexpensive enough.) Think of all the medical trials, disease discovery and drug discovery programs... and then I can start seeing this taking off.

Will researchers ever buy in? That, I don't know. I certainly don't see a genome science centre relinquishing control over their in house technology, much like asking Microsoft to outsource its IT division. Plausible... but I wouldn't count on it.

So, in the end, I'm looking forward to seeing where this is going... All I can say is that I don't see this concept disappearing any time soon, and that, as it stands, there's room for more competition in the sequencing field. The next round of consolidation isn't due for another two years or so.

So... Good luck! May the "best" sequencer win.

Labels: ,

Keynote: Richard Gibbs, Baylor College of Medicine - “Genome Sequencing to Health and Biological Insight”

Repetitive things coming up in genomics, and comments about the knowledge pipeline. Picture of snake that ate two lightbulbs.... [random, no explanation]

“cyclic” meeting history: used to be GSAC, then stopped when it became too industrial. Then switched to AMS, and then transitioned to AGBT. We're coming back to the same position, but it's much more healthy this time.

We should be more honest about our conflicts.

The pressing promise in front of us – making genomics accessible. Get yourself genotyped... (he did); the information presented is just “completely useless!”

We know it can be really fruitful to find variants. So how do we go do that operationally? Targeted sequencing versus whole genome. What platform (compared to coke vs. Pepsi.)

They use much less Solexa, historically. They just had good experiences with the other two platforms.

16% of Watson SNPs are novel, 15% of Venter SNPs are novel. ~10,500 novel variants.(?) [not clear on slide]

Mutations in the Human Gene Mutation Database. We already know the databases just aren't ready yet... not for functional use.

Switch to talking about SOLiD platform:

SNP detection and validation. Validation is difficult – but having two platforms do the same thing, it's MUCH easier to knock out false positives. Same thing on indels. You get much higher confidence data. Two platforms is better than one.

Another cyclic event: Sanger, then next-gen then base-error modelling. We used to say “just do both strands”, and now it's coming back to “just sequence it twice”. (calls it “just do it twice” sequencing.)

Knowledge chain value: sequencing was the problem, then it became the data management, and soon, it'll shift back to sequence again.

Capture: it's finally “getting there”. Exon capture and Nimblegen work very well in their hands. Coverage is looking very good.

Candidate mutation for ataxia. In one week they got to a list. Of course, they're still working on the list itself.

How to make genotyping useful?
1.develop physicians and genetics connection
2.retain faith in genotypic effects
3.need to develop knowledge of *every* base.
4.Example, function, orthology...and...

Other issues that have to do with the history of each base. HapMap3/ENCODE. Sanger-based methods, about 1Mb per patient. Bottom line: found a lot of singletons. They found a few sites that were mutated independently, not heritable.

The other is MiCorTex. 15,200 people (2 loci). Looking for atherosclerosis. Bottom line: they find a lot of low-frequency variants. Having sequenced so many people, you can make predictions (“the coalescent”). The sample size is now a significant fraction of the population, so the statistics change. All done with Sanger!

Change error modeling – went back to original sequencing and got more information on nature of calls. Decoupling of Ne and Mu in a large sample data.

In the works: represent SNP error rate estimates with genotype likelihoods.
1000 genomes pilot 3 project. If high penetrance variants are out there, wouldn't it be nice to know what they're doing and how. 250 samples accumulated so far.

Some early data: propensity for non-sense mutations.
Methods have evolved considerably
whole exome
variants will be converted to assays
data merged with other functional variants.

Whole genome and capture are both doing well.
Focus is now back on rare variants
platform comparison also good
Db's still need work
site specific info is growing
major challenge of variants understanding can be achieved by ongoing functional studies and improve context.

Labels:

John Todd, University of Cambridge - “The Identification of Susceptibly Genes in Common Diseases Using Ultra-Deep Sequencing”

Type 1 diabetes: a common multifactorial disease. One of many immune-mediated disease that in total affect ~5% of the population. Distinct epidemiological & clinical features. Genome wide association success... but.. What's next?

There is a pandemic increase in type 1 diabetes. Since 1950's, there's an abrupt 3% increase each year. Age at diagnosis has been decreasing. Now 10-15% are diagnosed under 5 years old.

There is a strong north-south and seasonality bias to it. Something about this disease tracks with seasons.. vitamin D? Viruses?

Pathology: massive infiltration of beta cell islets.

In 1986: 1000 genotypes. In 1996: multiplexing allowed 1,000,000 genotypes, now allows full genome association.

Crohn's and diabetes are “the big winners” from the Wellcome Trust – the most heritable and easily diagnosed of the seven diseases originally selected.

Why do people get type 1 diabetes? Large effect at HLA class II = immune recognition of beta cells. 100's of other genes in common and rare alleles of SNPs and SVs in immune homeostasis.

Disease = a threshold of susceptibility alleles and a permissive environment.

What will the next 20 years look like: national registers of diseases (linkage to records and samples where available), mobile phone text health, identification of causal genes and their pathways (mechanisms), natural history of disease susceptibility, and newborn susceptibility by their T1D genetic profile. What dietary, infectious, gut flora-host interactions modify these, and which can we affect?

Can we slow the disease spread down?

There are 42 chromosome regions in type 1 diabetes, with 96 genes. Which are causal? What are the pathways? What are the rare variants? Genome-wide gene-isoform expression. Genotype to protein information.

Ultra-deep sequencing study: 480 patients and 480 controls, PCR of exons and did 454. 95% probability of detecting an allele at 0.3% frequency.
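
That "95% at 0.3%" figure is really a statement about read depth, and the arithmetic is easy to back-of-the-envelope with a binomial model. The depth and the minimum number of supporting reads below are my own assumptions for illustration, not numbers from the talk.

```python
# Rough binomial arithmetic (my own numbers, not the speaker's) for a claim
# like "95% probability of seeing an allele at 0.3%": at a given read depth,
# what's the chance of observing at least k reads carrying an allele present
# in 0.3% of the pooled chromosomes?

from math import comb

def prob_detect(depth, allele_freq, min_supporting_reads):
    """P(at least min_supporting_reads successes) under Binomial(depth, freq)."""
    p_miss = sum(
        comb(depth, i) * allele_freq**i * (1 - allele_freq)**(depth - i)
        for i in range(min_supporting_reads)
    )
    return 1 - p_miss

# Assumed: 3000x pooled depth per amplicon, require >= 5 supporting reads.
print(round(prob_detect(depth=3000, allele_freq=0.003, min_supporting_reads=5), 3))
# -> roughly 0.95 under these assumptions
```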

Found one hit: IFIH1. Followed up in 8000+ patients – found this gene was not associated with disease, but with protection from disease! Knock it out, and you become susceptible!

It's possible that this is associated with protection of viral infections. The 1000 genome project may also help give us better information for this type of study.

The major prevention trial to prevent type 1 diabetes is ingestion of insulin to restore immune tolerance to insulin.

Do we know enough about type 1 diabetes?

Maybe one of the pathways in type1 diabetes is a defect in oral tolerance?

Type 1 diabetes co-segregates with things like coeliac disease (wheat intolerance), one of the rare autoimmune diseases for which we know the environmental factor (gluten). Failure of the gut immune system to be tolerant of gluten.

The majority of loci are shared between type 1 diabetes and coeliac disease. (sister diseases)

Compared genes in Type 1 and Type 2 diabetes – they are not overlapping. No molecular basis for the grouping of these two diseases.

Common genotypes are ok for predicting type 1. ROC curve presented. Can identify population that is likely to develop T1D, but.... how do you treat?

Going from genome to treatment is not obvious, though.

Healthy volunteers – recallable by genotype, age, etc (Near Cambridge).

Most susceptibility variants affect gene regulation & splicing. Genome wide expression analysis of mRNA and isoforms in pure cell population. Need to get down to lower volume of input material and lower costs.

Using high-throughput sequencing with allele-specific expression (ASE). Looking for eQTLs for disease and biomarkers. Doing work on other susceptibility genes. (Using volunteers recallable by genotype).

Looking for new recruits: Chair of biomedical stats, head of informatics, chair of genomics clinical....

Labels:

Kathy Hudson, The Johns Hopkins University - “Public Policy Challenges in Genomics”

Challenges: getting enough evidence is difficult: Analytic validity, clinical validity.. etc etc

Personal value is there theoretically – but will it work?

Two different approaches: who offers them, and then who makes the tests?

Types: either performed with or without consent. Results returned.. or not. There are now a large number of people offering tests for a wide number of conditions.

Are the companies medical miracles, or just marketing scam? Are the predictions really medically relevant. FTC is supposed to stop companies that lie... but for genetic testing they just put out a warning.

Role of states in regulating: states dictate who can authorize a test. However, in some states anyone can order it, not just medical personnel.

How they're made:
Two types of tests: lab tests (homebrews) and test “kits”. The level of regulatory oversight is disparate. The difference is not apparent to people ordering them, but they have different types of oversight.

[flow charts on who regulates what] Lab tests are not under the FDA (they're done through CMS)... and it makes no sense for them to be there. You can't get access to basic science information through CMS, whereas in the FDA, that's a key part of the mandate(?)

Example about proficiency testing – which was poorly implemented in law, and is still not well done. The list is now out of date – and none of the listed diseases being tested have a genetic basis. CMS can't give information on what the numbers in the reported values mean (labs get 0's for multi-year tests, but CMS can't explain it.)

FDA regulation of test kits is much more rigorous.

Genentech started arguing that the two path system should not be there. Should be regulated based on risk, not manufacturer. Obama-Burr introduced genetic medicine bill in 2007, and something more recently by Kennedy. (Also biobanking?)

Steps to effective testing:
1.level of oversight based on risk
2.tests should give answer nearly all the time
3.data linking genotype to phenotype should be publicly accessible
4.high risk tests should be subject to independent review before entering market
5.pharmacogenetics should be on label
6.[missed this point]

Privacy: should it be public? Who perceives it as what?

More people are concerned about financial privacy than medical privacy. 1/3 think that their medical record should be “super secret”; and when asked what part of it should be most private, most people said their social security number! Genetic tests and family history are way down the list of what needs to be protected.

People trust doctors and researchers, but not employers. The Genetic Information Nondiscrimination Act is a consequence of that trust level. (not a direct result?)

The new Privacy Problem? DNA snooping. Who is testing your DNA? (Something about a half-eaten waffle left by Obama that ended up on ebay... claiming it had his DNA on it.)

Many actions: testing, implementing laws, modernizing laws, transparency, better testing.

My comments: It was a really engaging talk, with great insight into US law in genetics. I'd love to see a more global view, but still, quite interesting.

Labels:

Howard McLeod, University of North Carolina, Chapel Hill - “Using the Genome to Optimize Drug Therapy”

“A surgeon who uses the wrong side of the scalpel cuts her own fingers and not the patient.
If the same applied to drugs, they would have been investigated very carefully a long time ago.”
Rudolph Buchheim (1849)

The clinical problem: multiple active regimens for the treatment of most diseases. Variation in response to therapy, unprecedented toxicity and cost issues! With choice comes decision. How do you know which drug to provide?

“We only pick the right medicine for a disease 50% of the time”. Eventually we find the right drug, but it may take 4-5 tries. Especially in cancer.

“toxicity happens to the patient, not the prescriber”

[Discussion of importance of genetics. - very self-deprecating humour... Keeps the talk very amusing. Much Nature vs. Nurture again]

“Many Ways To Skin a Genome”. Tidbit: up to half of the DNA being measured can come from the lab personnel handling the sample. [Wha?] DNA testing is being done in massive amounts: newborns, transplants..

“you can get DNA from anything but OJ's glove.”

We also see applications of genetics in drug metabolism. E.g., warfarin. Too much: bleeding; too little: clotting. One of only two drugs that has its own clinic. [yikes.] Apparently methadone is the other. Why does it have its own clinic? “That's because this drug sucks.” Still the best thing out there, though. Discussion of CYP-based mechanisms and the vitamin K reductase target. Showed a family tree – too much crossing of left and right hands...

Some discussion of results – showing that there are difference in genetics that strongly influences metabolism of warfarin.

Genetics has now become part of litigation – warfarin is one of the most litigated drugs.

We need tools that translate genetics into lay-speak. It doesn't help to tell people they have a CYP2C*8... they need a way to understand and interpret that.

If we used genetics, we'd be able to go from 11% to 57% of patients on the “proper dose” of warfarin the first time.

Pharmacogenomics have really started to take off and there are now at least 10 examples.

What is becoming important is pathways... but there are MANY holes. We know what we know, We don't know what we don't know.

We can do much of the phenotyping in cell lines – we can ask “is this an inheritable trait?” This should focus our research efforts in some areas.

Better systematic approach to sampling patients.

What do we do after biomarker validation? Really, we do nothing – we assume someone else will pick it up (through osmosis... that's faith-based medicine!). We need to talk to the right people and then hand it off – we need to do biomarker-driven studies with the goal of knowing who to hand it off to.

Take home message:

Pharmacogenetic analysis of patient DNA is ready for prime time.

My Comment: Very amusing speaker! The message is very good, and it was engaging. The science was well presented and easily understandable, and the result is clear: there's lots more room for improvement, but we're making a decent start and there is promise for good pharmacogenomics.

Labels:

Keynote Speaker: Kari Stefansson, deCODE Genetics - “Common/Complex traits with emphasis on disease”

Sounds like Sean Connery!

Basic assumption is that information is the basic unit of life – and the genome is the carrier. Creating database where we can start decoding that information – and have had some success, including to find the genes for the love of complex crossword puzzles. (-:

Traits range in complexity from simple mendelian all the way to really complex genotypes and phenotypes, which are often involved in diseases. One thing to keep in mind is that they also have geographical traits.

First example: melanoma. Very different genes (for light hair and skin) occur in the population, varying by location - people in Iceland don't have problems carrying this gene, but those in Spain would!

Second example: Genetic risk of atrial fibrillation is genetic risk of cardiogenic stroke. About 30% of stroke is indeterminate origin, but a significant proportion is associated with several genetic traits. [insert much statistics here!]

Third example: thyroid cancer – (published today?). Incidence has been increasing of late. If it's caught early, it has a very good prognosis. It has a very large familial component. Did a genome association in Iceland, identified 1100 individuals, and had genotypes for 580 of them. Pulled out 2 significant loci (independent), and they associated with two forms of thyroid cancer. [more statistics, too fast to make notes...] Individuals with both genes have a 5.7X increase in risk. (Multiplicative model.) The two loci also have differences in clinical presentation. The candidate for the first is the FOXE (TTF2) transcription factor. The second is NKX2-1 (TTF1). Apparently these gene(s?) regulate thyroid stimulating hormone... so there may be an interesting mechanism.
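
For the record, "multiplicative model" just means the per-locus relative risks multiply for carriers of both variants. I didn't catch the per-locus estimates, so the numbers below are placeholders chosen only to show the order of magnitude of the quoted figure.

```python
# Placeholder arithmetic (my numbers, not deCODE's) for a multiplicative risk
# model: per-locus relative risks multiply for carriers of both variants.

def combined_risk(per_locus_risks):
    total = 1.0
    for risk in per_locus_risks:
        total *= risk
    return total

# e.g. two hypothetical loci with relative risks of 2.2 and 2.6:
print(combined_risk([2.2, 2.6]))   # ~5.7x, the order of the figure quoted
```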

Where are we now when it comes to discovery of the sequence variants that code for genetic components of complex disease? There seems to be a significant amount still undiscovered. Most of the variants that have been discovered have risk factors over 5%.... [not sure if that's right] The bottom line is that the detection limits are such that we can't find the really low-frequency variants with lower risk factors.

There may be a large contribution from rare variants with large effects
There may be a large contribution from rare variants with small or modest effects.
[one more.. not fast enough]

Started deCODE based on family based methods, and now have returned to it. Concept of Surrogate Parenthood – surrogates work as well as natural parents for phasing of proband. To get down to all traits with 2% of variants, they would only need to sequence ~2000 people. [Daniel says there are only 400,000 people in Iceland].

Have also noted genes where the risk factor is different between maternally and paternally passed genes.

Prostate cancer: have shown that there are 8 genes that have a cumulative risk factor. Important for treatment and preventative care.

End by pointing out that in all of the common disease, it is a disease where there are both environmental and genetic components. How do they interact? How do they fit into our debate (nature vs. Nurture).

Published on nicotine dependence and lung cancer last year. In Iceland, it's purely environmental – only smokers in Iceland develop it (14%). Discovered a sequence variant that makes you more likely to smoke more because you're more likely to crave the nicotine... where is the line between nature and nurture, then? To solve this problem, you need to understand the brain – to understand the behaviours that make us susceptible to environmental diseases.

Labels:

Site Feed changes

I'm never one to shy away from changes, even when I should probably know better. One of the things that came up yesterday while talking to the other (read: much more professional) science bloggers was that I should be monitoring my rss feeds, and they all unanimously suggested feedburner.

Of course, I've never set it up before, so I'm still in the process of trying to figure out how it works - but hang in, I've still got another half hour before the first session starts. That should be more than enough time, right?

Labels:

Thursday, February 5, 2009

Complete Genomics

[Missed start of talk]

Inexpensive. Non-sequential bases? No ligase required.

Long Fragment Reads. Start with high molecular weight DNA – 70-100kb – do a sample prep that barcodes the fragments, sequence, and then informatically map reads back to their fragment. Assembly then gives you 100kb lengths. Chromosome phasing begins to become possible. [spiffy!]

Thus, you can do this over a genome, as well, it allows the maternal and paternal dna to be worked out.

Not planning to sell instruments: only going to be a sequencing centre doing it as a service. 20,000 genomes by the end of 2010. The big challenge is actually assembly. 60K cores in the cluster, 30PB of disk space.

Will partner with Genome centres. Yesterday signed an agreement to try a pilot with Broad. Will build genome centres around the world.

Trying to make sequencing ubiquitous. Send them a sample, then click on a link to get your results.

Saves you capital on sequencing and then on compute infrastructure.

Will only do Humans! [I cracked up at this point.]

My comments: Ok, I missed the beginning, but the end was interesting. I totally don't understand the business model. By doing only human, they'll only find hospital customers... and which hospital will pay for them to build a data centre? I'll elaborate more on that in another post.

Labels:

Erin Pleasance, Wellcome Trust Sanger Institute - “Whole Cancer Genome Sequencing and Identification of Somatic Mutations”

Goals of cancer genome sequencing: WGSS read sequencing. Detection of substitutions, indels, rearrangements, CNVs. Detection in coding and non-coding regions. Catalogue of somatic mutations, functional impact and mutational patterns. Drivers vs. passengers.

Talk tonight about one cancer and one matched normal genome.

NCI-H209 small cell lung cancer cell line.

Cancer cell line and non-cancer cell line derived from the same individual.
Prior sequencing by PCR and capillary.
Somatic mutations: 6/Mb, or 18,000 in the genome.
Other data also available: Affy SNP6 and expression arrays.

Show karyotype. Kinda funky, but mostly sane. (-:

Used the AB SOLiD machine... the strategy is pretty obvious: sequence the cancer and the matched normal. All PET, aligned with MAQ, Corona for substitutions.

How much sequence do you need to do? Turns out, you need equal amounts of both – and it's about 30X coverage. There is a GC effect on coverage.

Compare with dbSNP. About 80% are there.

Look at the tumour only, with simple filtered reads: about 50% are not in dbSNP. Many are probably germline variants. Mutation vs. SNP rate: need to call SNPs and mutations using the control at the same time to get the best results. As well, if you have greater-than-diploid chromosomes, you need to worry about that too.

Also: CNV changes and ploidy, normal cell contamination, base qualities, and it's important to do indel detection first.
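
The core of that tumour/normal logic is simple to sketch (this toy is my own, not the Sanger pipeline, and it ignores all the complications they just listed: ploidy, contamination, base qualities): take candidate substitutions in the tumour and subtract anything that is also called in the matched normal or already sits in dbSNP.

```python
# A toy version of the somatic filtering logic described above (not the
# Sanger pipeline): start from candidate substitutions in the tumour, then
# drop anything also called in the matched normal or present in dbSNP.
# Real callers also model ploidy, normal contamination and base qualities.

def somatic_candidates(tumour_calls, normal_calls, dbsnp_sites):
    """tumour_calls / normal_calls: dict of (chrom, pos) -> alt allele.
    dbsnp_sites: set of (chrom, pos) known polymorphic positions."""
    somatic = {}
    for site, alt in tumour_calls.items():
        if site in dbsnp_sites:
            continue                      # likely germline polymorphism
        if normal_calls.get(site) == alt:
            continue                      # present in matched normal: germline
        somatic[site] = alt
    return somatic

# Hypothetical call sets:
tumour = {("chr3", 178_936_091): "A", ("chr17", 7_578_406): "T",
          ("chr1", 1_234_567): "G"}
normal = {("chr1", 1_234_567): "G"}       # germline variant
dbsnp = {("chr3", 178_936_091)}           # known SNP site
print(somatic_candidates(tumour, normal, dbsnp))
# -> only the chr17 substitution remains as a somatic candidate
```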

CNV: easy to obtain, and cleaner than array data.

Structural variants from paired reads, done genome-wide. 50 of the mutations interrupt genes (of 125 in the tumour only).

Rearrangements: can also look at that. (Saw many rearrangement events).

Structural variants at basepair resolution. (Using Velvet... good job Daniel).

Last thing of interest: Small indels (less than 10bp.) Paired end reads, anchor with one end.

Medium indels can be found by identifying a deviation in insert size (Heather Peckham). You can see a shift in size... not an actual significant change. [interesting method] Can be seen in a comparison between normal and tumour.
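
My rough reconstruction of that insert-size idea (not the actual method): read pairs spanning a medium deletion map further apart than the library's nominal insert size, so the mean observed insert over a window shifts, and you can compare tumour and normal pairs over the same window. All numbers below are invented.

```python
# Sketch of detecting a medium indel as a shift in observed insert sizes over
# a window, compared against the library's nominal insert distribution.

from statistics import mean

def insert_shift(observed_inserts, expected_insert, expected_sd):
    """Z-score of the mean observed insert size against the library expectation."""
    n = len(observed_inserts)
    if n < 2:
        return 0.0
    return (mean(observed_inserts) - expected_insert) / (expected_sd / n ** 0.5)

# Hypothetical pairs spanning one window (library nominally 400 +/- 40 bp):
tumour_inserts = [455, 470, 462, 448, 475, 460]   # ~60 bp deletion signature
normal_inserts = [395, 410, 402, 388, 405, 399]

print(round(insert_shift(tumour_inserts, 400, 40), 1),   # large positive z
      round(insert_shift(normal_inserts, 400, 40), 1))   # ~0
```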

To summarize: somatic variants throughout the genome. Circos plots (=
Somatic mutations, functional impact? Recurrence? Pathways?

Labels:

Christopher Maher, University of Michigan - “Integrative Transcriptome Sequencing to Discover Gene Fusion in Cancer”

80% of all known gene fusions are associated with 10% of human cancers. Epithelial cancers account for 80% of cancer deaths, but have only 10% of known fusions.

Mined publicly available datasets and looked for genes with outlier expression.

Will use next-gen sequencing to get direct sequence evidence of chimeric events. Decided to use both 454 and Illumina. Categorized reads: mapping, partially aligned, non-mapping. Used the same samples. [whoa... the classification just got extensive... moving on.]

Chimera discovery using long read technology. Sequenced: VcaP, LNCaP, RWPE. Found 428, 247, 83 chimeras respectively.

Then added illumina. First checked that they could find the fusion that they know. 21 reads mapped there.

Found both intra- and inter-chromosomal candidates, and then validated 73% of them.

So, to recap: taking only candidates found by both 454 and Illumina was MUCH more selective – they found they were throwing out false positives, but keeping all the known targets.

Confirmed results with FISH.

Next expt: identification of novel chimeras in prostate tumour samples. Found candidate sequences from non-mapping reads, then worked to validate. How does it work, and what's its frequency? Found it in 7 metastatic prostate tissues and it is androgen inducible. In a meta study, found the fusion of interest in about 50% of prostate cancers.

Came up with a chimeric classification system: 5 classes: inter-chromosomal translocations, inter-chromosomal complex rearrangements, intra-chromosomal complex deletions, intra-chromosomal complex rearrangements...

Summary: validated 14 novel chimeras
Demonstrated cell line can harbor non-specific fusions...
[too slow to catch last point]

Answer to question: 100bp reads would have been long enough to nominate fusions.

Labels:

Anna Kiialainen, Uppsala University - “Identification of Regulatory Genetic Variation That Affects Drug Response in Childhood Acute Lymphoblastic Leukemia”

Review:
1.most common in children
2.20% do not respond to treatment
3.multi-factorial disease

Allele-specific expression is important. Normally you have two copies of each gene; however, you can get different ratios of expression from each allele, which leads to very different proportions of each allele in the sample.

Advantages: they can serve as internal standards for each other.

Causes: SNPs that affect transcription or stability. Or, allele-specific promoter (regulation or methylation).
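
To make ASE detection concrete (my sketch, not the Uppsala pipeline): at a heterozygous coding SNP, count the RNA reads supporting each allele and test against the 50/50 split you'd expect under balanced expression; monoallelic expression shows up as an extreme ratio. The counts below are hypothetical.

```python
# A minimal sketch of calling allele-specific expression at a heterozygous
# coding SNP from RNA read counts for the two alleles.

from math import comb

def two_sided_binom_p(k, n, p=0.5):
    """Two-sided binomial p-value by doubling the smaller tail (capped at 1)."""
    tail = min(k, n - k)
    lower = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(tail + 1))
    return min(1.0, 2 * lower)

def classify_ase(ref_reads, alt_reads, alpha=0.01, mono_fraction=0.95):
    n = ref_reads + alt_reads
    if n == 0:
        return "no coverage"
    if max(ref_reads, alt_reads) / n >= mono_fraction:
        return "monoallelic"
    if two_sided_binom_p(ref_reads, n) < alpha:
        return "allele-specific expression"
    return "balanced"

# Hypothetical read counts at three heterozygous SNPs:
for ref, alt in [(48, 52), (80, 20), (99, 1)]:
    print(ref, alt, classify_ase(ref, alt))
```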

Samples: 700 children with acute lymphoblastic leukemia. Yearly follow-up data and drug response, in vitro drug sensitivity, immunophenotype, cytogenetic data. RNA available for 1/3 of them.

Genotyped over 3531 SNPs in 2529 genes. ASE was detected in 400 (16%) of the informative genes, 67 of which displayed monoallelic gene expression. (Milani L, et al, Genome Research 2009)

Methylation analysis: Selection of 1536 CpG sites from >50,000 CpG sites in genes displaying ASE. Custom GoldenGate methylation panel. (ibid)

SNP discovery: 56 genes displaying ASE selected for sequencing in 90 ALL samples. Template preparation with Nimblegen sequence capture. Illumina sequencing.

To date: 16 samples hybridized. 5 samples sequenced with the GA I. 81-97% align to the genome (Eland), 28-67% align to the target region (MAQ).

Overview of sample sequencing coverage.

Initial SNP discovery – 2063-4283 SNPs found per sample with MAQ. 3422 in at least two samples. 818/3422 are novel (not in dbSNP).

My comments: This is an interesting talk, from the big picture view. Dr. Kiialainen is spending a lot of time talking about metrics that haven't really been used much this year (percent alignment, etc.) and explaining figures that are relatively simple. There wasn't much data presented – essentially it's no more than an outline and statistics of the sequences gathered. Not my least favourite talk, but had very little content, unfortunately. Knowing you can do allelic studies is neat, however, which is clearly the best part of the talk. My advisor is chairing the session... and asking the same questions he asks me. Nice to know I'm not the only who answers with “No, I haven't done that yet!”

Labels:

David Dooling, Washington University School of Medicine, “Next-Generation Informatics”

[Had to change rooms, missed some of the start of the talk]

The rate of change is far outstripping Moore's law.

So: Framing the problem - Viewpoints:
Lims: [Picture of Richard Stallman.. Nice!] how do we process and track information?
Analysis: [picture of Freud.. also Nice... same beard?] How do we process and extract information?
Project Leads: In, and Out... what's the answer?

Pipelines: Always changing! Buffers, software, tools, etc etc, etc!!!!

Analysis: Changing Pipeline: Proliferation of Data has led to a proliferation of tools.

So how do we do things on a massive scale, but deal with the constant change.

“We've always been pushing the envelope...” using the past as a guide to how to deal with the change.

As developers, put it in terms of flow charts, databases, pathways.. etc. Get a handle on the problem

How we deal with it: Regular entities to event entities to processing directives

The problem comes when the processing directives change... and that's a big change – frequently. So, to deal with it, entities were classified. To apply this, things were abstracted to big units, which can each be modular. By making things modular, they can be substituted.

1.Created an object-relational mapping (ORM) layer.
2.Object Context
3.Dynamic command-line interface
4.Integrated Documentation System.

The ORM was created from scratch because none of the others were able to cope with the workload being demanded of it. Everything works in XML, so you can verify flow, and it makes it easier to do parallelization.
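
The "abstract to modular units so they can be substituted" point is easier to see with a toy example. This isn't the WashU Genome Model code (theirs is Perl; names here are invented), just a generic illustration: if every processing directive is a module with the same tiny interface, swapping an aligner or adding a filter is a substitution, not a pipeline rewrite.

```python
# Not the WashU "Genome Model" code -- just a generic illustration of the
# modularity point. Every step shares one tiny interface, so steps can be
# swapped or inserted without touching the rest of the pipeline.

from typing import Callable, Dict, List

# A "step" is anything that maps an input payload to an output payload.
Step = Callable[[Dict], Dict]

def run_pipeline(steps: List[Step], payload: Dict) -> Dict:
    for step in steps:
        payload = step(payload)
    return payload

def filter_short_reads(payload: Dict) -> Dict:
    payload["reads"] = [r for r in payload["reads"] if len(r) >= 4]
    return payload

def align_with_toy_aligner(payload: Dict) -> Dict:
    payload["alignments"] = [f"aligned:{r}" for r in payload["reads"]]
    return payload

# Swapping the aligner, or inserting a new filter, is just editing this list:
pipeline = [filter_short_reads, align_with_toy_aligner]
print(run_pipeline(pipeline, {"reads": ["ACGT", "AC", "GGATC"]}))
```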

All of these things together become “Genome Model”, which is a thin wrapper around all their tools, which give you massively parallel system with excellent data management and reporting.

Yikes... has an easy PERL API. [Everyone likes perl? Count me out.]

working model for employees: Pairing: analysts are paired with programmers so that better software is written.
Challenges:
Still much more to do.
Sequencing is demolishing Moore's Law
The cult of traces – the desire to have raw information at our fingertips. (Venn diagrams don't scale well, but things like Circos do!)

Labels:

Pacific Biosciences - Steven Turner " Applying Single Molecule Real Time DNA Sequencing"

Realizing the power of polymerase SMRT.
Each nucleotide is labeled uniquely; the fluorophores are cleaved off, leaving behind just the DNA. Using a zero-mode waveguide, only the nucleotide being incorporated is seen. [cool videos].

At end of signal, it just moves on to the next base...

Every day, they're working on SMRT – showing a demo run... 3000 reactions in parallel. Multi-kb genomic fragment. Just one polymerase. Similar to electrophoresis... keeps going.... and going and going. Real time – several bases per second. Put it at the bottom of the screen, and it just keeps going on and on and on. [it IS transfixing.]

Start with genomic DNA, shear it by any method you want, and now, ligate with HAIRPINS! It's now circular.... so you can keep going around and around. You get both sense and antisense DNA. Can close any size... call it a “SMRTbell prep” (eg, not a dumbbell... heh... not really that funny.) They also use a strand-displacing enzyme, so it just displaces what was already there.

First project was a human BAC, last November: 107kb of chromosome 17. Production read length: 446bp. Max read: 2,047bp. Aligned to NCBI, and validated by Sanger.

In non-repetitive regions: 99.996% accuracy. Missed 3 SNPs that were false negatives. Repetitive: 99.96%, missed 7 bases. Have made significant progress since then...

Sequenced E. coli to 38-fold... 99.3% (last January), max read length at 2800 bases. 99.9999992% [I hope I got that right!]

4 errors + 1 variant on whole genome (Q54!)

Heh... they had issues from an artifact caused by more DNA closer to the ORI in E. coli, from stopping cultures in mid-phase. They now have such incredible accuracy that they can measure it.

Accuracy does not vary more than 5% over 1200 bases. Heads for Q60 around 20-fold coverage.
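
Quick aside for anyone (like me) who has to keep re-deriving it: the Q values being thrown around here are just Phred-scaled error rates, Q = -10 log10(P_error). A tiny converter for sanity-checking that kind of figure; the exact numbers in my notes above may well be garbled.

```python
# Phred-scale sanity check: Q = -10 * log10(P_error).

import math

def phred(p_error):
    return -10 * math.log10(p_error)

def error_rate(n_errors, n_bases):
    return n_errors / n_bases

# e.g. a handful of consensus errors over a ~4.6 Mb E. coli genome:
print(round(phred(error_rate(4, 4_600_000)), 1))   # ~Q60 territory
print(round(phred(1e-5), 1))                       # Q50
```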

8-molecule coverage. (8 individual DNA strands have contributed.) Dependent on the fluorophores... they each show different brightness profiles. So, some channels are still weak, but they have new ones in development to replace them.

One example: First time you can bridge a single 3200bp region. 3bp/sec. (2.6kb duplication region in the middle.)

Development: average of 946bp read length... and up to 1600 at the high end. You trade throughput for read length... at one end, fewer SMRT waveguides complete, but with long reads; at the other, more complete at shorter read lengths.

Consensus on a single molecule. You can also do heterogeneity. If you put in mixes, you get out a mix, with a linear relationship to the fraction recovered. (eg, snps will be very clean.)

Flexibility: you can do long OR short reads. Redundancy is high, so you can get 1ppm sensitivity. 12 prototype instruments in operation. Expect delivery in Q3 2010.

Labels:

A quick note...

I don't know if anyone is following along with my (terribly disorganized) notes from AGBT today as I haven't had any comments, but I figured I should just mention that there are a few things I've left out, and haven't had the chance to blog. For instance, I didn't mention some of the neat people I've met today, the old friends I've caught up with, the bloggers I had lunch with, or the dinner panel hosted by Pacific Biosciences... or even the three random people who recognized my name and said hi. (Apparently there are people who read my blog out there!)

So, just so you don't think I'm skimping out on those things, too, I will get around to talking about them at some point - when I get to plug in my laptop and type at my leisure... anyhow, more things are starting. Later.

Adam Siepel, Biological Statistics & Computational Biology – Comparative Analysis of 2x Genomes: Progress, challenges and opportunities.

Working on the newly released mammalian genomes at 2x coverage. We're rapidly filling out the phylogeny, so there's a lot of progress going on. We can learn a lot by comparing genomes more than we can by looking to a single genome.

Placental mammals (Eutherian) are well sequenced. The last of the 2x assemblies were released just last week. There are 22 genomes being focussed on: most are 2x, a couple are 7x, and some are in progress of being ramped up.

One of the main obstacles is error (sequencing or otherwise). Miscalled bases and indels from erroneous sequences have a big impact. Thus, the goal is to clean up the 2x sets. In 120 bases, 5 spurious indels and 7 miscalled bases. [Wow, that's a lot of error.] Nearly 1/3rd of all 1-2 base indels are spurious.

Thus, comparative genomics often gets hit hard by these errors.

A solution: error-correcting codes: use redundancy to systematically reduce error. In some sense, there is a version built in – we can use comparative genomics to “decode” the error-correcting code. This can be done because the changes between species tend to vary in predictable ways.

The core idea: indel imputation: “Lineage-specific indels in low-quality sequence are likely to be spurious.”

Do an “automatic reconstruction” using parsimony... if a lineage-specific indel is in low-quality sequence, then assume it's an error. More computationally intense methods are actually not much more effective.

There is also base masking – don't try to guess what the bases should be, but just change them to N's. Doing these things may change reading frames, however.
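
Here's my own cartoon of that idea (definitely not their actual method, which uses real parsimony over the tree): in a column-wise view of a multiple alignment, a gap that appears in exactly one species, and only where that species' quality is low, gets treated as a spurious indel and masked rather than as a real lineage-specific event. The alignment and quality scores below are invented.

```python
# A cartoon of the indel-imputation idea: mask a gap that is unique to one
# (low-quality) lineage instead of treating it as a real indel. Simplified:
# real methods reconstruct ancestral states with parsimony over the tree.

def mask_spurious_indels(alignment, qualities, min_quality=20):
    """alignment: dict species -> aligned sequence (same length, '-' for gaps).
    qualities: dict species -> per-column quality scores (same length)."""
    species = list(alignment)
    length = len(next(iter(alignment.values())))
    cleaned = {s: list(alignment[s]) for s in species}
    for col in range(length):
        gapped = [s for s in species if alignment[s][col] == "-"]
        if len(gapped) == 1 and qualities[gapped[0]][col] < min_quality:
            cleaned[gapped[0]][col] = "N"   # mask the suspect position
    return {s: "".join(seq) for s, seq in cleaned.items()}

aln = {"human": "ACGTACGT",
       "dog":   "ACGTACGT",
       "shrew": "ACG-ACGT"}            # 2x genome with a lone gap
qual = {"human": [60] * 8,
        "dog":   [55] * 8,
        "shrew": [40, 40, 40, 7, 40, 40, 40, 40]}
print(mask_spurious_indels(aln, qual))
```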

After doing the error correction, the error appears to drop dramatically. (I'm not sure what the metric was, however.)

Summary: good dataset with some error. Correction method used here is a “blunt instrument”, many or most errors can be masked or corrected if some over-correction is allowed. There is a trade off, of course.

Conservation has its own problems as well. Thus, they have been working on new programs for this type of work: PhyloP. It has multiple algorithms for scaling phylogeny, and the like. Extensive evaluations of the power of these methods were undertaken. However, the problem is that people are at the limit of what they can get out of conservation, depending on what's there. Power is pretty reasonable when selection is strong, or when the elements are longer (e.g. 3bp.)

Discussing uses of conservation.... moving towards single base pair resolution.

Labels:

Jeff Rogers, Baylor Human Genome Sequencing Center - “Linking the fossil record with comparative primate genomics”

Recently moved, so he doesn't have a lot of results to discuss. Thus, this will be from the perspective of a primatologist. What are the implications of next-gen sequencing for interpreting genomic comparisons? Obviously it's huge.

First non human primate = chimpanzee (our closest species “relative”). Second was Rhesus Macaque (Single most important for biomedical research.)

Just about all of the species that are representative on the chart, (Raaum et al, 2005, J. Hum evolution 38, 237-257), are now done or slated to be sequenced soon.

Talk about baboons for a while – there are a lot of old world monkeys related to baboons, and all of them will be sequenced. There are desert baboons, plains and grassland baboons and rainforest baboons. [Ok, this is a lot of baboon information. And I've never seen so many baboon pictures in a presentation.] There are high mountain primates too, and they apparently eat grass – and are closely related to baboons. [Who knew.]

And now macaques – there are 22 species.. I think only two of them are being sequenced.

Dr. Rogers is suggesting we re-think how we decide which species to sequence, to best use the next-gen sequencing technology. Pick a few “focal species”, and then do all of the other species too. This would help with closely related genomes. It would help us get more information about genome evolution and dynamics, more information about ancestral genomes, and finally, more benefits to biomedical research. Discovery of novel animal models for studies of disease and normal functional variation.

This is, apparently, not a new proposal. They did a similar concept for Rhesus monkeys and orangutans. [So if they're already doing it, what is he proposing that's new?] What he's suggesting is to expand the breadth and depth of the sequencing. [ok. Fair enough.]

So, back to baboons. The olive baboon is the main reference baboon, but the others should all be done.. they're different in habitat and behaviour.

At Baylor: do the reference baboon to 3x Sanger, 4x 454. Lots more statistics presented. Take-away message: better coverage = better sequence information. You can also get good results by mixing Sanger with 454.

[I lost focus here... I didn't see much worth writing down.]

Moving on to evolution and species... When you sequence another primate, you don't get a look at the ancestor. [dude... evolution 101.] Talking about some of the other species that have died out – fossil species.

Now we're on to locomotion. Some monkeys walk on all fours, putting weight on the palms of their hands. Others (apes) walk mainly on the feet? [I'm not so sure what this is about.] I think the point is that there are some ancestral monkeys that walked differently than any of the other living apes and monkeys. [Interesting, but highly unrelated to anything else at this conference.]

It's controversial: human ancestors were never knuckle walkers, unlike chimps, gorillas, etc... thus, knuckle walking evolved twice, and humans evolved bipedal motion separately, without a knuckle-walking intermediary.

Conclusion: there is a lot of diversity among primate, both living and extinct. We should study it to help understand our own biomedical applications.

My Comments: The wrong talk for this conference, but a nice diversion. The content was provided slowly enough that everyone could follow, which made for easy note taking. Kinda like an undergrad lecture in primatology. Still, I'm really at a loss to explain why anything presented here really is a consequence of the advent of new sequencing technology.

Phil Stephens, Structural Somatic Genomics of Cancer

Andy Futreal could not show up – he got snowed in in Philadelphia.
All work is from the Illumina platform.

Providing an overview of multistep model of cancer...

Precancer to in situ cancer, to invasive cancer, to metastatic cancer.

They believe cancers have 50-100 driver mutations plus ~1000 passenger mutations. Some carry 10s to 100s of thousands of passengers.

The big question is how to identify the driver mutations. Today, the focus is on structural variations, 200bp to 10s of Mb. These can be seen as copy number variations or as copy-number neutral (balanced translocations, etc.)

For instance, the upregulation of ERBB2, by causing multi copies.

For the most part, we have no idea what the genomic instability does to the copy number at local points, and what they accomplish.

Balanced translocations are interesting because they tend to create fusion genes. There are at least 367 genes known to be implicated in human oncogenesis; 281 are known to be translocated. 90% are in leukemias, lymphomas... [missed the last one]

Use 2nd gen to study these phenomena. For structural variation, it's always 400bp fragments, with PET sequencing. Align using MAQ. Basically, look for locations where the fragments align wider than 400bp. Need high enough coverage to then check that these are real. You then need to verify using PCR – and check whether the germline had the mutation as well. Futreal's group is only interested in somatic changes.
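
The "wider than 400bp" logic is worth spelling out for myself. The sketch below is my bare-bones version (not the Sanger pipeline): flag read pairs whose mapped span is far larger than the library allows, cluster nearby flagged pairs, and keep clusters supported by several independent pairs as candidate breakpoints, which would then go to PCR and germline comparison. Coordinates are made up.

```python
# A bare-bones sketch of paired-end structural variant detection: discordant
# (too-wide) pairs are clustered, and clusters with enough independent
# support become candidate rearrangement breakpoints.

def candidate_breakpoints(pairs, expected_span=400, tolerance=100,
                          cluster_window=500, min_support=2):
    """pairs: list of (chrom, left_read_pos, right_read_pos) for
    same-chromosome read pairs. Returns clusters of discordant pairs."""
    discordant = [p for p in pairs
                  if (p[2] - p[1]) > expected_span + tolerance]
    discordant.sort(key=lambda p: (p[0], p[1]))
    clusters, current = [], []
    for p in discordant:
        if current and (p[0] != current[-1][0]
                        or p[1] - current[-1][1] > cluster_window):
            clusters.append(current)
            current = []
        current.append(p)
    if current:
        clusters.append(current)
    return [c for c in clusters if len(c) >= min_support]

pairs = [("chr8", 1000, 1380),        # concordant
         ("chr8", 50_000, 92_100),    # spans a putative deletion...
         ("chr8", 50_220, 92_300),    # ...supported by a second pair
         ("chr8", 200_000, 200_390)]
print(candidate_breakpoints(pairs))
```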

Now, that's the principle, so does it work? Yes, they published it last year.

NCI-H2171. Has 6 previously known structural changes as controls. Very simple copy number variation identification. Solexa copy number data is at least as good as the Affy chip. They suggest Solexa has the ability to find the true copy number, whereas the Affy chip tends to saturate.

For control, they found intra-chromosomal reads, and then verified with PCR. Two reads mapped to the breakpoint, and were able to work out the consequences of the break. Used a circos diagram to show most translocations are intra-chromosome, and only a small number of them are inter-chromosome.

Since publication, they've now worked on the same project to update the data. They're better at doing what they did the first time around. They redid it on 9 matched breast cancer cell lines, and got ~9x coverage.

HCC38 – no highly amplified regions. Found 289 somatic chromosomal translocations. Most of the changes are due to tandem duplications; however, this was not replicated in another cell line. So, structural variation is highly variable.

Distinct patterns emerged: one line has lots of tandem duplications, one has very little structural variation, and one has a more lymphoma-like pattern.


“Sawtooth” pattern to CNV graph: lots of different things going on. Some are simple, some are difficult.

What are the Structural Variations doing?

Looked at examples of fusion proteins. In one cell line, HCC38, found 4 SVs. Found smaller SVs as well.

Duplications of exons 14 and 15 in one particular gene: receptor tyrosine kinase, which seems to be in the ligand binding domain. Also evidence from many other observations of SNPs in the same domain.

What they didn't know was if it would reflect what's going on in breast cancer. 15 primary breast cancers were then sequenced. (65Gb total).

Huge diversity was found. Anywhere from 8-230+ structural variants per tumour. The same patterns as in the cell lines were found. 11 potential promoter fusions...

[The numbers are flying fast and furious, and I can't get close to keeping up with them.]

151 genes are found in 2 samples. 12 in 3, 5 in 4....

How do you assay for variants? FISH, cDNA PCR! Other mutations in rearranged genes. Whole exome sequencing, transcriptome sequencing, and epigenetic changes are down the road.

Also can look at the relationship between somatic break point positions in the genome.

Conclusions: PET sequencing is useful for structural variation.
The average breast cancer has ~100 somatic mutations.
The average cancer has ~3.2 fusion genes.

Question: Genome vs Transcriptome? Answer: Both!
Question: how many of the hits are false? Answer: at first it was 95%, now it's down to 10%.

My comments: very nice talk. Since this is basically similar to what I'm working on, it's very cool. It's nice to know that PET makes such a huge difference. The paper referenced in the talk was a good read, but I'm going to have to go back and reread it.

Labels:

Tom Hudson – Ontario Institute for Cancer Research, “Genome Variation and Cancer”

Talking about two cancers: Colorectal tumours
1200 cases and 1200 controls
looking for predictors of disease

1536 SNPs from candidate genes, in 10K coding non-synonymous SNPs, Affy 100K and 500K arrays.

Eventually found a hit in a gene desert (a long intergenic non-coding RNA... learned the name this morning (= ). Close to MYC, but hasn't been correlated to anything.

In the last year, 10 validated loci in 10,000 individuals, with very small odds ratios (1.10 to 1.25). One of them is a gene: SMAD7. 5 loci are also near genes that are involved in relevant processes... but are not actually in the genes.

Since there are 10 risk alleles, you'd expect a distribution of counts; however, most people carry 9 (27%)! There is also a linear relationship between the number of alleles and the risk of developing cancer. However, this still doesn't seem to be the causative allele.

Enrichment of Target Regions. Using a specific chip with 3.14Mb colon cancer specific regions. Those regions didn't take all of the space, so they added other colon cancer gene sequences as well.

Protocol: 6ng, fragmentation (300-500bp)... [I'm too slow]

Exon capture arrays are being used. Preliminary results: 40 DNAs: 65 Gb.

Use MAQ to do alignments. Coverage 75% at 10X, 95.6% at 1X.
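Incidentally, numbers like "75% at 10X" are just the fraction of targeted bases whose depth meets a threshold. A tiny sketch of the calculation, assuming you already have a per-base depth array (which the talk obviously didn't show):

    def fraction_at_depth(depths, threshold):
        """depths: per-base read depth across the captured target regions."""
        if not depths:
            return 0.0
        return sum(1 for d in depths if d >= threshold) / len(depths)

    # e.g. fraction_at_depth(depths, 10) -> ~0.75 and fraction_at_depth(depths, 1) -> ~0.956
    # would reproduce the figures quoted above.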

“More than 99% of gDNA has % GC that allows effective capture”

Analyzable Target Regions: 39175, 232 coding exons
Average coverage: 70.3

40 individuals yield 8,706 SNPs
Known: 59.6%
New SNPs: 2,397
Total number in coding exons: 77

Sequencing data compared to Affy data, very high concordance.

Rare alleles may be driving risk in several sporadic cases. Stop codons were found in 6 individuals with sporadic CRC.

Follow up genotyping is required to validate new SNPs and correlate with phenotype.

Second topic: International Cancer Genome Consortium.

“Every cancer patient is different, every tumour is different.” Lessons learned: Huge amount of heterogeneity within and across tumour types. High rate of abnormality, and sample quality matters!

50 tumour types x 500 tumours = 50,000 genomes.

Major issues: Specimens, consent, quality measures, goals, datasets, technologies, data releases.

[Mostly discussion of the mechanics of the project management, who's involved and where it's happening, as well as which tumours, which I'm sure can be found on the OICR's web page. OICR is committed to 500 tumours, using Illumina and SOLiD. They are also creating cell lines and the like, so there will be a good resource available.]

Pancreatic data sets should be available on the OICR web page by June 2009.

Question: why Illumina and SOLiD? Answer: they didn't know which would mature faster. By doing both, they have more confidence in SNPs. They never know which will win in the end, either.

My Comments: Not a lot of science content in the second half, but quite neat to know they've had success with their CRC work. It seems like a huge amount of work for a very small amount of information, but still quite neat.

Labels:

Keynote Speaker: Eddy Rubin, Joint Genome Institute - “Genomics of Cellulosic Biofuels”

They're funded by the DoE, so they have a very different focus. After all the work they did, they realized that the E stood for Energy, so they've started working on that. (-;

More than 98% of energy in transportation is from petroleum, for which there are environmental and political consequences. They've known about it for 30 years, but haven't worked on it much yet.
Churchill quote: You can count on Americans to do the right thing, as long as they've exhausted all other options.

The focus is on things like biofuels, mostly cellulosic biofuels. For those who don't know, that basically means using biomass – mostly cellulose. Many current technologies just use the sugar (edible) part of the plant – but cellulosic energy would use the non-edible parts of the plant, e.g. the cellulose.

Every gallon of cellulosic biofuel produces 12x less CO2, and 8x less than corn biofuel.

How do the genomes of bioenergy plant feedstocks help?

10k-fold increase in energy derived from domesticated grasses and wheats as compared to wild grains. So, domestication is a big deal. Can we domesticate Poplar?

If we could choose, we'd like short, stubby trees with compact root systems. There are groups that are systematically manipulating Auxins to try to cause this to occur. They've had some success. Can create shorter, stubbier trees, or trees with thicker trunks. So, it's working reasonably well.

Poplar is niche, though. The real thing is grasses. They can be harvested, and they don't need to be replanted. (Something about them squirting their nutrients into the soil at the end of the year...)

Anyhow, there are already organisms that do cellulosic breakdown, so those should also be sequenced.

(One of them is a “stink bird”, which belches and smells... odd. Another is a ship-boring mollusk, which digests ships' bottoms.)

Can we replicate cellulosic degradation like that found in intestinal environments?

To dissect termites, you chill them on ice, and then pull their heads from their tails, and eventually the guts are displayed. Ok, then.

Once you have the guts, you can sequence the microbiome. Doing so, they found more than 500 Cellulose and Hemicellulose degrading enzymes.

They also work with cow guts (fistulated cows). The volume obtained from 200 termites: 500 µl. The amount from one cow: 100 ml.

(for the record, pictures of wood chips after 72 hours in a cow stomach – not appealing.)

One experiment that can be done is to feed the cow various types of feed to see what enzymes are being used. The enzymes being used are very different, but the microbial community is the same. This is a new source of enzymes for degradation of energy crops.

The final step in this process: conversion of biofuels to liquid fuel. The easiest route is fermentation. More than 20% of the sugars you get from degrading wood are xylose... and it's not being fermented. So, organisms that use xylose and convert it to ethanol have been found and are being used.

Ethanol has problems, though – transportation and efficiency of production. Ethanol kills the organisms that produce it.

“Ethanol is for drinking, not for driving” Jay... [missed the last name]

Pathway engineering is going to become an important part of the field, so that organisms will do the things we want them to. [Sort of seems like a shortcut around diversity... I wonder if people will be saying that in 10 more years.]

My Comments: This was a pretty standard talk about cellulosic biofuels/ethanol. I saw similar talks in 2006, so I don't think much has changed since then, but the work goes on. I don't know why it was a keynote, in terms of subject, but it definitely was a well-done talk!

Labels:

Oyster genome...

Note to my boss: Shenzhen is sequencing the oyster genome.  See, you should have sent me back to Tahiti to work on the pearl oyster genome! (-;

Labels: ,

Jun Wang, Beijing Genomics Institute at Shenzhen - “Sequencing, Sequencing and Sequencing”

Shenzhen is one of the biggest sequencing centres in the world – both in sequencing throughput and in computing capacity.

With >500Gb per month, what would you do?

The obvious choice is to do whole genomes: from the giant panda to the tree of life. (Is the panda really a bear?) Formal reason: they eat bamboo, are cute and nice... and they're cute! Ok, the real reasons: they selected an animal “without competition” for sequencing, it has a significant “Chinese element”, and it's a proof of concept that short read lengths are good for assembling a large genome.

Why do we need longer reads? 10 years ago, the question was: can you sequence by shotgun sequencing? Yes... now, can we do it with short reads? Yes, but there are questions:
  • Read length: the longer the better.
  • Insert sizes: for finishing, this becomes important.
  • Depth: determines quality.

Why short reads work: most of the genome is really unique anyhow. Insert size is probably the most important matter.

( Started with a pilot project: cucumber. )

Panda: has 20 chr + X/X. Did inserts from 150 to 10,000 bp. 50X sequence coverage, 600X physical coverage.

Genome coverage is 80%. Gene coverage 95%, Single base error rate is Q50, less than 1/100kb.
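(My own aside, for reference: on the standard Phred scale Q = -10 * log10(p), so Q50 corresponds to an error probability of 10^-5 – one error per 100,000 bases – which is consistent with the "less than 1/100kb" figure above.)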

Gene stats: 27.8k homologous to dog genes.

Of the sequenced genomes, the panda is evolutionarily closest to dog, and next closest to cat. (But the panda is a bear.) Its evolutionary rate is slightly higher than dog's. Would like to add significant species to the tree of life.

One of the original questions on what to sequence: “Tastes good, sequence it!” Now, it's close to 50% of the major dinner table! [yikes]

Instead, now proposes cute things: Penguins!

Aiming to sequence “big genomes”, 100Gb+ genomes.

The first Asian genome was sequenced last year.

Is one genome enough? No, probably not... Need 100's to study population genetics. Now taking part in a 100-genome project. Committed to 3 Tb (about 500 individuals).

De Novo assembly is the only solution for a complete structure variation (SV) map. Still too expensive, though.

Started a new project sequencing Asian cancer patients. The cost is about $4000-5000 per sample. [I missed how many per person]

Top 10 causes of death for Asians... start to rank them, and decide which to attack.

4P healthcare (personalized medical care) is coming, all based on personal genomics. Picture of FAR too many people on a beach in China.

Already sequenced all major rice cultivars. Found many selective sweeps – lots of new variation?

Also working on Silkworm study... [this is just rapidly turning into a list of projects they've started. Interesting, but nothing much to gain from it.]

DNA methylome: just finished the first Asian version.

Also working on methylation that changes as you climb mountains. [Ok, I just don't really get this one.] High altitude adaptation... [but why is this a priority?]

[At the bottom of the slide it said “Work? Fun? Science?” I'm not really sure if that was any of the above.... strange.]

Also doing Whole Transcriptome. Several species, plants, insects, etc.

You need huge depths (400x) to get all transcripts expressed at a level greater than or equal to 1, but the required depth decreases from there.

Also started a 1000-plant collaboration. Genomics has barely scratched the vast biodiversity on the planet. They are going to start working on this, from algae to flowering plants.

1Gb of transcript sequencing per sample would be equivalent to 2M EST.
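(My own back-of-the-envelope check: assuming a typical EST read length of roughly 500 bp, 1 Gb / 500 bp ≈ 2 million reads, which is presumably where the 2M-EST equivalence comes from.)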

Now doing 75 bp PET reads.

Another project: Metagenomics of the Human Intestinal Tract.

“Sequencing is Basic” [eh?]

Question asked: How many people are there? Answer: 1000, over 3 campuses, mostly young university dropouts who work hard and sleep in the lab!

My Comments: It's interesting to know what these guys are doing, but it just seems really random. They may be the biggest, but I wonder where they're going with the technology... It appears to be a technology in search of a project, unlike the way the rest of the world is working towards projects and then applying the technology. Maybe someone else can figure out what their underlying goal is and explain it to me. :/

Labels:

Les Biesecker, NHGRI/NIH - “ClinSeq: Piloting Large-Scale Medical Sequencing for Translational Genomics Research”

Clinicians are more conservative than biologists. Change will be hard, and will take time to see which of them work, and how to use them. [Interesting observation.. but not new]

First figure. Three main traits: genome breadth, # of subjects, clinical data. Each displayed on a different axis. [Uh oh, I can already tell this talk is going to be way outside of the realm of my interest, but I'll try to take decent notes.]

We basically want all of these: lots of genomes, lots of people, and lots of clinical data – that gives you the ideal study.

Genetic architecture of disease: it's a spectrum, with rare to common alleles, and with high and low penetrance, with lots of admixture of diseases and phenotypes. (Using a Yin-Yang variant diagram to explain it.)

What we need to do is develop the clinical infrastructure to allow this type of data to be produced. We need to get to the point where we have a clinician with a patient in one room with full access to the patient's genome. [odd, that.. do you really think that's the way to go?]

Initial approach to one study (atherosclerosis):
  • 1000 subjects
  • Initial phenotype
  • Sequence 400 candidate genes (“Completely wigs [clinicians] out.”)
  • Associate variants with phenotypes
  • Return results.

They told people not to sign up for the project unless they were willing to have their whole genome sequenced and used for the study. Apparently, sequencing genomes for clinical purposes is a “radical” idea. However, they have really been overwhelmed by applicants. Currently sequenced 300 patients, have ~600 people recruited. (Using the current PCR pipeline.) The idea is to use an older technology, and then once the pipeline is in place, do the substitution, so that everything is in place. The key bottleneck is the bioinformatics, not the generation of data.

Something about “CLIA”, the process a sample can flow through so that results can be returned to patients... I've never heard of it, but it's apparently part of clinical studies.

Coverage: 140 genes, 402 kbp.
Variants: oops, too slow. ~3000?

985 overlapped with HapMap.

Uncommon alleles chart – seems to have a lot of very uncommon SNPs, so you're still finding lots of new SNPs.

Back to CLIA. They have a data flow pipeline, which brings the patient back to the clinic, so that they can review the results.

List of subprojects: positive controls, validating recent associations of rare variants & phenotypes, sequencing genes under GWAS peaks for rare, high-penetrance variants, testing associations, control cohort for other sample sets, search for miRNA variants, cDNA sequencing to measure expression, capture method refinement, patient motivations and preferences for results of medical sequencing, testing automated vs manual pedigree acquisition.

One positive example: By doing genomic sequencing (and several other tests), they identified a patient who had a mutation in LDL, which changed the way that that whole family is being treated. [Neat.]

Of interest: compared their results to another study and showed that they had a completely different result. [I missed what that other study was.... different genotyping variants found, I think.]

Penetrance: we currently know how to work with high penetrance variants, and so maybe that's where we should start, and then wander down the penetrance curve till we get to the low end. The ones at the VERY low end, Dr. Biesecker claims, are not clinical tests... they are just “noise”, if I understand him correctly.

Classic paradigm: hypothesis, phenotype, apply assay, correlate.
New paradigm: apply assay, sort genotype, generate hypothesis, sort phenotype, correlate.

There are no conclusions to the talk or the study, because they're just getting underway. Many patients are interested in this research, and don't shy away from genome sequencing. We can use this pipeline to look for variants... and it will accept new sequencing technologies as they are developed. When exon sequencing is ready, they'll do it... and one day they'll move to whole genome.

My comments: Actually, not a bad talk, but really so far outside of the realm of what I'm used to working on that I'm not sure what to make of it. Doing whole genome association is never easy, and the assertion that we need to get there eventually is good. He acknowledged that we don't know how to get there – and that's not really a surprise for the clinical setting.

Labels:

Mike Kozal, Yale University, “Ultra Deep Sequencing and Other Genotyping Technologies to Detect Low-Abundance Drug-Resistant Viral Variants”

Lots of political jokes. “Following Eric Lander, I feel like McCain following an Obama speech...”

Talking about three specific viral pathogens: HIV, HCV, HBV.

The percentage of people surviving longer with AIDS is increasing. We're getting better at taking care of these patients. The proportion of patients dying with AIDS is actually approaching that of non-infected people. A 25-year-old with AIDS can now be expected to live another 40 years. With all the medications available, they can halt replication of the virus. The disease burden is still a problem, however.

There are a few patients, however, who have resistant strains. Clearly, this comes from people who have the disease and are getting treatment but are still transmitting. About 10% of the population have resistant strains.

Therefore, in the clinic, HIV genotyping is standard. 1000's are being ordered daily in the US.

In addition to the “sloppy” polymerase, ~10 billion viruses are produced a day in a single person. (Wow... no wonder it mutates rapidly!)

Dr. Kozal is now covering some of the tools used for genotyping HIV, as well as how resistance forms... Highly useful for clinicians, though maybe not so much for those who are already familiar with resistance development. (Re-emergence model.) Clinics still use Sanger-based sequencing – so re-emergence is a problem when sequencing can't detect sub-populations.

One major problem in sequencing viruses is linkage: how do you distinguish three separate variants from one variant with three mutations? This could be very important in how the drug treatment is applied.
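To make the linkage point concrete, here's a toy sketch (my own illustration, with a made-up data structure): if individual reads span all of the variant positions, you can group reads by the alleles they carry and see whether the mutations sit together on one haplotype or are spread across separate variants.

    from collections import Counter

    def haplotype_counts(reads, variant_positions):
        """reads: list of dicts mapping reference position -> base observed in that read."""
        counts = Counter()
        for read in reads:
            # only reads covering every variant position are informative about linkage
            if all(pos in read for pos in variant_positions):
                counts[tuple(read[pos] for pos in variant_positions)] += 1
        return counts

    # One dominant triple-mutant haplotype vs. three separate single-mutant haplotypes
    # would show up as very different count patterns here.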

Note: Oddly enough, at this point, somehow Dr. Kozal has switched to discussing amplicons as a sequencing strategy. I'm a little confused as to what process he used, since this seems to be in the context of 454 sequencing... I hadn't realized that 454 sequencing required cloning. Or I could be very confused about what he's describing.

Talking about a study done, now, in which some patients failed out quickly – presumably because they harboured a mutation that allowed the virus to dodge the drug being tested. Most of the variations were found at VERY low abundance (<5%) and could not be detected by Sanger seq. Depending on type and level of variant, the patient's time to failure could be predicted.

Switch to Hep B.

Describing Sanger approach to screen, where early studies showed that variants have clinical implications. Main limiting factor is the ability to extract sufficient viral genomes for the assay.

Jumping back to other viruses: apparently 34% of HIV patients who were classified as “wildtype” (by Sanger sequencing) actually have drug-resistant strains when the same study is redone with real-time PCR.

Other technologies: PASS technology (parallel-analysis sequencing?). Can get down to 1%, and able to re-analyze the same cluster. Neat – I missed the journal reference though (font was too small – I'll have to move up for the next session...)

Needs: New diagnostic tools – need to move into clinical settings. How sensitive do you need to be for good outcomes? How do you treat linked mutations? Can linkage be used for better outcome prediction? The current floor of detection seems to be 0.2% (mainly caused by PCR problems).

My comment: This is neat, and is certainly an interesting application where Second-Gen sequencing could have a huge impact, yet the talk is mainly about Sanger based sequencing, and how it should be replaced with new technology, with the ultimate question of “how deep do we need to go?” in the study of viral genomics for clinical use.

Edit: Answering questions with "If I could get access to the president's ear..." while Eric Lander is sitting in the audience.  Nice.

(I think I'm going to have to revisit these notes and clean them up at some point... as I look over my notes, I can see they're pretty messy.)

Labels:

Eric Lander, Broad Institute of Harvard and MIT – The New World of Genome Sequencing

Dramatic increases in data production – we now expect exponential increases in productivity, and exponential decreases in cost. Next Gen sequencing is where all of this comes from – thanks to the major players in the field. 2Gb a day is where we're going now – but by the next meeting it will have changed well beyond that.

We should now consider sequencing as a general purpose tool, the same way that we used to consider computers as specific problem-solving tools but now consider them broadly applicable. We should now use general purpose sequencing devices.

Overview of the talk: epigenomics, variations in known genomes, mutations in cancer, transcriptional profiling, microbes, de novo sequencing.

Alex Meissner gave a good review of epigenomics yesterday. Two major components: histones & DNA methylation. Yesterday's experiment was ChIP-on-chip; however, we now do ChIP-Seq. Compared to ChIP-chip, ChIP-Seq is easier and vastly cheaper... and reproducible. It's now taking over the whole field. Chromatin state maps will shortly be the standard, giving us a complete catalog of the epigenomic signals in all cell states.

Variations in known genomes: our old challenge was to go beyond single-gene diseases, but now we're in a position to take a comprehensive look at the genome. The number of SNPs has “skyrocketed” in the past decade, which allows us to now do 10^6 SNPs on a chip at one time (2007). However, in the past year, there are now 200+ genes associated with diseases. (He no longer makes the slides that show them.)

“We have barely scratched the surface of genes”.

We do well with high frequency SNPs. In contrast, alleles at 0.5-5% frequency are poorly studied. We need to do more sequencing to get at these variants. We're now seeing individuals being resequenced, but we'll start seeing much much more than that in the future – 100's and 1000's of people being resequenced. (Woo... Neanderthal and woolly mammoth shout out.)

A neat example of this is the sequencing of people who interact differently with TB and TB drugs. Apparently there are only about 40 differences that seem to be involved. Another example is stickleback sequencing – where a single lane of illumina is enough to genotype an individual fish.

Cancer genomes: the Cancer Genome Atlas project was formed, and much handwringing has followed about how much we do know and how much we don't. What starts to appear when sequencing begins to shed light on it, however, is that very clear signals show what is involved. New genes, clear breakpoints... and all of this is leading to pathways in cancer. Cool. We now phrase our work in terms of what pathways are hit in which cancer, not single genes being mutated. “Dissecting cancer will require sequencing of 1000's of individuals.” However, we still need to worry about error rates, which haven't yet come down to the sensitivity of the tests we need to do.

WTSS: microRNA, ab initio construction of transcriptomes.... not much said here.

Microbes: we're now sequencing microbiomes for use in energy harvest... again just a quick acknowledgement of the field.

De Novo Genome Assembly: we're still working on it.

This was a quick “whirlwind” tour of what's going on in the field. In this new world of sequencing, will we find completely new phenomenon?

Long intergenic non-coding RNAs (lincRNAs) – the paper just came out this weekend. Extensive transcription in mammals. We now have a better idea of what's being transcribed using the new technology. “Are most non-coding RNA transcripts functional?” (Reviewing various perspectives on it.) Apparently, only about a dozen functional large ncRNAs are known. They are now using epigenomics to figure out what's going on – use the ones that have the expressed-gene marks... there are now 1600 novel signatures that were not known as protein-coding genes (characterizing intergenic K4-K36 domains). So what do these things do biologically? They catalog expression patterns, and can associate them with pathway profiles... etc. Profiling and correlation is the key to solving this mystery, and they all clearly suggest a biological role in the cell.

E.g., some of them are clearly regulated by p53. They seem to be potential repressors of other genes, acting when p53 wants to down regulate some genes or upregulate others. How? Possibly through the Polycomb repressor complex? They're anti-transcription factors! Nifty.

“> 50% of lincRNAs expressed in various cell types bind Polycomb or other factors.” “Suggests whole world of gene regulators!”

My Comments: Wow, that was a pretty decent opening talk. The overview was well done, focusing on the challenges without dwelling on the problems. The final part of the talk focussed on the recent paper on lincRNA, which sounds really interesting. I'm quite interested in following up on that paper. Good timing on having it out before AGBT. (-;

The first question is about Eric Lander's selection to be part of the Obama "team" on science.  Spiffy!  (Quick pump for the hope that Obama stays in power for 8 years...  hehe.)  Apparently, his first question with the science group is "what has happened since the sequencing of the human genome project, and is progress going as fast as expected"?  (paraphrased, of course.)

Labels:

10 years of AGBT

Welcome to AGBT... though, if anyone else is blogging this event, you'll know that talks began yesterday. I ran into Daniel Zerbino this morning, who filled me in a little on the talks from yesterday. Apparently, the evening talks (of which Daniel's was one) were all very good overviews of several fields, from aligners to WTSS to... well, you can check out the agenda at agbt.org/agenda.html

In any case, back to what's going on right now.

The first talk is an overview of the last 10 years – what's happened since the AGBT meeting started in 1999. Apparently – and it's no surprise to anyone who's here – this year is the best attended. There were so many applications for attendance that 400 people on the wait list didn't get in.

I'm learning very quickly that this format is going to be very difficult...

Ok, this talk is quickly degenerating into a roast of some of the top players in the field. 10-year-old pictures of Marco Marra, Craig Venter... ok, pretty much everyone. Good lord. 10-year-old pictures from AGBT are somewhat scary.

Well, as the day goes on, I'll try to adapt the format of these posts to fit things a bit better onto the blog. Bear with my lousy spelling as I try to take notes quickly. (=

Tuesday, February 3, 2009

Countdown to AGBT

Ok, I really don't have much intelligent to add tonight. I'll be flying to Marco Island tomorrow morning, and the blogging will start with Thursday's seminars. I'm just setting up the macbook for the trip, and then I'm going to go upstairs to pack. The poster is already safely in its tube.

Yes, I should really have started packing earlier... but I had to watch the Canucks game. (Their first win in 8 games, if they can just survive the last 10 seconds. Yep, they did!)

See you in a day or two....

Monday, February 2, 2009

two steps forward to realize you haven't gone anywhere...

I'm still trying to finish off my AGBT poster, which is a scary thought. AGBT starts Wednesday, and I haven't sent off anything to the printers yet. In fact, until about 5 minutes ago, I was still processing the raw data. At any rate, it only takes a few minutes to re-generate all of my figures (yay for automating processes), so I know what all of them (except for the two most complex) will look like.

After spending the whole weekend working on this, as well as a significant investment in time over the past week, most of the figures have barely changed. Woo! Adding 5 more cell lines (samples, really) hasn't done much to change things. I suppose that's a direct consequence of the law of diminishing returns, and clearly the returns are diminishing. Of course, that could also be a consequence of the complete lack of saturation of the sequencing of the 5 added samples, but I don't think people want to deal with that yet...

In fact, that's probably one of the biggest issues in genomics: we all want to get the best results with the least investment, and with genomes, the investments are big. To really ensure that the results are the best they could be, I'd have to get several more flow cells of data from each of these new samples.

And, that brings us right back to the question that was asked at several panels at last year's AGBT: how much sequencing is enough? My favorite answer, last year, was "it won't be enough until we've sequenced every person on the planet." Unfortunately for my poster, we're a LONG way from that. But again, how much will we really learn from sequencing the 6 billionth person? I can't imagine it's worth the time or investment by that point. (By the way, if we wanted to do the same thing for some species, such as bowhead whales, you'd have to stop around the 8000th individual – since there aren't any more than that. There are 750,000 people on the planet for every bowhead whale left! But I digress...)

In light of my results, I am rethinking the "how much is enough?" question. Given what I saw today, probably a few 1000 is more than enough, but then again, you have to ask "enough for what?"

For personalized medicine, the answer is clearly going to be to sequence everyone who gets health care (which we hope is everyone.) Unfortunately, the technology required to do that is a long way from where we are now. (Although, we did see some very promising technologies at AGBT last year.) For my poster, though, I wonder if I already have enough to do what I need to. There's my two steps forward... and the realization that taking those two steps probably didn't add much.

You'll notice I also mentioned AGBT a few times in today's post - and some of the reasons why I'm looking forward to another year of intense genomics discussions. New technologies, new methods and discussions with other scientists on where the field is going... and, of course, a lot of opinions on how we're going to get there.

And one thing I didn't mention. In the course of realizing I hadn't gotten any further forwards with my new results, I discovered a few interesting steps sidewise. Isn't that just like science? If you keep your eyes open, you'll find things you weren't looking for. (=

I'm sure there will be lots of surprises at this year's AGBT too.