Thanks for visiting my blog - I have now moved to a new location at Nature Networks. Url: http://blogs.nature.com/fejes - Please come visit my blog there.

Friday, May 1, 2009

2 weeks of neglect on my blog = great thesis progress.

I wonder if my blogging output is inversely proportional to my progress on my thesis. I stopped writing two weeks ago for a little break, and ended up making big steps forward. The vast amount of my work went into FindPeaks, which included the following:

  • A complete threaded Saturation analysis for next-gen libraries.

  • A method of comparing next-gen libraries to identify peaks that are statistically significant outliers. (It's also symmetic, unlike a linear regression based methods.)

  • A better control method

  • A whole new way of analysing WTSS data, which gives statistically valid expression differences


And, of course many many other changes. Not everything is bug-free, yet, but it's getting there. All that's left on my task list are debugging a couple of things in the compare mode, relating to peaks present in only one of the two librarires, and an upgrade to my FDR cutoff prediction methods. Once those are done, I think I'll be ready to push out FindPeaks 4.0. YAY!

Actually, what was surprising to me was the sheer amount of work that I've done on this since January. I compiled the change list since my last "quarterly report" for a project that used FindPeaks (but doesn't support it, ironically.... why am I doing reports for them again?) and came up with 24 pages of commit messages - over 575 commits. Considering the amount of work I've done on my actual projects, away from FindPeaks, I think I've been pretty darn productive.

Yes, I'm going to take this opportunity to pat myself on the back in public for a whole 2 seconds... ok, done.

So, overall, blogging may not have been distracting me from my work, as even at the height of my blogging (around AGBT), I was still getting lots done, but the past two weeks have really been a help. I'll be back to blogging all the good stuff on monday. And I'm looking forward to doing some writing now, on some of the cool things in FP4.0 that haven't made it into the manual... yet.

Anyone want some fresh ChIP-Seq results? (-;

Labels: , , , ,

Monday, June 16, 2008

Random Update on FP/Coding/etc.

I had promised to update my blog more often, but then failed miserably to follow through last week. I guess I have to chalk it up to unforeseen circumstances. On the bright side, it gave me the opportunity to come up with several things to discuss here.

1. Enerjy: I learned about this tool on Slashdot, last week while doing some of my usual lunch hour "open source news" perusal. I can unequivocally say that installing the Enerjy tool in Eclipse has improved my coding by unimaginable leaps and bounds. I tested it out on my Java codebase that has my FindPeaks application and the Transcriptome/Genome analysis tools, and was appalled by the number of suggestions it gave. Admittedly, I'm self taught in Java, but I thought I had grasped the "Zen" of Java by now, though the 2000+ warnings it gave disagreed. I've since been cleaning up the code like there's no tomorrow, and have brought it down to 533 warnings. The best part is that it pointed out several places where bugs were likely to have occurred, which have now all been cleaned up.

2. Threading has also come up this past week. Although I didn't "need" it, there was no way to get around it - learning threads was the appropriate solution to one problem that came up, so my development version is now beginning to include some thread management, which is likely to spread into the the core algorithms. Who knew??

3. Random politics: If you're a grad student in a mixed academic/commercial environment, I have a word of warning for you: Not everyone there is looking out for your best interests. In fact, some people are probably looking out for their own interests, and they're definitely not the same as yours.

4. I read Michael Smith's biography this week. I was given a free copy by the Michael Smith Foundation for Health Research, who were kind enough to provide my funding for the next few years. It's fantastic to understand a lot of the history behind the British Columbia Biotechnology scene. I wish I'd read that before having worked at Zymeworks. That would have provided me with a lot more insight into the organizations and people I met along the way. Hindsight is 20/20.

5. FindPeaks 4.0: Yes, I'm skipping plans for a FindPeaks 3.3. I've changed well over 12000+ lines of code, according to the automated scripts that report such things, which have included a major refactoring and the start I made at threading. If that doesn't warrant an major number version change, I don't know what does.

Well, on that note, back to coding... I'm going to be competing with people here, in the near future, so I had best be productive!

Labels: , , , , ,