GFF3 undocumented feature...
Earlier today, I tweeted:
The attribute "genotype" contains an IUB code that is limited to using either a single base or a double base annotation (eg, it should not contain, H, B, V, D or N - but may contain R, Y, W, S, M or K ), which then allows you to subtract the "reference" attribute (that must be canonical) from the "genotype" attribute IUB code to obtain the new SNP - but only when the "genotype" attribute is not a canonical base.
If only that were documented somewhere...
UPDATE: Actually, this turns out not to be the case at all -- there are still positions for which the "genotype" attribute is an IUB code, and the reference is not one of the called bases. DOH!
Does anyone know how to decypher a diBase GFF3 file? They don't identify the "most abundant" nucleotide uniquely. seems useless to me.Apparently, there is a solution, albeit undocumented:
The attribute "genotype" contains an IUB code that is limited to using either a single base or a double base annotation (eg, it should not contain, H, B, V, D or N - but may contain R, Y, W, S, M or K ), which then allows you to subtract the "reference" attribute (that must be canonical) from the "genotype" attribute IUB code to obtain the new SNP - but only when the "genotype" attribute is not a canonical base.
If only that were documented somewhere...
UPDATE: Actually, this turns out not to be the case at all -- there are still positions for which the "genotype" attribute is an IUB code, and the reference is not one of the called bases. DOH!
Labels: formats, SNP calling, SNP Database, SNPS
0 Comments:
Post a Comment
<< Home