Question: kissplice2reftranscriptome interpreting output
11 weeks ago by
rares_lucaciu0 wrote:


Regarding the kissplice2reftranscriptome main output, for example I have: TRINITY_DN921_c0_g2_i1 bcc_10004|Cycle_0|Type_0a True 100.0 202 TAT CAT ... As I understand it, TAT is the reference codon and CAT is the alternative codon, so "TAT" should be at positions 202-204 of the TRINITY_DN921_c0_g2_i1 sequence.

So I did this: samtools faidx 02_Trinity.fasta TRINITY_DN9185_c0_g2_i1:202-204

TRINITY_DN921_c0_g2_i1:202-204 TAA

In other cases, for example: TRINITY_DN921_c0_g2_i1 bcc_10003|Cycle_0|Type_0a True 100.0 641 GGG GGC

TRINITY_DN9185_c0_g2_i1:639-641 GGC

Can somebody explain the pattern here, or how to find the exact position of the codon in the transcript? Thank you,

10 weeks ago by
vincent.lacroix80 wrote:

Hi Lucaciu,

All the formats we use in our pipeline (.bed, .psl) are 0-based, hence the SNP position we output in the final table is also 0-based. If you want to use samtools faidx (which is 1-based), you should type:

samtools faidx 02_Trinity.fasta TRINITY_DN9185_c0_g2_i1:201-205

You will obtain 5 nucleotides, the central position being the SNP (202 in 0-based is 203 in 1-based).
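The coordinate shift above can be sketched in a few lines of Python. This is only an illustration: the helper name `faidx_region` and the `flank` parameter are hypothetical and not part of KisSplice or samtools. It assumes the reported SNP position is 0-based and that the region string passed to samtools faidx is 1-based and inclusive.

```python
def faidx_region(transcript_id: str, snp_pos_0based: int, flank: int = 2) -> str:
    """Build a 1-based, inclusive region string centred on a 0-based SNP position.

    Hypothetical helper for illustration only.
    """
    if snp_pos_0based < 0:
        raise ValueError("SNP position must be >= 0")
    center = snp_pos_0based + 1          # 0-based -> 1-based
    start = max(1, center - flank)       # clamp at the start of the transcript
    end = center + flank
    # Note: if start was clamped, the SNP is no longer the central base.
    return f"{transcript_id}:{start}-{end}"


# Position 202 (0-based) becomes the window 201-205 (1-based).
print(faidx_region("TRINITY_DN9185_c0_g2_i1", 202))
# prints: TRINITY_DN9185_c0_g2_i1:201-205
```

The printed string can then be used as the region argument of `samtools faidx 02_Trinity.fasta <region>`.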

For your specific example, since the SNP is in the first position of the codon, the codon should correspond to the last 3 nt of these 5 nt, unless your ORF is on the minus strand, in which case your codon should correspond to the reverse complement of the first 3 nt.
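The same reasoning extends to SNPs in the second or third codon position. Below is a minimal sketch, assuming a 5 nt window taken from the plus strand with the SNP at its centre, and assuming you already know the SNP's position within the codon (1, 2 or 3, counted in the ORF orientation) and the ORF strand. The function names are hypothetical, and the window "ACTAT" is a made-up example, not data from this thread.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def codon_from_window(window: str, codon_pos: int, minus_strand: bool = False) -> str:
    """Extract the codon containing the central base of a 5 nt window.

    window       : 5 nt from the plus strand, SNP at the centre (index 2)
    codon_pos    : position of the SNP within the codon (1, 2 or 3),
                   counted in the orientation of the ORF
    minus_strand : True if the ORF lies on the minus strand
    """
    if len(window) != 5:
        raise ValueError("window must be exactly 5 nt")
    if codon_pos not in (1, 2, 3):
        raise ValueError("codon_pos must be 1, 2 or 3")
    if minus_strand:
        # Reading the ORF orientation; the SNP stays at the centre.
        window = reverse_complement(window)
    start = 2 - (codon_pos - 1)  # index of the first base of the codon
    return window[start:start + 3]


# SNP in first codon position, plus strand: last 3 nt of the window.
print(codon_from_window("ACTAT", 1))                     # prints: TAT
# Same window, ORF on the minus strand: reverse complement of the first 3 nt.
print(codon_from_window("ACTAT", 1, minus_strand=True))  # prints: AGT
```

With `codon_pos=2` the codon is the middle 3 nt, and with `codon_pos=3` it is the first 3 nt of the (oriented) window. A window clipped at the start or end of the transcript would need separate handling.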




Hi Vincent,

I'm also having a hard time interpreting the output. I was wondering if you could give me a hand.

When the SNP change is in the second position of the codon, we can predict that the codon of interest is in the middle of the 5 nucleotides that you mentioned. But how can we find the position of the change in the reference fasta file when the SNP is in the first or third position of the codon? Also, how would this read for the reverse complement?

I'm copying below some selected sections of my data.

TrinityID-Position-samtoolsfaidx      #results-faidx  #results-faidx-complemented  #kissplice-position  #Kissplice-codon1  #Kissplice-codon2  #SNP_position_change
TRINITY_DN14342_c0_g1_i1:778-782      AGGAA           TTCCT                        779                  GAA                GGA                2nd
TRINITY_DN19222_c5_g1_i4:1331-1335    CCCGC           GCGGG                        1332                 CCC                CCT                3rd
TRINITY_DN5938_c0_g1_i1:1977-1981     TCCGA           TCGGA                        1978                 TCC                TCT                3rd
TRINITY_DN14441_c0_g3_i1:41-45        CGGCC           GGCCG                        42                   CGA                CGG                3rd
TRINITY_DN19222_c5_g1_i4:1232-1236    GAGAA           TTCTC                        1233                 GAG                GAA                3rd
TRINITY_DN14418_c0_g1_i1:955-959      GGGAA           TTCCC                        956                  GGA                GGG                3rd
TRINITY_DN14441_c0_g3_i1:40-44        TCGGC           GCCGA                        41                   CGA                CAA                2nd
TRINITY_DN19222_c5_g1_i4:1288-1292    CAGAG           CTCTG                        1289                 AAA                AGA                2nd
TRINITY_DN8529_c0_g1_i1:134-138       GCGCT           AGCGC                        135                  CAT                CGT                2nd

Best, Vanessa

written 7 weeks ago by vguerracanedo10
6 weeks ago by
vincent.lacroix80 wrote:

Hi Vanessa,

I am not sure I understand your question. The column SNP_position of the main output file of k2rt should give you the position of the SNP in the reference. The only tricky thing to remember is that this coordinate is 0-based.

If you simply need the position of the SNP in the reference, then you do not need to worry about the strand, or if the SNP is in the first, second or third position of the codon.

Or maybe you need something else?

