Question: More Details About Txdb
6.5 years ago, skm770 wrote:

Hi, I need to find out which genes have only one skipped exon in the annotation database known as TXdb. Is there any way to get more information about TXdb? The SpliceTrap tool comes with a TXdb database for hg19, which includes an evidence file called TXdb.evi, but I need to know more about this file before I can interpret it correctly. Here are a few lines from the TXdb.evi file:

AA-AA-1-100316530-100316590.0[S] NM_000644,ENST00000370165,A3-178-3535-4041[A1][5/5][UPT],uc001dsl.1,
AA-AA-1-100605963-100606005.0[L] uc001dsv.1,ENST00000370141,uc009wea.1,CA-54482-9562-10032-10102-10258[INC][36/2],A3-54482-10102-10365[A0][25/7],NM_019083,
AA-AA-1-100605963-100606005.0[S] A3-54482-10102-10365[A1][25/7],
AA-AA-1-100606407-100606522.0[L] A3-54482-10365-10870[A0][13/6],uc001dsv.1,ENST00000370141,uc009wea.1,NM_019083,
AA-AA-1-100606407-100606522.0[S] A3-54482-10365-10870[A1][13/6],
AA-AA-1-100613449-100613648.0[L] uc001dsv.1,ENST00000370141,CA-54482-13994-17744-18177-18475[INC][4/2][DNT],NM_019083,A3-54482-13994-18177[A0][4/2],
AA-AA-1-100613449-100613648.0[S] A3-54482-13994-18177[A1][4/2],
AA-AA-1-100634150-100634190.0[L] ENST00000342895,ENST00000370136,
AA-AA-1-100634150-100634190.0[S] CA-127495-3081-7845-7885-12621[SKIP][27/4][UPT],uc001dsx.1,
AA-AA-1-100733629-100733707.0[L] A3-8634-3467-5138[A0][4/9/108],ENST00000498617,
AA-AA-1-100733629-100733707.0[S] ENST00000370128,uc001dtc.1,CA-8634-3467-4342-4381-4994[SKIP][9/109],NM_003729,A3-8634-3467-5138[A2][4/9/108],
AA-AA-1-100733692-100733707.0[L] A3-8634-3467-5138[A1][4/9/108],
AA-AA-1-101342417-101342420.0[L] ENST00000535414,NM_001439,A3-2135-20387-21069[A0][42/3],uc001dtl.1,ENST00000370113,NM_001033025,ENST00000450240,ENST00000370114,uc001dtk.1,
AA-AA-1-101342417-101342420.0[S] uc001dtm.1,A3-2135-20387-21069[A1][42/3],
AA-AA-1-101354384-101354420.0[L] NM_001439,uc001dtk.1,
AA-AA-1-101354384-101354420.0[S] uc001dtm.1,uc001dtl.1,NM_001033025,
AA-AA-1-101354384-101354420.1[L] A3-2135-3230-9110[A0][38/81][UPT],
AA-AA-1-101354384-101354420.1[S] A3-2135-3230-9110[A1][38/81][UPT],
AA-AA-1-101456184-101456187.0[L] A3-51611-36170-38736[A0][5/3][DNT],CA-51611-33697-36066-36170-38175[INC][45/4][DNT],
AA-AA-1-101456184-101456187.0[S] A3-51611-36170-38736[A1][5/3][DNT],CA-51611-33697-36066-36170-38178[INC][7/2][DNT],
AA-AA-1-101456184-101456187.2[L] uc001dtz.1,NM_001077394,uc001dty.1,uc001dtu.1,NM_015958,uc001dtt.1,uc001dtv.1,uc001dtw.1,uc001dts.1,
AA-AA-1-101456184-101456187.2[S] uc001dtr.1,NM_001077395,
AA-AA-1-101456184-101456187.3[L] CA-51611-33697-36066-36170-38175[SKIP][45/4][DNT],
AA-AA-1-101456184-101456187.3[S] CA-51611-33697-36066-36170-38178[SKIP][7/2][DNT],
AA-AA-1-101467100-101467143.0[S] uc001dtu.1,
AA-AA-1-101467100-101467143.1[L] ENST00000477293,
AA-AA-1-101467100-101467143.1[S] uc001dty.1,ENST00000498372,ENST00000464270,
AA-AA-1-101540831-101540950.0[L] uc001dua.2,ENST00000421013,
AA-AA-1-101540831-101540950.0[S] ENST00000454721,
AA-AA-1-1019391-1019466.0[S] ENST00000434641,
AA-AA-1-1025808-1025967.0[L] ENST00000473600,A3-54991-27885-29004[A0][3/88],
AA-AA-1-1025808-1026754.0[L] ENST00000477196,
AA-AA-1-1026920-1026945.0[L] uc009vju.1,ENST00000467751,ENST00000379339,uc001acu.2,ENST00000294576,uc001acr.2,uc001act.2,ENST00000427787,ENST00000379319,NM_017891,ENST00000482816,ENST00000477196,A3-54991-27366-27885[A1][3/93/2],ENST00000442117,uc001acm.2,ENST00000448924,uc001acs.2,uc001acp.2,ENST00000462097,ENST00000379325,ENST00000421241,ENST00000437760,ENST00000434641,

There are a lot of naming conventions and abbreviations that I am not familiar with, and I would like help from someone who has worked with TXdb before. For my analysis I need the genes that have only one exon-skipping event. In this file an exon-skipping event appears to be marked by CA, accompanied by one of two annotations in square brackets, INC or SKIP, which might mean included or skipped. There are also other annotations and fractions in the file that I don't know how to interpret. For example, what do the identifiers starting with uc00 mean, what does [DNT] mean, and what do [L] and [S] in the first column mean?
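To make the layout described above easier to inspect, here is a minimal Python sketch that splits the excerpt into records. It only reflects what is visible in the sample, not a documented specification: each record is assumed to be a region ID ending in [L] or [S], followed by a comma-separated evidence list whose items may carry bracketed tags. The function names `parse_entry` and `parse_evi` and the dictionary keys are hypothetical, and the sketch does not attempt to say what the tags mean.

```python
import re

# Assumption: a record starts with a token that ends in [L] or [S].
ID_RE = re.compile(r"^(\S+)\[([LS])\]$")
# Anything inside square brackets is treated as an opaque tag.
TAG_RE = re.compile(r"\[([^\]]*)\]")


def parse_entry(text):
    """Split one evidence item into its name and its bracketed tags."""
    name, _, rest = text.partition("[")
    tags = TAG_RE.findall("[" + rest) if rest else []
    return {"name": name, "tags": tags}


def parse_evi(text):
    """Group whitespace-separated tokens into records (ID plus evidence items)."""
    records = []
    for token in text.split():
        match = ID_RE.match(token)
        if match:
            records.append(
                {"region": match.group(1), "form": match.group(2), "entries": []}
            )
        elif records:
            # Evidence lists end with a trailing comma, so skip empty items.
            records[-1]["entries"].extend(
                parse_entry(item) for item in token.split(",") if item
            )
    return records


SAMPLE = (
    "AA-AA-1-101456184-101456187.3[L] "
    "CA-51611-33697-36066-36170-38175[SKIP][45/4][DNT], "
    "AA-AA-1-101456184-101456187.3[S] "
    "CA-51611-33697-36066-36170-38178[SKIP][7/2][DNT],"
)

for record in parse_evi(SAMPLE):
    print(record["region"], record["form"], record["entries"])
```

Keeping the tags as plain strings means nothing is lost while their meaning is still unknown; they can be mapped to something more specific once the format is confirmed.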

And then there are cases that have only one exon-skipping event and only one exon, e.g.:

AA-AA-1-101456184-101456187.3[L] CA-51611-33697-36066-36170-38175[SKIP][45/4][DNT],
AA-AA-1-101456184-101456187.3[S] CA-51611-33697-36066-36170-38178[SKIP][7/2][DNT],

I would very much appreciate it if anyone could explain this or point me to a publication or resource with further details, and suggest the best way to get a list of genes that show only one exon-skipping event.
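One possible starting point for such a list, sketched in Python under assumptions that are not confirmed by any TXdb documentation: that every CA entry has the form CA-&lt;key&gt;-&lt;coordinates&gt;, that the first numeric field after CA identifies the gene, and that two CA entries with different coordinate strings are different events (so the same event tagged [INC] on one line and [SKIP] on another is counted once). The function name `single_ca_keys` is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: CA-<key>-<coords> followed by an [INC] or [SKIP] tag.
CA_RE = re.compile(r"\bCA-(\d+)-([\d-]+)\[(?:INC|SKIP)\]")


def single_ca_keys(text):
    """Return the keys that have exactly one distinct CA coordinate string."""
    events = defaultdict(set)
    for key, coords in CA_RE.findall(text):
        events[key].add(coords)
    return sorted(key for key, coord_set in events.items() if len(coord_set) == 1)


SAMPLE = (
    "AA-AA-1-100634150-100634190.0[S] "
    "CA-127495-3081-7845-7885-12621[SKIP][27/4][UPT],uc001dsx.1, "
    "AA-AA-1-101456184-101456187.3[L] "
    "CA-51611-33697-36066-36170-38175[SKIP][45/4][DNT], "
    "AA-AA-1-101456184-101456187.3[S] "
    "CA-51611-33697-36066-36170-38178[SKIP][7/2][DNT],"
)

print(single_ca_keys(SAMPLE))
```

On this sample, key 51611 is excluded because it carries two CA entries whose last coordinate differs (38175 and 38178). Whether those should count as one event or two depends on what the coordinates actually encode, so the grouping rule would need adjusting once the format is understood.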


I am not familiar with TXdb, but if you explain to me how you want to parse that big text file, I can write the script for you.

written 6.5 years ago by Biomonika (Noolean)

Scripting is not a problem; it's just about understanding the TXdb format a little better. I have looked it up on PubMed but can't seem to find any information about it. There is an R package for TXdb, and I would really appreciate help from someone who has already worked with TXdb or the R package for it.

written 6.5 years ago by skm770