Question: Parser for converting UCSC tables into GFF3 format
ChIP (Netherlands) wrote 5.4 years ago:

Hi!

I have the sample data as shown below from UCSC table browser:

#bin    name    chrom    strand    txStart    txEnd    cdsStart    cdsEnd    exonCount    exonStarts    exonEnds    score    name2    cdsStartStat    cdsEndStat    exonFrames
0    NM_032291    chr1    +    66999824    67210768    67000041    67208778    25    66999824,67091529,67098752,67101626,67105459,67108492,67109226,67126195,67133212,67136677,67137626,67138963,67142686,67145360,67147551,67154830,67155872,67161116,67184976,67194946,67199430,67205017,67206340,67206954,67208755,    67000051,67091593,67098777,67101698,67105516,67108547,67109402,67126207,67133224,67136702,67137678,67139049,67142779,67145435,67148052,67154958,67155999,67161176,67185088,67195102,67199563,67205220,67206405,67207119,67210768,    0    SGIP1    cmpl    cmpl    0,1,2,0,0,0,1,0,0,0,1,2,1,1,1,1,0,1,1,2,2,0,2,1,1,
1    NM_032785    chr1    -    48998526    50489626    48999844    50489468    14    48998526,49000561,49005313,49052675,49056504,49100164,49119008,49128823,49332862,49511255,49711441,50162984,50317067,50489434,    48999965,49000588,49005410,49052838,49056657,49100276,49119123,49128913,49332902,49511472,49711536,50163109,50317190,50489626,    0    AGBL4    cmpl    cmpl    2,2,1,0,0,2,1,1,0,2,0,1,1,0,
1    NM_018090    chr1    +    16767166    16786584    16767256    16785385    8    16767166,16770126,16774364,16774554,16775587,16778332,16782312,16785336,    16767348,16770227,16774469,16774636,16775696,16778510,16782388,16786584,    0    NECAP2    cmpl    cmpl    0,2,1,1,2,0,1,2,
1    NM_052998    chr1    +    33546713    33585995    33547850    33585783    12    33546713,33546988,33547201,33547778,33549554,33557650,33558882,33560148,33562307,33563667,33583502,33585644,    33546895,33547109,33547413,33547955,33549728,33557823,33559017,33560314,33562470,33563780,33583717,33585995,    0    ADC    cmpl    cmpl    -1,-1,-1,0,0,0,2,2,0,1,0,2,
1    NM_001145278    chr1    +    16767166    16786584    16767256    16785385    8    16767166,16770126,16774364,16774554,16775587,16778332,16782312,16785336,    16767270,16770227,16774469,16774636,16775696,16778510,16782388,16786584,    0    NECAP2    cmpl    cmpl    0,2,1,1,2,0,1,2,
1    NM_001145277    chr1    +    16767166    16786584    16767256    16785491    7    16767166,16770126,16774364,16774554,16775587,16778332,16785336,    16767348,16770227,16774469,16774636,16775696,16778510,16786584,    0    NECAP2    cmpl    cmpl    0,2,1,1,2,0,1,
1    NM_001080397    chr1    +    8384389    8404227    8384389    8404073    8    8384389,8385357,8385877,8390268,8395496,8397875,8399552,8403806,    8384786,8385450,8386102,8390996,8395650,8398052,8399758,8404227,    0    SLC45A1    cmpl    cmpl    0,1,1,1,0,1,1,0,
1    NM_013943    chr1    +    25071759    25170815    25072044    25167428    6    25071759,25124232,25140584,25153500,25166350,25167263,    25072116,25124342,25140710,25153607,25166532,25170815,    0    CLIC4    cmpl    cmpl    0,0,2,2,1,0,

I would like to convert this into GFF3 format.

Does anybody know how this can be done, or have a tool that can be used? Something in bash, Python, or Perl?

Thank you


Why don't you just export the UCSC table in GTF format directly?

Comment by Irsan, 5.4 years ago

Because the GTF export lacks a piece of information that I need, namely the gene name. Hence I am first taking the table in genePred format :)

Reply by ChIP, 5.4 years ago
Sean Davis (National Institutes of Health, Bethesda, MD) wrote 5.4 years ago:

See here for some ideas:

http://genomewiki.ucsc.edu/index.php/Genes_in_gtf_or_gff_format
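
For anyone who would rather script the conversion directly, here is a minimal Python sketch. It is not the method from the linked wiki page, just one way it could be done under these assumptions: the input is the genePred extended layout shown in the question (16 whitespace-separated columns, including the leading bin column); each row becomes one mRNA feature with exon and CDS children; the gene name (name2) goes into the Name attribute. The function names `genepred_to_gff3` and `convert` and the source label "UCSC" are made up for illustration. The frame-to-phase conversion `(3 - frame) % 3` is my reading of the two formats, so check it against your data. Accessions that map to more than one locus would produce duplicate IDs, which this sketch does not handle.

```python
def genepred_to_gff3(line, source="UCSC"):
    """Turn one genePred-extended row (with leading bin column) into GFF3 lines."""
    (_bin, name, chrom, strand, tx_start, tx_end, cds_start, cds_end,
     _count, exon_starts, exon_ends, _score, name2, _stat5, _stat3,
     exon_frames) = line.split()[:16]
    tx_start, tx_end = int(tx_start), int(tx_end)
    cds_start, cds_end = int(cds_start), int(cds_end)
    # The comma-separated lists end with a trailing comma, so strip it first.
    starts = [int(x) for x in exon_starts.strip(",").split(",")]
    ends = [int(x) for x in exon_ends.strip(",").split(",")]
    frames = [int(x) for x in exon_frames.strip(",").split(",")]

    def row(ftype, start0, end, phase, attrs):
        # UCSC starts are 0-based; GFF3 starts are 1-based, ends are unchanged.
        return "\t".join([chrom, source, ftype, str(start0 + 1), str(end),
                          ".", strand, phase, attrs])

    out = [row("mRNA", tx_start, tx_end, ".", f"ID={name};Name={name2}")]
    for start, end, frame in zip(starts, ends, frames):
        out.append(row("exon", start, end, ".", f"Parent={name}"))
        # Clip the exon to the coding region to get its CDS part, if any.
        c_start, c_end = max(start, cds_start), min(end, cds_end)
        if c_start < c_end:
            # Assumed conversion from genePred frame to GFF3 phase.
            phase = str((3 - frame) % 3) if frame >= 0 else "."
            out.append(row("CDS", c_start, c_end, phase, f"Parent={name}"))
    return out


def convert(lines, source="UCSC"):
    """Yield a GFF3 header, then features for every non-comment input row."""
    yield "##gff-version 3"
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield from genepred_to_gff3(line, source)


# Demo on the CLIC4 row from the question.
sample = ("1 NM_013943 chr1 + 25071759 25170815 25072044 25167428 6 "
          "25071759,25124232,25140584,25153500,25166350,25167263, "
          "25072116,25124342,25140710,25153607,25166532,25170815, "
          "0 CLIC4 cmpl cmpl 0,0,2,2,1,0,")
for gff_line in convert([sample]):
    print(gff_line)
```

Passing an open file handle to `convert` instead of the one-element list should stream a whole table export, since the `#bin ...` header line is skipped as a comment.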
