Converting GTF to BED
8.4 years ago
zizigolu ★ 4.3k

Hi,

Based on a previous Biostars post about converting GTF to BED, I converted my GTF file to BED as follows:

gtf2bed < merged.gtf > merged.bed

I also tried

gtf2bed < merged.gtf > unsorted-foo.gtf.bed

But the resulting file cannot be read in Kate or any other editor. Do you know why?

bedops gtf bed • 5.1k views

I assume you are getting something in the converted file since you say it is unreadable? Can you post a few lines from the top of the converted file?
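One way to check this from the terminal, rather than through an editor, is sketched below. The helper name `inspect_bed` is made up for illustration; it only relies on standard tools (`wc`, `head`) and reports whether the file is missing or empty before printing its first lines.

```shell
# Hypothetical helper: report whether a file has content and show its top lines.
inspect_bed() {
    if [ -s "$1" ]; then
        # -s is true only when the file exists and is larger than 0 bytes
        echo "size in bytes: $(wc -c < "$1")"
        head -n 5 "$1"
    else
        echo "$1 is missing or empty"
    fi
}

inspect_bed merged.bed
```

If this reports an empty file, the problem is in the conversion step rather than in the editor.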


But I can't open the converted file. When I tried to open it with Kate, the file was empty.


Did you look at merged.gtf to make sure that looked reasonable since the conversion is not working?
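A quick sanity check along these lines, assuming the file is tab-delimited as the GTF format expects, is to count how many fields each non-comment line has. The function name `check_gtf` is just an illustrative label; every line of a well-formed GTF should report 9 fields.

```shell
# Hypothetical helper: tally tab-separated field counts across non-comment lines.
check_gtf() {
    awk -F '\t' '
        !/^#/ { n[NF]++ }                       # count lines per number of fields
        END   { for (k in n) print k, n[k] }    # print "<fields> <lines>"
    ' "$1"
}

# Only run when the file from the question is actually present.
if [ -f merged.gtf ]; then
    check_gtf merged.gtf
fi
```

Any count other than 9 in the first column of the output would suggest the file is not tab-delimited or is damaged, which could explain a failed conversion.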


Thank you.

This is the first line of merged.gtf:

1    Cufflinks    exon    3631    3913    .    +    .    gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "NAC001"; oId "AT1G01010.1"; nearest_ref "AT1G01010.1"; class_code "="; tss_id

If you just need a minimal BED file, you could do the following. The field separator is a space in the command below; replace it with a tab (e.g. -F "\t") if that is what is in your file. BED start coordinates are 0-based while GTF start coordinates are 1-based, hence the $4-1.

$ awk -F " " 'BEGIN {OFS="\t"}{print $1,$4-1,$5}' merged.gtf > merged.bed
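If the gene name and strand are also needed downstream, a slightly longer variant could emit six BED columns. This is only a sketch under some assumptions: the file is tab-delimited, only `exon` lines are wanted, `gene_name` appears in column 9 in the form shown in this thread, and the score is set to a placeholder 0. The function name `gtf_to_bed6` is made up for illustration.

```shell
# Hypothetical helper: convert GTF exon lines to 6-column BED.
gtf_to_bed6() {
    awk -F '\t' 'BEGIN { OFS = "\t" }
    /^#/ { next }                          # skip comment lines
    $3 == "exon" {
        name = "."                         # fallback when no gene_name is present
        if (match($9, /gene_name "[^"]+"/)) {
            # strip the 11-character prefix (gene_name + space + quote) and the closing quote
            name = substr($9, RSTART + 11, RLENGTH - 12)
        }
        # BED start is 0-based, so subtract 1 from the GTF start; the end stays as is
        print $1, $4 - 1, $5, name, 0, $7
    }' "$1"
}

# Only run when the file from the question is actually present.
if [ -f merged.gtf ]; then
    gtf_to_bed6 merged.gtf > merged.bed
fi
```

For an exon line like the one posted above, this should give chromosome 1, start 3630, end 3913, name NAC001, score 0 and strand +.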

Thank you so much.

At first I was working with prokaryotes, where I used Bowtie as the aligner and downstream tool. This is the first row of the BED file I ended up with:

YAL001C    0    31    SRR1944914.13670510    42    +

But now I am working with eukaryotes, so I used TopHat as the aligner together with the Cufflinks package, and I ended up with a merged.gtf from cuffdiff. Its first row is:

1    Cufflinks    exon    3631    3913    .    +    .    gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "NAC001"; oId "AT1G01010.1"; nearest_ref "AT1G01010.1"; class_code "="; tss_id

The main purpose of this analysis is to find out how many reads map to each gene and at which positions. As I understand it, gene_name in merged.gtf repeats once for each mapped read, and columns 4 and 5 show the positions where those reads map. In any case, I am using the biophysconnector package, which needs a BED file, so I thought I would convert my GTF file to BED.
