Question: Gff2Bed.Py Stops
7.8 years ago, GPR wrote:

Hello,

I am trying to convert a genes.gtf file into a BED file. For this I am using the BEDOPS script gff2bed. My command line is:

./gff2bed genes.gtf | ./sort-bed -> genes.bed

It does start, but right afterwards I get a "stopped" message. The same happens if I execute only:

./gff2bed genes.gtf > genes.bed

I have also converted a VCF file into a BED file with the command line:

./ < input.vcf | ./sort-bed -> vcf.bed

This works fine, but when I run:

/ < input.vcf -> vcf.bed

I get the "stopped" message again. Does anyone know what's wrong with this?

Thanks, G.

7.8 years ago, Alex Reynolds (Seattle, WA, USA) wrote:

The BEDOPS gff2bed script converts GFF3 files, not GTF files. I haven't verified that gff2bed can work on generic GTF files. While the two formats are related, they do differ and you should use (for example) Bill Noble's gtf2bed conversion script to convert GTF files.
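To make the GFF3/GTF difference concrete, here is a rough sketch that guesses the format from the attribute column (column 9). GFF3 attributes are written as `key=value` pairs, while GTF attributes are written as `key "value";` pairs. The function name `guess_annotation_format` is made up for this illustration and is not part of BEDOPS; it only looks at the first feature line, so treat the result as a hint rather than a validation.

```shell
# Hypothetical helper (not a BEDOPS tool): guess whether a tab-delimited
# annotation file looks like GFF3 or GTF from column 9 of its first feature line.
guess_annotation_format() {
    awk -F '\t' '
        /^#/ || NF < 9 { next }   # skip comment/pragma lines and short lines
        # GFF3 attributes look like: ID=gene1;Name=foo
        $9 ~ /^[A-Za-z_][A-Za-z0-9_.-]*=/ { print "gff3"; found = 1; exit }
        # GTF attributes look like: gene_id "g1"; transcript_id "t1";
        $9 ~ /^[A-Za-z_][A-Za-z0-9_]* "/  { print "gtf";  found = 1; exit }
        { print "unknown"; found = 1; exit }
        END { if (!found) print "unknown" }
    ' "$1"
}
```

Running `guess_annotation_format genes.gtf` before picking a converter is a cheap way to avoid feeding a GTF file to a GFF3-only script.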

Are you running < input.vcf -> vcf.bed? That hyphen would be a typo, probably — I'm not sure what that would do. You would want to run: < input.vcf | sort-bed - > vcf.bed to ensure the BED output is sorted.

If you really want to skip sorting (let's say that your VCF elements are guaranteed to be lexicographically-sorted), then use < input.vcf > vcf.bed.
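For what it's worth, the shell reads `cmd -> out` as `cmd - > out`: the hyphen becomes an ordinary argument and `>` is the redirection operator. That is why `sort-bed -> genes.bed` behaves like `sort-bed - > genes.bed`. The small demo below shows this; `show_args` is a throwaway function written for this illustration. Whether a stray `-` argument bothers a given conversion script depends on that script.

```shell
# Throwaway helper: print how many arguments were received and what they were.
show_args() {
    printf 'argc=%d' "$#"
    for a in "$@"; do printf ' [%s]' "$a"; done
    printf '\n'
}

out=$(mktemp)
show_args -> "$out"   # tokenized exactly like: show_args - > "$out"
cat "$out"            # prints: argc=1 [-]
rm -f "$out"
```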

Note that it is not guaranteed that BEDOPS tools can provide a correct answer on unsorted BED inputs, so it might be better to use the BEDOPS sort-bed application, just to be safe and prevent GIGO errors. Or add the --ec option when using bedmap or other BEDOPS applications, which enables error-checking (though it is usually much faster to just use sort-bed to pre-sort the input — you only have to sort once, as BEDOPS apps read in and write out sorted data).
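If you want a quick check before deciding whether to re-sort, something like the following sketch can help. It uses the standard `sort -c` check with byte-wise chromosome order followed by numeric start and end, which is my approximation of lexicographic BED ordering and not a BEDOPS command. The name `bed_is_sorted` is hypothetical. Lines that tie on all three keys are compared as whole lines, so the check can be stricter than necessary.

```shell
# Hypothetical helper: exit 0 only if the tab-delimited BED file is ordered by
# chromosome (byte-wise), then numerically by start, then numerically by end.
bed_is_sorted() {
    LC_ALL=C sort -c -t "$(printf '\t')" -k1,1 -k2,2n -k3,3n "$1" 2>/dev/null
}
```

A typical use would be to run `bed_is_sorted genes.bed` and, if it fails, run the file through sort-bed once and keep the sorted copy.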

If you still get errors, feel free to post a transcript of what you're doing on the BEDOPS forum, along with links to your GTF and/or VCF inputs.

Also consider running your data through validation scripts — we've uncovered problems in the past with labs releasing data that do not follow specification, which requires extra pre-processing steps to clean up input before it is consumed by our scripts and applications. We've also uncovered problems with our assumptions about specifications and how that translates to a script or application. Biology is messy, but bioinformatics is messier.

Modified 10 months ago by RamRS • written 7.8 years ago by Alex Reynolds

OK, got it! The trick was to download a RefFlat-genes.bed file from UCSC, sort it with sort-bed, and use the sorted file in bedmap. Thanks so much for your help! G.

Reply written 7.8 years ago by GPR

Since answering this question, we have added a GTF-to-BED conversion script to the BEDOPS suite:

Reply written 7.7 years ago by Alex Reynolds