Making bed files from fasta
6.2 years ago
peri.tobias ▴ 10

I am trying to run a Biopython script from the Windows command prompt to make a BED file from a draft genome downloaded from NCBI, and I get the following error. The FASTA headers appear to be fine, and I have used this script successfully many times before. Can someone see what the error is, please?

C:\Python34>python.exe test.fasta make_bed_from_fasta.py >test.bed
File "test.fasta", line 1
>KQ503367.1
^
SyntaxError: invalid syntax


The top lines of the FASTA look like this. PS: I am not sure why the post was edited to remove the ">", but it is present in the header, as in ">KQ503367.1", with the sequence on the following lines.

>KQ503367.1
TGGAAAATTTGGTttgtaattctttttctaaaaaaaacttattttggGGTGTATGATGTGGGTTATTTGGGAGGGGTGAG
AAAAAGTGTGAAACAAATGGTTGAAGggtttttggaagttttttttccaaatacaggttttttgtttcattttaatttaa
aatgggcCTGGGGAAacccttacatgtttttaccaaattggTTAGGTGGGTTTACCAAAGCCCTAAATTGATTAGAACTt

6.2 years ago
peri.tobias ▴ 10

My apologies, my biopython script was corrupted. The fasta is okay.

6.2 years ago

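I think you've already figured out that you need to pass the Python script as the first argument instead of the FASTA file, i.e. python.exe make_bed_from_fasta.py test.fasta > test.bed. As posted, Python tried to execute test.fasta as source code, which is why the SyntaxError points at the header line.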
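For anyone landing here without a working script: the original make_bed_from_fasta.py was never posted, so the following is only a minimal sketch of what such a script might look like. It uses plain Python rather than Biopython, and the function names fasta_to_bed and write_bed are made up for illustration. It writes one BED line per FASTA record covering the whole sequence.

```python
import sys


def fasta_to_bed(lines):
    """Yield (name, start, end) for each record in an iterable of FASTA lines.

    Coordinates are 0-based, half-open, so a whole record is (name, 0, length).
    """
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, 0, length
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, 0, length


def write_bed(fasta_path, out=sys.stdout):
    """Print tab-separated BED lines for every record in the FASTA file."""
    with open(fasta_path) as handle:
        for name, start, end in fasta_to_bed(handle):
            print(name, start, end, sep="\t", file=out)


if __name__ == "__main__":
    # sys.argv[0] is the script itself; the FASTA path comes after it.
    if len(sys.argv) == 2:
        write_bed(sys.argv[1])
    else:
        print("usage: python make_bed_from_fasta.py input.fasta > output.bed",
              file=sys.stderr)
```

Saved as make_bed_from_fasta.py, it would be run with the script first and the FASTA second: python.exe make_bed_from_fasta.py test.fasta > test.bed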

Thanks Matt. Yes I should have looked a bit closer before posting.

6.0 years ago

I guess I'll also mention that you could use pyfaidx for this as well:

$ pip install pyfaidx
$ faidx --transform bed test.fasta > test.bed

I want to point out that this feature didn't work as intended until pyfaidx v0.5.2, after someone pointed out that the coordinates weren't 0-based half-open as expected. This has now been fixed: https://github.com/mdshw5/pyfaidx/releases/tag/v0.5.2
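In case "0-based half-open" is unfamiliar, here is a small sketch of the convention. The helper name to_bed_interval is hypothetical and is not part of pyfaidx; it only shows how a 1-based, inclusive interval maps onto BED coordinates.

```python
def to_bed_interval(start_1based, end_1based):
    """Convert a 1-based, inclusive interval to BED's 0-based, half-open form."""
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    # Only the start shifts; the end stays the same because it becomes exclusive.
    return start_1based - 1, end_1based


# A hypothetical 240 bp sequence covers bases 1..240 in 1-based terms.
start, end = to_bed_interval(1, 240)
print(start, end)   # 0 240
print(end - start)  # 240, the interval length
```

One consequence of this convention is that a whole-sequence BED line always starts at 0 and ends at the sequence length, and end minus start gives the length directly.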