Question: Bedtools getfasta outputting a blank file
Asked 5.9 years ago by mshumph2 (United States):


I'm using bedtools getfasta to extract a set of sequences from chromosome 1. My input FASTA file is "chr1.fa" (from the UCSC Genome Browser), and I have a BED file with chromosome, start, stop, and name columns. I use the -name option because I'd like to organize the sequences by name. My command looks like this:

bedtools getfasta -fi chr1.fa -bed bedfile.bed -fo testing.fa.out -name

The problem is this: when I run the command I don't get any errors; it just writes an empty file with whatever name I gave it (in this case testing.fa.out).

It may come down to how the BED file was made. I was given an Excel spreadsheet with coordinates on it and simply saved the file in tab-delimited text format. I copied the three relevant columns (chrom, start, and stop) into a new spreadsheet and saved that as a tab-delimited text file. Then I gave each column a name. In the tab-delimited text file the "tabbing" looks different for the first 100 or so lines: the distance between columns is shorter, and further down the gaps between columns become wider. If this is the problem, how can I fix it? I'm on a Mac, if that's relevant.


sequence • 3.2k views
modified 11 months ago by ann-katrin.llarena (0) • written 5.9 years ago by mshumph2 (0)

Can you run head on your BED file and show us what it looks like?

written 5.9 years ago by komal.rathi (3.6k)

I'm not sure what a head is, but this is the format it's in. As you can see the format changes a few coordinates down. Also, copying and pasting changes the spacing between the columns.


chr1    9885764    9885814    chr1:9885764-9885814
chr1    9903769    9903819    chr1:9903769-9903819
chr1    9903769    9903819    chr1:9903769-9903819
chr1    10040879    10040929    chr1:10040879-10040929
chr1    10040879    10040929    chr1:10040879-10040929
chr1    10105721    10105771    chr1:10105721-10105771
chr1    10105721    10105771    chr1:10105721-10105771
chr1    10105721    10105771    chr1:10105721-10105771
chr1    10105721    10105771    chr1:10105721-10105771
chr1    10511188    10511238    chr1:10511188-10511238
chr1    10511188    10511238    chr1:10511188-10511238
chr1    10511188    10511238    chr1:10511188-10511238

written 5.9 years ago by mshumph2 (0)

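One issue I can see immediately is that your "start" column looks off by one. BED coordinates are 0-based and half-open, [start, end): the start is counted from 0 and the end is exclusive, which makes the end value the same as a 1-based, inclusive end. For example, the first 100 bases of chr1 would be: chr1 0 100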
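
To make the off-by-one concrete, here is a minimal Python sketch of the conversion. It assumes the spreadsheet coordinates are 1-based and inclusive, which the thread does not confirm, and the function name `to_bed` is hypothetical.

```python
def to_bed(start_1based, end_1based):
    """Convert a 1-based, inclusive interval to BED's 0-based, half-open form."""
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    # Only the start shifts: a 1-based inclusive end has the same value
    # as a 0-based exclusive end.
    return start_1based - 1, end_1based


# Positions 1..100 of a chromosome become "0 100" in BED.
bed_start, bed_end = to_bed(1, 100)
print(f"chr1\t{bed_start}\t{bed_end}")
```

The interval length is always end minus start in BED, so a 50-base region should have a start and end that differ by exactly 50.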


written 5.9 years ago by Matt Shirley (9.4k)

Don't worry about the TAB character representation. The display of TAB characters will not seem consistent, but the important thing is that there is not a mixture of TAB and SPACE.
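
One way to check for a TAB/SPACE mixture without relying on how an editor draws the columns is to split each line on the TAB character and look at what comes out. The sketch below is only an illustration: the function name `find_delimiter_problems` and the sample lines are made up for this example.

```python
def find_delimiter_problems(lines, min_fields=3):
    """Return (line_number, reason) pairs for lines that are not cleanly tab-delimited."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text:
            continue  # ignore blank lines
        fields = text.split("\t")
        if len(fields) < min_fields:
            problems.append((number, "fewer than %d tab-separated fields" % min_fields))
        elif any(" " in field for field in fields[:3]):
            problems.append((number, "space character inside chrom/start/end"))
    return problems


sample = [
    "chr1\t9885764\t9885814\tchr1:9885764-9885814\n",  # clean, tab-delimited
    "chr1    9903769    9903819\n",                      # spaces instead of tabs
]
print(find_delimiter_problems(sample))
```

An empty result means every non-blank line has at least three TAB-separated fields with no stray spaces in the first three.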

written 5.9 years ago by Matt Shirley (9.4k)

Also, "head" is a program on Unix systems that displays the first n lines of a file.
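
If the command line is unfamiliar, roughly the same peek at a file can be done in Python. This is a small sketch rather than a replacement for head: the function name `head_repr` is hypothetical, and it prints each line with `repr()` so that TABs and line-break characters are visible instead of being rendered as whitespace.

```python
import os
import tempfile
from itertools import islice


def head_repr(path, n=5):
    """Return repr() of the first n lines of a file so hidden characters are visible."""
    # newline="" keeps \r and \r\n untranslated, so they show up in repr().
    with open(path, newline="") as handle:
        return [repr(line) for line in islice(handle, n)]


# Demo on a throwaway file that uses carriage returns as line breaks.
with tempfile.NamedTemporaryFile("wb", suffix=".bed", delete=False) as tmp:
    tmp.write(b"chr1\t0\t100\rchr1\t100\t200\r")
demo_path = tmp.name
for shown in head_repr(demo_path, n=2):
    print(shown)  # each line ends in \r here rather than \n
os.remove(demo_path)
```

On a real file, call `head_repr("bedfile.bed")` with the path to your own BED file.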

written 5.9 years ago by Matt Shirley (9.4k)

Hi, I can see that this post is very old, but I have the exact same issue, even down to making the file in Excel on a Mac. Did you figure out a solution for this?

written 11 months ago by ann-katrin.llarena (0)

Dear ann-katrin, as there's no solution and OP hasn't been active ever since, you'll be better off creating a new question with your detailed problem. In case this thread has the exact same problem, you can reference it.

To offer at least some help: Mac, Windows and Unix have used different characters to encode a line break. Classic Mac OS used a carriage return (\r), Windows uses a carriage return followed by a newline (\r\n), and Unix, including modern macOS, uses a newline (\n). Excel often follows an older platform convention when saving text files, and older versions of Excel for Mac are known to write tab-delimited text with \r line breaks. Many Unix tools expect Unix line breaks, and if they get something different they fail with what seem to be bizarre warnings or results. To the software it can look as if the entire input is a single line.
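
The line-ending problem described above can be checked and repaired with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, and it treats every carriage return in the file as a line break, which is reasonable for a BED file but not for arbitrary binary data.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), classic Mac (CR only), and Unix (LF only) line breaks."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,
        "lf": data.count(b"\n") - crlf,
    }


def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and lone CR line breaks to Unix LF."""
    # Replace CRLF first so it does not turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


mac_style = b"chr1\t0\t100\rchr1\t100\t200\r"
print(count_line_endings(mac_style))
print(normalize_newlines(mac_style))
```

To fix a real file, read it with `open(path, "rb")`, pass the bytes through `normalize_newlines`, and write the result to a new file opened with `"wb"`, keeping the original until the output has been checked.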

modified 11 months ago • written 11 months ago by Carambakaracho (2.2k)

Answer, 5.9 years ago by Matt Shirley (9.4k), Cambridge, MA:

I'm not quite sure what your issue might be, but you can also do this using the "--bed" option of the "faidx" utility included in the pyfaidx module. 


pip install pyfaidx OR easy_install pyfaidx


faidx chr1.fa -b bedfile.bed > regions.fa

As the author of pyfaidx I can tell you that I've tried to add helpful error messages that might give you a hint about any issues with your files.

modified 5.9 years ago • written 5.9 years ago by Matt Shirley (9.4k)

Thanks for the help!

I downloaded pyfaidx and now I'm getting text, but only one line. So it looks something like:



And then it stops after that first line. Any suggestions?

written 5.9 years ago by mshumph2 (0)