Question: Error In Bedtools Getfasta: Chromosome Not Found
7.7 years ago by
Pat Baldrich10
Spain
Pat Baldrich10 wrote:

Hi,

I am trying to use BEDtools to extract sequences for a set of genomic coordinates, but I get the error "WARNING. chromosome (chr12) was not found in the FASTA file. Skipping." for each read in my BED file. Below are some details about what I am doing.

I just downloaded the latest version of BEDtools (I think), bedtools-2.17.0.

I have 2 different files (both much longer than the small parts shown here):

A fasta file with all the sequences of chromosomes:

>chr01
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN

A BED file with my genomic coordinates (already sorted):

chr01 187814 190840
chr01 307073 310104
chr01 701047 704068
chr01 702941 705962
chr01 702952 705972
chr01 867716 870740
chr01 914064 917087
chr01 991080 994104
chr01 1039795 1042815
chr01 1058713 1061736

And then I run this command line:

bedtools getfasta -fi all.con -bed 1-13sorted2.bed -fo NewCandidates/Genomiccoordinates/1-13_1500.fa

The only thing I get is "WARNING. chromosome (chr01) was not found in the FASTA file. Skipping.", thousands of times...

If someone can help me and tell me what I am doing wrong, I would be very grateful.

Thank you all of you in advance.

bedtools error • 7.4k views
modified 4.1 years ago by Prakki Rama2.4k • written 7.7 years ago by Pat Baldrich10

Well, if you just need FASTA files for each chromosome (hg19), you can download them from the UCSC Genome Browser: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/

written 7.7 years ago by Gjain5.5k

Your paste above is not in FASTA format. Did the editor eat your ">"? It should look like:

>chr01
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNn
ACTGCGCACTGA

etc.

written 7.7 years ago by brentp23k

Yes, this happens because ">" is formatted as a blockquote UNLESS you indent lines with 4 spaces. Please note that questions are auto-previewed as you type, to help avoid this kind of problem. Fixed it for you.

modified 7.7 years ago • written 7.7 years ago by Neilfws49k

I don't get it: is it chr01 or chr12, or both (all) of them?
Instead of using all.con, what about trying a FASTA file with only one chromosome in it and a BED file with only that chromosome's coordinates?

modified 7.7 years ago • written 7.7 years ago by PoGibas4.8k
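
For anyone who wants to run the single-chromosome test without splitting files by hand, here is a minimal Python sketch. It assumes the sequence name is the first word after ">" on a FASTA header line and that the chromosome is the first whitespace-separated field of each BED line. The function names subset_bed and subset_fasta are hypothetical helpers, not part of bedtools.

```python
def subset_bed(bed_path, chrom, out_path):
    """Copy only the BED records whose first field equals `chrom`.

    Returns the number of records written.
    """
    kept = 0
    with open(bed_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.split()  # lenient: splits on tabs or spaces
            if fields and fields[0] == chrom:
                dst.write(line)
                kept += 1
    return kept


def subset_fasta(fasta_path, chrom, out_path):
    """Copy only the FASTA record named `chrom` (first word after '>').

    Returns True if such a record was found.
    """
    copying = False
    found = False
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                words = line[1:].split()
                copying = bool(words) and words[0] == chrom
                found = found or copying
            if copying:
                dst.write(line)
    return found
```

For example, subset_fasta("all.con", "chr01", "chr01.fa") and subset_bed("1-13sorted2.bed", "chr01", "chr01.bed") would produce the two small files used in the test. If subset_fasta returns False, the name is not present in the FASTA in the form you expect, which is already informative.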

Thanks for your comment, but then I would need to split all my files, and I have 24 different libraries times 12 chromosomes... If this is the only solution, I don't think it is the best one for me...

written 7.7 years ago by Pat Baldrich10

Just to test if it's working:
-fi chr01.fa -bed chr01.bed

modified 7.7 years ago • written 7.7 years ago by PoGibas4.8k

Hi, I tried, and I get exactly the same error: "WARNING. chromosome (chr01) was not found in the FASTA file. Skipping." Thanks again!

My command line, in case it helps:

bedtools getfasta -fi Chr1.con -bed NewCandidates/Genomiccoordinates/1-13chr01.bed -fo NewCandidates/Genomiccoordinates/1-131500.fa

modified 7.7 years ago • written 7.7 years ago by Pat Baldrich10
7.7 years ago by
Istvan Albert ♦♦ 85k
University Park, USA
Istvan Albert ♦♦ 85k wrote:

Your chromosome names do not match. Make sure the BED file has identically named chromosomes. Yours seem to be zero-padded; I bet it is yeast.

For some recent ideas on the subject, read this from the author of BedTools: What is in a (chromosome) name

modified 7.7 years ago • written 7.7 years ago by Istvan Albert ♦♦ 85k
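
One way to check whether the names really match is to compare them programmatically rather than by eye. The sketch below is only an illustration under two assumptions: the FASTA sequence name is everything after ">" up to the first space or tab, and the BED file is tab-delimited with the chromosome in column 1. The function names are hypothetical. Files are opened with newline="" so that a stray carriage return stays visible in the result.

```python
import re


def fasta_names(fasta_path):
    """Set of sequence names: text after '>' up to the first space or tab."""
    names = set()
    with open(fasta_path, newline="") as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].rstrip("\n")
                names.add(re.split(r"[ \t]", header, maxsplit=1)[0])
    return names


def bed_names(bed_path):
    """Set of values found in the first tab-separated column of a BED file."""
    names = set()
    with open(bed_path, newline="") as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and the usual comment/track header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            names.add(line.split("\t")[0])
    return names


def missing_from_fasta(fasta_path, bed_path):
    """Sorted BED chromosome names that have no matching FASTA record."""
    return sorted(bed_names(bed_path) - fasta_names(fasta_path))
```

Printing the result with repr(), for example print(repr(missing_from_fasta("all.con", "1-13sorted2.bed"))), shows exactly which strings differ. A header written as "> chr01" yields an empty name here, and a space-delimited BED line shows up as one long "name", so both kinds of mismatch become visible.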

It is worth noting that if the chromosome names in the FASTA and BED files don't match, and getfasta reports that there is no index file and creates one, you have to delete the created index (the .fai file next to the FASTA file) before running the procedure again on the corrected files. Otherwise the problem will persist.

modified 4.2 years ago • written 4.2 years ago by boczniak767690
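
The stale-index check can be automated. The following Python sketch assumes the index sits next to the FASTA file as <fasta>.fai and follows the faidx layout, with the sequence name in the first tab-separated column. It also assumes the name is the first word of the FASTA header, which may not hold for every bedtools version. The function names are hypothetical.

```python
import os


def index_names(fai_path):
    """Sequence names recorded in the index, in file order."""
    names = []
    with open(fai_path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                names.append(line.split("\t")[0])
    return names


def header_names(fasta_path):
    """Sequence names in the FASTA (first word after '>'), in file order."""
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                words = line[1:].split()
                names.append(words[0] if words else "")
    return names


def remove_stale_index(fasta_path):
    """Delete <fasta_path>.fai if it looks out of date; return True if deleted."""
    fai_path = fasta_path + ".fai"
    if not os.path.exists(fai_path):
        return False
    stale = (
        # Index older than the FASTA it describes ...
        os.path.getmtime(fai_path) < os.path.getmtime(fasta_path)
        # ... or listing different sequence names.
        or index_names(fai_path) != header_names(fasta_path)
    )
    if stale:
        os.remove(fai_path)
    return stale
```

Deleting an index that was actually fine costs little, since getfasta builds a new one on the next run.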

Istvan, that link is very nice. Thanks.

written 7.7 years ago by Gjain5.5k

Hi,

Thank you all for your answers. I am not working with the human genome but with the rice genome, so this makes things more complicated.

What do you mean, Istvan, when you say the chromosome names do not match? chr01 is in both files...

Thank you again!!

written 7.7 years ago by Pat Baldrich10
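
Names that look identical on screen can still differ by invisible characters. Nothing in this thread confirms that this is the cause here, so treat the following as a hedged diagnostic sketch: it flags BED lines that are not tab-delimited, that end in a carriage return (Windows line endings), or that have whitespace around the chromosome name. The function name bed_problems and the limit parameter are hypothetical.

```python
def bed_problems(bed_path, limit=10):
    """Return up to `limit` (line_number, message) pairs for suspicious BED lines."""
    problems = []
    # newline="" keeps any "\r" in the data instead of silently translating it.
    with open(bed_path, newline="") as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines
            if line.endswith("\r"):
                problems.append((number, "carriage return at end of line"))
            fields = line.rstrip("\r").split("\t")
            if len(fields) < 3:
                problems.append(
                    (number, "fewer than 3 tab-separated columns: %r" % line)
                )
            elif fields[0] != fields[0].strip():
                problems.append(
                    (number, "whitespace around chromosome name: %r" % fields[0])
                )
            if len(problems) >= limit:
                break
    return problems[:limit]
```

An empty list from bed_problems("1-13sorted2.bed") rules these particular issues out for the BED file; it says nothing about the FASTA file or its index.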
4.1 years ago by
Prakki Rama2.4k
Singapore
Prakki Rama2.4k wrote:

I encountered the same error today. Deleting the old index file and rerunning the bedtools command, which automatically regenerated the index for the FASTA file, resolved the issue.

written 4.1 years ago by Prakki Rama2.4k