File not suitable for fasta index generation.
4.3 years ago
ja4123 ▴ 20

Hey, I have a sorted and indexed .bam file, and I am running it through freebayes with this command:

freebayes -f ref.fa -@ in.vcf.gz aln.bam >var.vcf

It failed with this error:

unable to find FASTA index entry for '1'

I changed the headers in the FASTA file from chr1, chr2, ... to 1, 2, 3, ... to match the VCF file, using:

for i in {1..64185939}; do echo $i; done | paste - <(sed '/^>/d' hg38.fa) | sed -e 's/^/>/' -e 's/\t/\n/' > new.fa

Then I ran the freebayes command again and got:

index file index.fai not found, generating...
ERROR: mismatched line lengths at line 39028338 within sequence
File not suitable for fasta index generation.

Could you give me some advice?

fasta index freebayes • 3.2k views
4.3 years ago

The message is clear: within each sequence of your FASTA file, every line except the last must have the same length.

You want:

>seq1
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaa
>seq2
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaa

You have:

>seq1
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaa
aaaaaaaa
>seq2
aaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaa
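
To find the offending line yourself, something like the following might help. It is only a sketch: `check_fasta_widths`, `workdir` and the demo file are made-up names, and it assumes plain Unix line endings. It applies the rule described above: within a record, only the last sequence line may be shorter than the first one, and no line may be longer.

```shell
# Scratch directory and a tiny demo FASTA (hypothetical names) with a bad record.
workdir=$(mktemp -d)
cat > "$workdir/demo_bad.fa" <<'EOF'
>seq1
aaaaaaaa
aaa
aaaaaaaa
>seq2
aaaaaaaa
aaaa
EOF

# Print the first line that breaks the fixed-width rule and return non-zero;
# print nothing and return zero if every record is consistently wrapped.
check_fasta_widths() {
  awk '
    BEGIN { width = -1 }
    /^>/  { width = -1; short = 0; next }   # header: start a new record
    {
      n = length($0)
      if (short) {
        # An earlier line was short, yet the record continues past it.
        printf "line %d is shorter than %d but is not the last line of its record\n", short, width
        bad = 1; exit
      }
      if (width < 0) {
        width = n                           # first sequence line sets the width
      } else if (n > width) {
        printf "line %d is longer than %d\n", NR, width
        bad = 1; exit
      }
      if (n < width) short = NR             # allowed only on the last line
    }
    END { exit bad }
  ' "$1"
}

check_fasta_widths "$workdir/demo_bad.fa" || echo "not suitable for indexing"
```

On a real file you would call `check_fasta_widths new.fa` and look at the reported line number.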

You could use Picard NormalizeFasta (https://broadinstitute.github.io/picard/command-line-overview.html#NormalizeFasta) to rewrap the FASTA so that all lines have the same length.
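
If the only goal was to drop the `chr` prefix, a simpler rename might avoid the problem in the first place, because it edits header lines only and leaves the sequence lines, and therefore their lengths, exactly as they were. A minimal sketch on a made-up demo file (`workdir`, `demo.fa` and `demo.renamed.fa` are hypothetical names, and I am assuming the headers in your reference all start with `>chr`):

```shell
# Scratch directory and a small stand-in for the real reference.
workdir=$(mktemp -d)
cat > "$workdir/demo.fa" <<'EOF'
>chr1 description
ACGTACGT
ACGT
>chr2
ACGTACGT
AC
EOF

# Strip the leading "chr" on header lines only. Sequence lines never start
# with ">", so they pass through unchanged.
sed 's/^>chr/>/' "$workdir/demo.fa" > "$workdir/demo.renamed.fa"

# Show the new headers to confirm the rename.
grep '^>' "$workdir/demo.renamed.fa"
```

On the real data this would be `sed 's/^>chr/>/' hg38.fa > new.fa`. Note that `chrM` becomes `M`, which may still not match a VCF that calls it `MT`, so compare the contig names in both files afterwards.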
