Problems With miRDeep2
11.3 years ago
pmuench ▴ 140

I want to find miRNAs in a FASTQ file with miRDeep2. After adapter clipping and alignment (with mapper.pl from miRDeep2) against human_g1k_v37 (indexed with bowtie), I get the following error message:

Error: Genome file /PathToFile/human_g1k_v37.fasta has not allowed whitespaces in its first identifier

The first line of the reference file looks like this:

>1 dna:chromosome chromosome:GRCh37:1:1:249250621:1

If I delete all whitespace in the identifiers and rerun the mapping and miRDeep2, I get another error message:

The mapped reference id 1 from file exmaple_collapsed.arf is not an id of the genome file /PathToFile/human_g1k_v37.fasta

Any idea how I should convert the reference file? Thank you!

hg19 reference • 8.6k views
11.3 years ago
JC 13k

You need to rename the sequences in your reference FASTA. Just remove everything after the first space in each header line:

 perl -lpe 's/\s.*// if /^>/' < genome.fa > new_genome.fa

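You will then need to build a new bowtie index from new_genome.fa and rerun the mapping, so that the reference IDs in the .arf file match the IDs in the genome file.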
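
Before rebuilding the index, it can help to confirm that the cleaned file really has whitespace-free headers and to see which IDs it contains. The following is only a minimal shell sketch, assuming the cleaned file is called new_genome.fa as in the command above; the function names `headers_with_whitespace` and `fasta_ids` are made up for illustration and are not part of miRDeep2 or bowtie.

```shell
# Hypothetical helper: print every FASTA header line that still contains
# whitespace (space, tab, or carriage return). No output means the
# identifiers are clean.
headers_with_whitespace() {
    grep '^>' "$1" | grep '[[:space:]]' || true
}

# Hypothetical helper: print the sequence IDs (header text without the
# leading '>'), one per line, for comparison with the reference IDs
# reported in the miRDeep2 error message.
fasta_ids() {
    sed -n 's/^>//p' "$1"
}
```

Running `headers_with_whitespace new_genome.fa` should print nothing, and `fasta_ids new_genome.fa | head` should list plain IDs such as `1`. The index can then be rebuilt with `bowtie-build new_genome.fa new_genome` (the output prefix `new_genome` is just an example name).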


Hi,

I know this post has been up for a while, but I'd appreciate some help. I tried a few data sets with miRDeep2 and always got this error:

First line of FASTA reads file is not in accordance with the fasta format specifications
Please make sure your file is in accordance with the fasta format specifications and does not contain whitespace in IDs or sequences

***** Please check if the option you used (options c) designates the correct format of the supplied reads file earthwormShort1.fa *****

I have removed whitespace several times to make sure my FASTA file is OK, but the error keeps coming back. Any help would be appreciated.
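
One way to narrow this down, offered as a sketch rather than a confirmed fix: whitespace that is easy to overlook includes tabs, trailing spaces, and Windows carriage returns at the end of each line. The helper below assumes the reads file is the earthwormShort1.fa named in the message; the function name `report_fasta_whitespace` is hypothetical.

```shell
# Hypothetical helper: list the first ten lines of a FASTA file that contain
# any whitespace character (space, tab, or carriage return), prefixed with
# their line numbers in the form "N:line".
report_fasta_whitespace() {
    grep -n '[[:space:]]' "$1" | head -n 10
}
```

`report_fasta_whitespace earthwormShort1.fa` prints nothing if no line contains whitespace. To make invisible characters visible, `head -n 2 earthwormShort1.fa | od -c` shows a carriage return as `\r` and a tab as `\t`.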
