Bowtie2 error :: (ERR): bowtie2-align died with signal 6 (ABRT) (core dumped)
4.0 years ago
shrutidabral ▴ 10

I am trying to align reads to the hg19 reference genome using bowtie2 2.4.1.

I have 10 samples in total, and 5 of the raw files give an error during alignment with the same command. This is the error I get for those files:

Command

/bowtie2-2.4.1/bowtie2 --no-discordant --no-mixed --local -p 20 -x /WXS_test/index/hg19 -1 /data/fastqfile/R_1.fastq.gz -2 /data/fastqfile/R_2.fastq.gz -S /WXS_test/alignment/R.sam

Error, fewer reads in file specified with -1 than in file specified with -2
terminate called after throwing an instance of 'int'
(ERR): bowtie2-align died with signal 6 (ABRT) (core dumped)

Action taken

* Quality check of the raw files using FastQC: all files pass the criteria.
* No memory issues.

Is there any other way to check the FASTQ files, in case the error is related to the raw files?

next-gen SNP alignment • 4.1k views

Make sure the number of reads in R_1.fastq.gz and R_2.fastq.gz is identical. Check it in the FastQC output, under "Summary".

What are the numbers?
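
The numbers can also be read straight from the files, without opening the FastQC report. A minimal sketch, assuming standard four-line FASTQ records (no wrapped sequence or quality lines); `count_reads` is a hypothetical helper name, not part of any tool mentioned in this thread:

```shell
# Count the reads in a gzipped FASTQ file (line count divided by 4).
# Warns on stderr if the line count is not a multiple of 4, which would
# suggest a truncated or malformed file.
count_reads() (
    lines=$(gzip -dc "$1" | wc -l)
    if [ $((lines % 4)) -ne 0 ]; then
        echo "warning: $1 has $((lines)) lines, not a multiple of 4" >&2
    fi
    echo $((lines / 4))
)

# Usage (file names as in the question):
#   count_reads R_1.fastq.gz
#   count_reads R_2.fastq.gz
```

If the two counts differ, bowtie2 will stop with the "fewer reads in file" error shown in the question.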


Number of reads: R1: 31988936, R2: 32047104


Error, fewer reads in file specified with -1 than in file specified with -2

The files are of unequal length. Is this your data or published data? Did you trim the data without using a dedicated paired-end trimmer? Did you manipulate the files? Try to find out why the two files do not contain equal numbers of reads. You can try repair.sh from BBMap, but I would try to sniff out the reason first.


Number of reads: R1: 31988936, R2: 32047104

I did not trim or manipulate the data. It is published data, stored as SRA files. I was supposed to convert them directly into BAM files, but could not because of this error: [E::sam_hrecs_error] Malformed key:value pair at line 86: "@RG ID:PM164 PL:Illumina LB:GA LNID:L001 FCID:H9CB8ADXX DT:2014-04-21T00:00:00-0400 BCID:AGTACAAG SM:PM164_X1_1_Case"

Should I run Trimmomatic, or can you recommend some other tool?


Please post all commands related to the download and the conversion from SRA to FASTQ. I guess the file contains singletons. Did you use the --split-3 option with fastq-dump? If you used --split-files instead, then I suggest simply running repair.sh from BBMap to remove the singletons. It could of course also be that the uploader simply messed things up by uploading corrupted files. Can you share an accession number?


I converted the SRR files to FASTQ using "-I --split-files". As a result, the two files contain different numbers of reads. When I used "--split-3" instead, I got the same number of reads in both the R1 and R2 files.

I do not understand the difference between these two parameters. For some of the 10 raw files, the --split-files command does not produce correct FASTQ reads. I have also reported this to the author. Is it an issue with the upload, or something else?


Would you paste the command here, and also one of the SRR numbers?
If the reads are paired-end, --split-3 writes read 1, read 2, and any orphaned reads (if present) to separate files, whereas --split-files does not separate out the orphans.

So try --split-3 if you are sure the SRA file contains paired-end reads.

Lastly, here is a nice tool: https://sra-explorer.info/ where you can find links to the SRA files and the raw FASTQ files (both NCBI and EBI).


Sorry, I can't give you the SRR number.

This description is from the fastq-dump documentation (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=fastq-dump):

fastq-dump -I --split-files SRR390728

Produces two fastq files (--split-files) containing ".1" and ".2" read suffices (-I) for paired-end data.


Fine. Details of what you did help others reproduce your problem.
My guess at one possible reason (not tested): the .sra file downloaded by fastq-dump is truncated.
In any case, it is recommended to download files using prefetch, not fastq-dump.

If you ran the command exactly as in the example, fastq-dump downloads the .sra file first and then converts it to FASTQ. (This way is not recommended.)

$ fastq-dump -I --split-files SRR390728

Here is an example using prefetch (also suggested by NCBI), followed by conversion with fasterq-dump:

$ prefetch SRR390728

# --split-3
$ fasterq-dump --split-3 SRR390728.sra
$ seqkit stat SRR390728.sra_1.fastq SRR390728.sra_2.fastq
file                   format  type   num_seqs      sum_len  min_len  avg_len  max_len
SRR390728.sra_1.fastq  FASTQ   DNA   7,178,576  258,428,736       36       36       36
SRR390728.sra_2.fastq  FASTQ   DNA   7,178,576  258,428,736       36       36       36

# --split-files
$ fasterq-dump --split-files -p SRR390728.sra
$ seqkit stat SRR390728.sra_1.fastq SRR390728.sra_2.fastq
file                   format  type   num_seqs      sum_len  min_len  avg_len  max_len
SRR390728.sra_1.fastq  FASTQ   DNA   7,178,576  258,428,736       36       36       36
SRR390728.sra_2.fastq  FASTQ   DNA   7,178,576  258,428,736       36       36       36

The two FASTQ files might not be a pair of PE reads, or one of the files might be truncated.

If the files are not a pair of PE reads

You could check whether the names of the first reads (e.g. 10 reads) are identical in the two files:

$ zcat R_1.fastq.gz | head -n 40 | awk 'NR % 4 == 1'  # header lines only; grep ^@ can also match quality lines
$ zcat R_2.fastq.gz | head -n 40 | awk 'NR % 4 == 1'

If the lists of names are not identical, you should find the correct source of the reads.
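
Looking at the first 10 reads only shows that the files start out in sync. The same comparison can be run over the whole files. A minimal sketch, assuming four-line FASTQ records; `read_ids` and `pairs_in_sync` are hypothetical helper names, and the default suffix pattern (`/1`, `/2`) is an assumption about how the mates are labelled:

```shell
# Print the read ID (first field of each header line) of every record,
# with the mate suffix removed. The suffix is a regular expression and
# defaults to /1 or /2 at the end of the ID.
read_ids() (
    suffix="${2:-}"
    [ -n "$suffix" ] || suffix='/[12]$'
    gzip -dc "$1" |
        awk -v suffix="$suffix" 'NR % 4 == 1 { id = $1; sub(suffix, "", id); print id }'
)

# Exit status 0 only if both files list the same IDs in the same order.
pairs_in_sync() (
    tmpdir=$(mktemp -d) || exit 2
    read_ids "$1" "${3:-}" > "$tmpdir/ids1"
    read_ids "$2" "${3:-}" > "$tmpdir/ids2"
    status=0
    cmp -s "$tmpdir/ids1" "$tmpdir/ids2" || status=$?
    rm -rf "$tmpdir"
    exit "$status"
)

# Usage (file names as in the question):
#   pairs_in_sync R_1.fastq.gz R_2.fastq.gz && echo "in sync" || echo "NOT in sync"
# For names written by "fastq-dump -I" (".1" / ".2" suffixes), pass the
# suffix pattern explicitly:
#   pairs_in_sync R_1.fastq.gz R_2.fastq.gz '[.][12]$'
```

The ID lists are written to temporary files, so expect some disk use for files with tens of millions of reads.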

If the files are a correct pair of PE reads

You can subset R_2.fastq.gz to the same length as R_1.fastq.gz.

$ zcat R_2.fastq.gz | head -n 127955744 | gzip > R_fixed_2.fastq.gz  # 31988936 * 4 = 127955744

Then you can try the alignment with R_1.fastq.gz and R_fixed_2.fastq.gz.


This advice is not correct. By doing this you are bound to mess up the order of reads in the files.

I am going to move this answer to a comment. Please explain if you feel your solution is correct.

shrutidabral: Please use repair.sh from the BBMap suite to make sure your read order is set right and singletons are moved to a separate file. This generally happens if you trim paired-end reads independently.

repair.sh in1=broken1.fq in2=broken2.fq out1=fixed1.fq out2=fixed2.fq outs=singletons.fq repair

@genomax, thanks. I did not consider the situation you mentioned, where PE reads are trimmed separately. repair.sh is a good solution when the read names in the two files are out of sync.

My previous comment was meant for the situation where the reads are not in correct pairs, or the files are truncated.

If the reads were indeed trimmed separately, I would redo the trimming in PE mode.


I did not perform any trimming on this data.


repair.sh from BBMap does all this automatically and is more reliable, since your solution will not correct for any corrupted entries within the file. Maybe, for whatever reason, some reads are out of sync or are singletons. I assume that either something went wrong during the SRA-to-FASTQ conversion, or the file contains singletons, in which case fastq-dump was probably not used with --split-3.
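
To see whether singletons are really the cause before repairing anything, the read IDs of the two files can be compared as sets. A rough sketch, not a replacement for repair.sh: it assumes four-line FASTQ records, `sorted_ids` and `count_singletons` are hypothetical helper names, and the default mate suffix (`/1`, `/2`) is an assumption that may need changing for other naming schemes.

```shell
# Print the sorted read IDs of a gzipped FASTQ file, mate suffix removed.
sorted_ids() {
    gzip -dc "$1" |
        awk -v suffix="$2" 'NR % 4 == 1 { id = $1; sub(suffix, "", id); print id }' |
        LC_ALL=C sort
}

# Report how many read IDs occur in only one of the two files.
# Optional third argument: regular expression for the mate suffix.
count_singletons() (
    suffix="${3:-}"
    [ -n "$suffix" ] || suffix='/[12]$'
    tmpdir=$(mktemp -d) || exit 2
    sorted_ids "$1" "$suffix" > "$tmpdir/ids1"
    sorted_ids "$2" "$suffix" > "$tmpdir/ids2"
    only1=$(LC_ALL=C comm -23 "$tmpdir/ids1" "$tmpdir/ids2" | wc -l)
    only2=$(LC_ALL=C comm -13 "$tmpdir/ids1" "$tmpdir/ids2" | wc -l)
    rm -rf "$tmpdir"
    echo "only_in_first=$((only1)) only_in_second=$((only2))"
)

# Usage (file names as in the question; '[.][12]$' matches the ".1"/".2"
# suffixes written by "fastq-dump -I"):
#   count_singletons R_1.fastq.gz R_2.fastq.gz '[.][12]$'
```

If both counts are zero but the files still fail to align, the reads are more likely out of order or corrupted than orphaned.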


OK, I will follow your suggestion and report back. Thank you for the clarification.
