Question

bwa mem mapping multi-thread/single thread

0

Entering edit mode

6.2 years ago

whaiyu06 ▴ 80

Hi everyone :

I used the software cutadapt to remove the adapter after I received the raw data from sequencing company.(The command is:

cutadapt \
            -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC \
            -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT \
            -o A01_R1.clean.fastq.gz -p A01_R2.clean.fastq.gz \
        A01_R1.fastq.gz  A01_R2.fastq.gz )

After I gained the clean data,I mapped short reads of WGS to the hg19 reference using BWA mem.But but the sam file of the output didn't update after run more than 24h or more longer even the Linux Server showed program still running in the background,when I used multi-thread to run the command

nohup bwa mem -t 4 -M -R "@RG\tID:A01\tSM:A01\tPL:illumina\tLB:lib1\tPU:unit1" /public1/data/reference/public/Homo_sapiens_assembly19.fasta /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R1.clean.fastq.gz /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R2.clean.fastq.gz 1>A01.sam 2>stderr/A01.bwa.stderr &.

The log of A01.bwa.stderr using the multi-thread  as follows: 
[M::mem_pestat] mean and std.dev: (1405.67, 1625.87)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 10748)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (1291, 3007, 3959)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9295)
[M::mem_pestat] mean and std.dev: (3058.62, 2334.28)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 12396)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR

Strangely,when I changed single-thread, the output was fine.It looks like OK,because the log as follows:

[M::mem_pestat] (25, 50, 75) percentile: (303, 344, 394)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (121, 576)
[M::mem_pestat] mean and std.dev: (349.67, 66.67)
[M::mem_pestat] low and high boundaries for proper pairs: (30, 667)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 14216 reads in 6.793 CPU sec, 6.725 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -M -R @RG\tID:A01\tSM:A01\tPL:illumina\tLB:lib1\tPU:NONE 
/public1/data/reference/public/Homo_sapiens_assembly19.fasta /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R1.clean.fastq.gz /public1/barcode/WGS_SC/PF_data/SingleCellAnal/A01_R2.clean.fastq.gz
[main] Real time: 123044.077 sec; CPU: 123276.915 sec

So,I'm confused very much if the operation of cutadapt makes the pair-end fq files incomplete.And why the single-thread seem to make no mistakes?Whether the output after run single thread command is believable? Too many questions and thanks for your time. Best

alignment sequencing • 6.0k views

ADD COMMENT • link updated 4.3 years ago by ATpoint 81k • written 6.2 years ago by whaiyu06 ▴ 80

0

Entering edit mode

Actually you need to paste the command line you used, otherwise we can not analyze the problems. Thanks.

ADD REPLY • link 6.2 years ago by jinxinhao1988 ▴ 70

0

Entering edit mode

the command line I used is：

nohup bwa mem -t 4 -M -R "@RG\tID:A01-2m-B8\tSM:A01-2m-B8\tPL:illumina\tLB:lib1\tPU:unit1" /public1/data/reference/public/Homo_sapiens_assembly19.fasta /public1/barcode/WGS_SC/PF_data/SingleCellAnal/trimmed.A01-2m-B8_R1.fastq.gz /public1/barcode/WGS_SC/PF_data/SingleCellAnal/trimmed.A01-2m-B8_R2.fastq.gz 1>A01-2m-B8.sam 2>stderr/A01-2m-B8.bwa.stderr &

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 6.2 years ago by whaiyu06 ▴ 80

0

Entering edit mode

Do you check the nohup.out file and stderr file to see if there is any error message?

ADD REPLY • link 6.2 years ago by jinxinhao1988 ▴ 70

0

Entering edit mode

Yes,I have checked them ,but there's nothing wrong have been reported.

ADD REPLY • link 6.2 years ago by whaiyu06 ▴ 80

0

Entering edit mode

It is possible that a lot of stuff is happening in memory, but 24h is suspicious. What's the config on the machine you're running bwa on?

ADD REPLY • link 6.2 years ago by Ram 43k

0

Entering edit mode

I used 32 threads ，8 cpu cores of the Linux server to map the reads.I don't know the other config of the machine ,but if it is necessary I'll check them.Thank you!

ADD REPLY • link 6.2 years ago by whaiyu06 ▴ 80

score 0 · Answer 1 · 2020-01-14

0

Entering edit mode

4.3 years ago

ATpoint 81k

Just to answer this after 23 month: Most likely (matching my own experience) your job got killed due to memory shortage. You can always check with top -u your_user_name if your job is still running.

ADD COMMENT • link 4.3 years ago by ATpoint 81k