Question: BBMap: how to split paired-end reads back into separate FASTQ files?
5 months ago, c.e.chong wrote:

Hi,

I am trying to remove human reads from my metagenomic sequences (skin samples).

I have used the command recommended in the bbmap documentation:

bbmap.sh minid=0.95 maxindel=3 bwr=0.16 bw=12 quickmatch fast minhits=2 path=/path/to/hg19masked/ qtrim=rl trimq=10 untrim -ea -Xmx100g in=reads.fq outu=clean.fq outm=human.fq threads=12

My input is Illumina HiSeq 4000 paired-end data (2x150 bp reads), so I want to split the paired-end reads back into separate FASTQ files (..._r1 and ..._r2).

Is there a way to do this using BBTools?

Thanks in advance!


As shown, your command uses only one read file (in=). So unless that file is interleaved, there is no chance that the out*= files will be interleaved. @ATPoint shows what an interleaved file should look like. Note: your headers may look different, but read 1 and read 2 from the same cluster should appear one after the other in the file.

written 5 months ago by genomax

Sorry, that was a mistake. I actually used in=r1.fq in2=r2.fq!

Thanks!

written 5 months ago by c.e.chong
5 months ago, genomax wrote:

Use reformat.sh from the BBMap suite. This assumes your clean.fq is interleaved.

reformat.sh in=clean.fq out1=clean_R1.fq out2=clean_R2.fq
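
To make clear what "splitting an interleaved file" means, here is a rough Python sketch of the same idea. It is not how reformat.sh is implemented, and the function name `deinterleave` is made up for illustration. It assumes plain 4-line FASTQ records (no wrapped sequence lines) and that reads alternate read 1, read 2, read 1, read 2.

```python
def deinterleave(src, out1, out2):
    """Copy alternating 4-line FASTQ records from src to out1 and out2.

    src is any iterable of lines; out1 and out2 are writable file-like
    objects. Returns the number of read pairs written.
    Assumption: every record is exactly 4 lines long.
    """
    record = []
    n_records = 0
    for line in src:
        record.append(line)
        if len(record) == 4:
            # Even-numbered records (0, 2, ...) are read 1, odd ones are read 2.
            target = out1 if n_records % 2 == 0 else out2
            target.writelines(record)
            n_records += 1
            record = []
    if record or n_records % 2 != 0:
        # Leftover lines or an unpaired record: the input was not
        # a clean interleaved file.
        raise ValueError("input does not contain complete read pairs")
    return n_records // 2
```

With real files you would open clean.fq for reading and two output files for writing, then pass the three handles in. For actual data, reformat.sh is the better choice since it also handles compressed input and does its own sanity checks.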

Thank you very much for your help!

How would I find out if the clean.fq file is interleaved?

written 5 months ago by c.e.chong

Check whether each pair of adjacent reads shares the same read name (apart from the trailing /1 and /2). This is an example of an interleaved file from a HiSeq 3000:

@HWI-ST-J00104_BSF_0515:8:1101:10003:10036#DMSO_rep2_S43794/1
GGAAGAAAGACAGTCTGTGGCCCTGCCTGGGGACCTACACTGTCTGCTGTAACAGGCTTTCCCTTCATCTCAAGAG
+
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@HWI-ST-J00104_BSF_0515:8:1101:10003:10036#DMSO_rep2_S43794/2
GGTTGAAAAGTGGCCTCGGTGTTCCAGACCTACCTGGTTCACTTAGCTTTTTCCTCCTTTCCTTCCTTTTATACTC
+
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJFJJJJJJJFJJJJJJJJJ
@HWI-ST-J00104_BSF_0515:8:1101:10003:10141#DMSO_rep2_S43794/1
GTGATTCTCTTCAAGGCCACTCCTAAACAACTGTTGAATCCTTGATCCAGGCACCAAGCCCAGAGGTCCCTATCAG
+
AAAFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ<FJJJJJJJJJJJJJFJJJJJJJJJJJ
@HWI-ST-J00104_BSF_0515:8:1101:10003:10141#DMSO_rep2_S43794/2
GTCTAGGAGGAAACCAGGCTGTTAACTCCTCAGCAAAGACAAAGGAGGAGTTGGGCGGGGGCAAGACTAACTCCTA
+
AAFFFJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJAJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
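
If you would rather not eyeball the headers, the check described above can be sketched in a few lines of Python. This is only an illustration: the helper names (`read_names`, `base_name`, `is_interleaved`) are made up here, and the code assumes 4-line FASTQ records and that mates are marked either by a trailing /1 and /2 (as in the example) or by a space-separated field after the read name (newer Illumina headers).

```python
def read_names(lines):
    """Yield the header of each FASTQ record without the leading '@'.

    Assumption: every record is exactly 4 lines, so headers are
    lines 0, 4, 8, ... of the input.
    """
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield line.rstrip("\r\n")[1:]


def base_name(name):
    """Reduce a read name to the part shared by both mates."""
    fields = name.split()
    token = fields[0] if fields else ""
    # Old-style headers mark the mates with a trailing /1 or /2.
    if token.endswith("/1") or token.endswith("/2"):
        token = token[:-2]
    return token


def is_interleaved(lines):
    """Return True if records 1+2, 3+4, ... share the same base name."""
    count = 0
    previous = None
    for name in read_names(lines):
        count += 1
        current = base_name(name)
        # Every second record must match the record just before it.
        if count % 2 == 0 and current != previous:
            return False
        previous = current
    # An empty file or an odd number of records cannot be interleaved pairs.
    return count > 0 and count % 2 == 0
```

Passing an open file handle for clean.fq to `is_interleaved` streams through the file without loading it into memory. If I remember the reformat.sh help text correctly, it also has a verifyinterleaved (vint) option that performs a similar name check, so it is worth looking at the help output of your installed version before rolling your own.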
written 5 months ago by ATpoint