Question: bbmap split paired-end reads back into separated fastq files?
c.e.chong wrote, 11 weeks ago:

Hi,

I am trying to remove human reads from my metagenomic sequences (skin samples).

I have used the command recommended in the bbmap documentation:

bbmap.sh minid=0.95 maxindel=3 bwr=0.16 bw=12 quickmatch fast minhits=2 path=/path/to/hg19masked/ qtrim=rl trimq=10 untrim -ea -Xmx100g in=reads.fq outu=clean.fq outm=human.fq threads=12

My input is Illumina HiSeq 4000 paired-end data (2x150 bp reads), so I want to split the paired-end reads back into separate FASTQ files (.._r1 and .._r2).

Is there a way to do this using BBTools?

Thanks in advance!


As shown, your command uses only one read file (in=). Unless that file is interleaved, the out*= files will not be interleaved either. @ATpoint shows below what an interleaved file looks like. Note: your headers may look different, but read 1 and read 2 from the same cluster should appear one after the other in the file.

Reply written 11 weeks ago by genomax

Sorry, that was a mistake. I actually used in=r1.fq in2=r2.fq!

Thanks!

Reply written 11 weeks ago by c.e.chong
genomax (United States) wrote, 11 weeks ago:

Use reformat.sh from the BBMap suite. This assumes your clean.fq is interleaved.

reformat.sh in=clean.fq out1=clean_R1.fq out2=clean_R2.fq
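
For intuition, here is a minimal sketch of what de-interleaving amounts to, using only awk. It is not how reformat.sh is implemented, and the function name deinterleave_fastq is made up for illustration. It assumes strict 4-line FASTQ records in which read 1 is always directly followed by its mate.

```shell
# Hypothetical helper (not part of BBTools): split an interleaved FASTQ
# into two files. Assumes every record is exactly 4 lines long.
deinterleave_fastq() {
    # $1 = interleaved input, $2 = read 1 output, $3 = read 2 output
    awk -v r1="$2" -v r2="$3" '
        # Lines 1-4 of each 8-line block are read 1, lines 5-8 are read 2.
        {
            if ((NR - 1) % 8 < 4) print > r1
            else                  print > r2
        }
    ' "$1"
}
```

Usage would look like `deinterleave_fastq clean.fq clean_R1.fq clean_R2.fq`. For real data reformat.sh is the better choice, since this sketch does no validation at all.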

Thank you very much for your help!

How would I find out if the clean.fq file is interleaved?

Reply written 11 weeks ago by c.e.chong

Check whether each pair of adjacent reads shares the same read name, differing only in the trailing /1 and /2. This is an example of an interleaved file from a HiSeq 3000:

@HWI-ST-J00104_BSF_0515:8:1101:10003:10036#DMSO_rep2_S43794/1
GGAAGAAAGACAGTCTGTGGCCCTGCCTGGGGACCTACACTGTCTGCTGTAACAGGCTTTCCCTTCATCTCAAGAG
+
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@HWI-ST-J00104_BSF_0515:8:1101:10003:10036#DMSO_rep2_S43794/2
GGTTGAAAAGTGGCCTCGGTGTTCCAGACCTACCTGGTTCACTTAGCTTTTTCCTCCTTTCCTTCCTTTTATACTC
+
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJFJJJJJJJFJJJJJJJJJ
@HWI-ST-J00104_BSF_0515:8:1101:10003:10141#DMSO_rep2_S43794/1
GTGATTCTCTTCAAGGCCACTCCTAAACAACTGTTGAATCCTTGATCCAGGCACCAAGCCCAGAGGTCCCTATCAG
+
AAAFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ<FJJJJJJJJJJJJJFJJJJJJJJJJJ
@HWI-ST-J00104_BSF_0515:8:1101:10003:10141#DMSO_rep2_S43794/2
GTCTAGGAGGAAACCAGGCTGTTAACTCCTCAGCAAAGACAAAGGAGGAGTTGGGCGGGGGCAAGACTAACTCCTA
+
AAFFFJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJAJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
Reply written 11 weeks ago by ATpoint
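
The name comparison described above can also be scripted. This is a rough sketch, assuming 4-line records and headers that either end in /1 and /2 (as in the example) or carry the pair number after a space. check_interleaved is a hypothetical helper, not a BBTools command.

```shell
# Hypothetical helper (not part of BBTools): report whether a FASTQ file
# looks interleaved by comparing the names of adjacent reads.
check_interleaved() {
    # $1 = FASTQ file to inspect
    awk '
        # Header of read 1: keep the first field and drop a trailing /1.
        NR % 8 == 1 { name1 = $1; sub(/\/1$/, "", name1) }
        # Header of read 2: drop a trailing /2, then compare with read 1.
        NR % 8 == 5 {
            name2 = $1; sub(/\/2$/, "", name2)
            if (name1 != name2) { bad = 1; exit }
        }
        END {
            # A mismatch, an empty file, or an incomplete pair all fail.
            if (bad || NR == 0 || NR % 8 != 0) { print "not interleaved"; exit 1 }
            print "interleaved"
        }
    ' "$1"
}
```

Running `check_interleaved clean.fq` prints one word of verdict and sets the exit status accordingly, so it can be used in an `if` before calling reformat.sh.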