Question: Demultiplexing tool for dual-indexed paired-end Illumina libraries
nbhardwaj wrote (2.6 years ago):

Hi, what tools are available to demultiplex dual-indexed Illumina libraries in which different combinations of i7 and i5 indices are used for paired-end data? I have already tried fastq_multx and encountered an error.

Thanks!

genomax wrote (2.6 years ago):

What sort of data do you have? Fastq files or BCL files?

Take a look at the demuxbyname.sh tool from the BBMap suite.

With BCL files you could look at IlluminaBasecallsToFastq from Picard.


I have one set of paired-end multiplexed FASTQ files (the BCL files were converted to one giant paired-end FASTQ).

Reply written 2.6 years ago by nbhardwaj

Then use demuxbyname.sh from BBMap.

$ demuxbyname.sh in=r#.fq out=out_%_#.fq prefixmode=f names=GGACTCCT+GCGATCTA,TAAGGCGA+TCTACTCT,... outu=filename

"Names" can also be a text file with one barcode per line (in exactly the format found in the read header). You do have to include all of the expected barcodes, though.

In the output filename, the "%" symbol gets replaced by the barcode; in both the input and output names, the "#" symbol gets replaced by 1 or 2 for read 1 or read 2. It's optional, though; you can leave it out for interleaved input/output, or specify in1=/in2=/out1=/out2= if you want custom naming.
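To make the routing logic concrete, here is a minimal Python sketch of the same idea: each read pair is assigned to a bin keyed by the barcode at the end of its header. This is not how demuxbyname.sh is implemented. The function names (`barcode_from_header`, `demux_pairs`), the in-memory record tuples, and the example headers are all hypothetical, and the sketch assumes headers end in `:INDEX1+INDEX2` and uses exact matching only (no mismatches allowed).

```python
from collections import defaultdict


def barcode_from_header(header):
    """Return the text after the last ':' in a FASTQ header line."""
    return header.strip().rsplit(":", 1)[-1]


def demux_pairs(pairs, expected):
    """Group (read1, read2) record pairs by the barcode in read 1's header.

    Each record is a (header, sequence, quality) tuple. Pairs whose barcode
    is not in `expected` are collected under the key "unmatched".
    """
    bins = defaultdict(list)
    for r1, r2 in pairs:
        barcode = barcode_from_header(r1[0])
        key = barcode if barcode in expected else "unmatched"
        bins[key].append((r1, r2))
    return dict(bins)


# Hypothetical example data: two read pairs, one with an expected barcode.
expected = {"GGACTCCT+GCGATCTA", "TAAGGCGA+TCTACTCT"}
pairs = [
    (("@M1:1:FC:1:1:1:1 1:N:0:GGACTCCT+GCGATCTA", "ACGT", "IIII"),
     ("@M1:1:FC:1:1:1:1 2:N:0:GGACTCCT+GCGATCTA", "TTGA", "IIII")),
    (("@M1:1:FC:1:1:1:2 1:N:0:AAAAAAAA+CCCCCCCC", "GGCA", "IIII"),
     ("@M1:1:FC:1:1:1:2 2:N:0:AAAAAAAA+CCCCCCCC", "CATG", "IIII")),
]
bins = demux_pairs(pairs, expected)
```

In this sketch the "unmatched" bin plays the role that outu= plays in the command above: it catches every pair whose barcode is not in the expected list.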

Reply written 2.6 years ago by genomax
Gabriel R. wrote (2.6 years ago):

You can use our maximum-likelihood demultiplexing tool, deML. Our paper is here:

http://www.ncbi.nlm.nih.gov/pubmed/25359895

The website with the software is here: https://grenaud.github.io/deML/

Hope this helps. Contact me if you have trouble running it.


Hi Gabriel, I tried deML and ran into an error. My barcodes file is:

Index1 Index2 Name
CCCAACCT CTAATCGA NA12877_A1
CACCACAC CTAATCGA NA12877_A2
GAAACCCA CTAATCGA NA12877_A3
TGTGACCA CTAATCGA NA12877_B1
AGGGTCAA CTAATCGA NA12877_B2
AGGAGTGG CTAATCGA NA12877_B3
CCCAACCT CTAGAACA NA12878_A1
CACCACAC CTAGAACA NA12878_A2
GAAACCCA CTAGAACA NA12878_A3
TGTGACCA CTAGAACA NA12878_B1
AGGGTCAA CTAGAACA NA12878_B2
AGGAGTGG CTAGAACA NA12878_B3

I ran the command:

$ deML -i index.txt -f Undetermined_S0_L001_R1_001.fastq.gz -r Undetermined_S0_L001_R2_001.fastq.gz

The error was: If fastq is used, the forward read must be specified

If the indices are already in index.txt, what should -if1 and -if2 contain? I don't have any more FASTQ files from Illumina. Thanks!

Reply written 2.6 years ago by nbhardwaj

The "-if1" and "-if2" options take the FASTQ files containing the index 1 and index 2 sequences from the reads, respectively.

BTW, if you want to simplify your processing, I suggest converting your BCL files directly to unaligned BAM:

https://github.com/grenaud/BCL2BAM2FASTQ

Recent versions of samtools provide a command (samtools fastq) for converting BAM to FASTQ if need be.

Reply written 2.6 years ago by Gabriel R.

Hi Gabriel, I don't have these FASTQ files. Can I quickly create them using the indices that I have from the sample sheet? Thanks!

Reply written 2.6 years ago by nbhardwaj

No, these index files contain the indices from your reads, not from your samples. Are the read indices in the definition lines (those starting with @) of Undetermined_S0_L001_R1_001.fastq.gz, or do you have other files besides those?

Reply written 2.6 years ago by Gabriel R.

Based on this other thread, that appears to be the case: "demultiplexing Illumina output with fastq_multx"

Reply written 2.6 years ago by genomax

OK, well then, @nbhardwaj, you could copy these indices into their own FASTQ files and add quality scores that represent an average error rate.
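One way this could be done is sketched below in Python, assuming the read headers end in `:INDEX1+INDEX2` (as in `1:N:0:CCCAACCT+CTAATCGA`). The function names (`index_records`, `write_index_fastqs`) and the example header are hypothetical, and the constant quality character `?` (Phred 30 in the Phred+33 encoding) is only a placeholder assumption rather than a measured error rate. Depending on the instrument, the second index may also need to be reverse-complemented to match the index file; that step is not handled here.

```python
import gzip


def index_records(header, qual_char="?"):
    """Build index-1 and index-2 FASTQ records from one read header.

    Assumes the header ends in ':INDEX1+INDEX2'. Every base gets the same
    placeholder quality character (default '?', i.e. Phred 30 in Phred+33).
    """
    name = header.strip().split()[0]  # read name without the comment field
    idx1, idx2 = header.strip().rsplit(":", 1)[-1].split("+")
    rec1 = f"{name}\n{idx1}\n+\n{qual_char * len(idx1)}\n"
    rec2 = f"{name}\n{idx2}\n+\n{qual_char * len(idx2)}\n"
    return rec1, rec2


def write_index_fastqs(r1_path, if1_path, if2_path, qual_char="?"):
    """Write one index FASTQ per index from a gzipped read-1 FASTQ file."""
    with gzip.open(r1_path, "rt") as r1, \
            open(if1_path, "w") as out1, open(if2_path, "w") as out2:
        for line_no, line in enumerate(r1):
            if line_no % 4 == 0:  # header line of each 4-line record
                rec1, rec2 = index_records(line, qual_char)
                out1.write(rec1)
                out2.write(rec2)


# Hypothetical header, used only to show the record layout.
rec1, rec2 = index_records("@M1:1:FC:1:1:1:1 1:N:0:CCCAACCT+CTAATCGA\n")
```

The two output files would then be the ones passed to -if1 and -if2, for example after calling write_index_fastqs("Undetermined_S0_L001_R1_001.fastq.gz", "index1.fastq", "index2.fastq"), where the output file names are arbitrary.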

Reply written 2.6 years ago by Gabriel R.