Question

Separate according to Index

0

2.5 years ago

sakisugihara1403403 • 0

Notes: I use a Linux computer and have already set up conda.

I have paired-end data from a MiSeq run: L001_R1_001.fastq.gz / L001_R2_001.fastq.gz

I used 6 different indexes:
・GCGTGA
・AGGTCT
・TCAAGC
・TTACGT
・TAGGAC
・CGTATC

Please teach me how to separate the data according to index.
・Which app should I use?
・Which command should I write?

Miseq • 1.3k views

ADD COMMENT • link updated 2.5 years ago by GenoMax 154k • written 2.5 years ago by sakisugihara1403403 • 0

2

Please go through https://knowledge.illumina.com/software/cloud-software/software-cloud-software-reference_material-list/000001321 or contact Illumina support. If your sample sheet was set up properly, the demultiplexing (that is, the index separation, giving one fastq file pair per index) should run automatically without you needing to do anything. Don't attempt custom approaches as a rookie; demultiplexing is so standard that you don't need to reinvent the wheel.

ADD REPLY • link 2.5 years ago by ATpoint 89k

0

I may have made a mistake in the index entries when writing the sample sheet. I would appreciate it if you could point me to a site with a correct example.

ADD REPLY • link 2.5 years ago by sakisugihara1403403 • 0

1

maybe useful: split pooled paired-end fastq using fqkit

ADD REPLY • link 2.5 years ago by size_t ▴ 120

0

Thank you. If you have time, could you please show me an example of one of the commands? I'm just starting out in programming and find it very difficult.

ADD REPLY • link 2.5 years ago by sakisugihara1403403 • 0

score 2 · Answer 1 · 2023-04-03

Use demuxbyname.sh from the BBMap suite if you simply want to demultiplex the data in your "Undetermined" pool files.

demuxbyname.sh in=<file> in2=<file2> out=<outfile> out2=<outfile2> names=GCGTGA,AGGTCT,TCAAGC etc.

in=<file>       Input file.
in2=<file>      If input reads are paired in twin files, use in2 for the second file.
out=<file>      Output files for reads with matched headers (must contain % symbol).
                For example, out=out_%.fq with names XX and YY would create out_XX.fq and out_YY.fq.
                If twin files for paired reads are desired, use the # symbol.  For example,
                out=out_%_#.fq in this case would create out_XX_1.fq, out_XX_2.fq, out_YY_1.fq, etc.
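Filled in with the file names and the six indexes from the question, the usage line above might look like the sketch below. It uses only the options shown above (in, in2, out, out2, names). Two things are assumptions: the `demux_` output prefix is an arbitrary name chosen for illustration, and the reads are assumed to carry their index sequence in the read header, as Illumina "Undetermined" files normally do. BBMap can be installed from bioconda as the `bbmap` package, and running demuxbyname.sh with no arguments prints its full help, which is worth checking if no reads are matched.

```shell
# Input files from the question (paired-end MiSeq reads).
r1="L001_R1_001.fastq.gz"
r2="L001_R2_001.fastq.gz"

# The six index sequences from the question, comma-separated.
names="GCGTGA,AGGTCT,TCAAGC,TTACGT,TAGGAC,CGTATC"

# Output patterns: % is replaced by the matched index.
# The "demux_" prefix is an arbitrary choice for this sketch.
out1="demux_%_R1.fastq.gz"
out2="demux_%_R2.fastq.gz"

# Print the command first so it can be checked before running.
echo "demuxbyname.sh in=$r1 in2=$r2 out=$out1 out2=$out2 names=$names"

# Run only if BBMap is installed and both input files are present.
if command -v demuxbyname.sh >/dev/null 2>&1 && [ -f "$r1" ] && [ -f "$r2" ]; then
    demuxbyname.sh in="$r1" in2="$r2" out="$out1" out2="$out2" names="$names"
fi
```

If this works as intended, each index should end up with its own pair of files, for example demux_GCGTGA_R1.fastq.gz and demux_GCGTGA_R2.fastq.gz.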

You can also correct the SampleSheet.csv and re-run the demultiplexing on MiSeq. Examples are in this thread from SeqAnswers.