0
1
5.1 years ago
Picasa ▴ 610

Hello,

I have a set of sequences that I want to demultiplex.

1) Is the barcode always at the beginning (5' end) of the read?

2) I'm looking for a tool like fastx_barcode_splitter.pl from the FASTX toolkit, but that one doesn't trim the barcodes.

Do you know of a tool that both splits and trims?

barcodes ngs • 2.6k views
3

Barcodes are never part of the actual read (for standard Illumina barcodes) unless the barcodes in this experiment were designed to be "in-line".

That said, take a look at the sabre package. It may do what you want.
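To make "split and trim" concrete for in-line barcodes, here is a minimal Python sketch. It is not how sabre is implemented: the function name `demux_pairs`, the barcode table, and the toy reads are all made up for illustration, and it assumes exact barcode matches at the 5' end of R1 only.

```python
from collections import defaultdict

def demux_pairs(pairs, barcodes):
    """Bin (R1, R2) pairs by the barcode at the 5' end of R1 and trim it off R1.

    Each read is a (name, sequence, quality) tuple; R2 is passed through unchanged.
    """
    bins = defaultdict(list)
    for (name, seq, qual), r2 in pairs:
        sample = "unknown"
        for barcode, label in barcodes.items():
            if seq.startswith(barcode):  # exact match only, no mismatches allowed
                sample = label
                # Trim the barcode from both the sequence and the quality string.
                seq, qual = seq[len(barcode):], qual[len(barcode):]
                break
        bins[sample].append(((name, seq, qual), r2))
    return bins

# Toy data, invented for illustration.
barcodes = {"CGCTTGA": "X", "ATGACCA": "Y"}
pairs = [
    (("read1", "CGCTTGAAGATCGG", "I" * 14), ("read1", "TCAAGCGAGATCGG", "I" * 14)),
    (("read2", "GGGGGGGAGATCGG", "I" * 14), ("read2", "CCCCCCCAGATCGG", "I" * 14)),
]
bins = demux_pairs(pairs, barcodes)
print(sorted(bins))        # sample labels that received reads
print(bins["X"][0][0][1])  # trimmed R1 sequence for sample X
```

Keeping R1 and R2 together in one tuple is what preserves the pairing: a pair is always written to the same bin, even though only R1 is inspected.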

0

I am not sure if I was clear, but I'm in situation (c) in this figure and I want to get to (d):

http://www.illumina.com/content/dam/illumina-marketing/images/technology/multiplexing-overview-figure.gif

So for each sample I have the barcode information (the sequence and its reverse complement), and I want to keep the reads paired (because I have PE data).

1) So for the sabre package, I used the PE mode with the barcode in its forward (F) orientation. Is that right?

0

The figure you linked is for standard Illumina barcodes. Even though they are shown "inline" in that illustration, that part is read as one or more independent index reads on the sequencer. These would generally be handled by Illumina's own bcl2fastq software.
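For standard Illumina barcodes, the index ends up in the FASTQ header rather than in the read itself. As a small illustration, this Python sketch splits the comment part of a Casava 1.8-style header such as `@blabla 1:N:0:CGATGT` into read number, filter flag, control number, and index. The helper name `parse_illumina_comment` is hypothetical.

```python
def parse_illumina_comment(header):
    """Split '@name read:filtered:control:index' into its parts."""
    name, _, comment = header.lstrip("@").partition(" ")
    read, is_filtered, control, index = comment.split(":")
    return {
        "name": name,
        "read": int(read),                  # 1 or 2 for paired-end data
        "is_filtered": is_filtered == "Y",  # Y means the read failed the filter
        "control": int(control),
        "index": index,                     # the standard (non in-line) barcode
    }

info = parse_illumina_comment("@blabla 1:N:0:CGATGT")
print(info["read"], info["index"])
```

If the index field is already populated like this, the standard barcode has been handled upstream, and any barcode still sitting inside the read is a separate, in-line one.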

Have you looked at the demultiplexed result from Sabre to check if the reads have been correctly separated?

0

Yes, the reads have been correctly separated.

The R1 reads have been trimmed correctly but the R2 reads remain the same, which is why I am not sure I'm doing it right.

0

Sorry to harp on this, but can you clarify whether you are using standard Illumina barcodes or custom barcodes that are designed to be in-line? Perhaps you can post a couple of example reads to illustrate the before/after scenario.

0

This is amplicon-seq with custom barcodes. We have a gene from different species that we sequenced on the same lane. My goal is to make 2 FASTQ files (PE) for each species.

For instance, with this pair of reads:

@blabla 1:N:0:CGATGT
CGCTTGAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAACACACAATGGCTACGT

@blabla 2:N:0:CGATGT
TCAAGCGAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAAAAAAAAAAAGAAAAAAGGAAAGGG


I know that the barcode CGCTTGA (forward orientation) corresponds to sample X.

So after sabre:

>head X_R1.fq

@blabla 1:N:0:CGATGT
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAACACACAATGGCTACGT

@blabla 2:N:0:CGATGT
TCAAGCGAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAAAAAAAAAAAGAAAAAAGGAAAGGG
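One detail in the reads above is worth checking: the first 7 bases of R2 (TCAAGCG) are the reverse complement of the R1 barcode (CGCTTGA). The short Python check below makes that visible; `reverse_complement` is a hypothetical helper written for this example. Whether this means the design barcodes both ends is something only the people who designed the amplicon can confirm.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

barcode = "CGCTTGA"
r2_start = "TCAAGCGAGATCGGAAGAGC"  # first 20 bases of the R2 read shown above

rc = reverse_complement(barcode)
print(rc)                       # TCAAGCG
print(r2_start.startswith(rc))  # True
```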

0

The barcode is only expected to be on R1, correct (so R2 should be left as is)?

0

I don't know. Is that the standard protocol?

0

Since R1 and R2 come from the same DNA fragment, you only need to label one end for basic demultiplexing. One might barcode both ends for something more complex (I'm not an experimental person, so I can't think of a scenario where that would be needed).

0

Thank you for your support anyway.

There is an option -c to trim the R2 reads too (first 7 bp), but I'm not sure whether I have to use it.

0

You would only have to do that if you had a barcode at both ends of the expected fragment. You would need to ask those who designed this amplicon.
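If the designers are not available, the data itself can give a hint. The Python sketch below counts what fraction of a sample's R2 reads begin with a given prefix, for example the reverse complement of that sample's barcode. The function name `fraction_with_prefix` and the toy reads are invented for illustration; a fraction near 1 would suggest a barcode on both ends, but it is only evidence, not proof.

```python
def fraction_with_prefix(seqs, prefix):
    """Return the fraction of sequences that start with the given prefix."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    # startswith() returns a bool, and True counts as 1 when summed.
    return sum(s.startswith(prefix) for s in seqs) / len(seqs)

# Toy R2 sequences for one sample, invented for illustration.
r2_reads = ["TCAAGCGAGATCGG", "TCAAGCGTTTACGA", "GGGTACCAGATCGG"]
frac = fraction_with_prefix(r2_reads, "TCAAGCG")
print(f"{frac:.2f}")
```

On real data this would be run over the R2 sequences of one demultiplexed sample, before deciding whether trimming R2 makes sense.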

0

Both ends can be barcoded when sequencing many samples, where barcoding just one side wouldn't give enough index combinations.
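The arithmetic behind this is simple: with barcodes on one end, the number of distinguishable samples equals the number of barcodes, while with barcodes on both ends it is the product of the two sets. A small Python illustration, using barcode lists that are invented for this example:

```python
from itertools import product

# Hypothetical barcode sets, for illustration only.
forward = ["CGCTTGA", "ATCACGT", "GGATCAA"]
reverse = ["TTAGGCA", "ACAGTGC"]

# Every (forward, reverse) combination can label a distinct sample.
combos = list(product(forward, reverse))
print(len(forward))  # samples distinguishable with one-end barcoding
print(len(combos))   # samples distinguishable with both ends barcoded
```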

0

https://github.com/najoshi/sabre/commit/2a1bedc9a53fd03420d7a3b11f406efa60f90ba1

Check this out; it looks like there is an option:

--h, --both-barcodes, Optional flag that indicates that both fastq files have barcodes.\n\
+-c, --both-barcodes, Optional flag that indicates that both fastq files have barcodes.\n\