Question: Inline barcodes in the reverse reads
Picasa470 wrote:

Hi,

I have a sample of PE reads that I want to demultiplex. For this I used fastq-multx.

So for instance, my barcode is XXXX

And my Forward raw reads : XXXXCCTTGGGCATGATGGTGACGCGCTTGGCGTGGATGGCGCACAGGTTGGTGTCCTCGAACAGGCCGACCAGGTAGGCCTCGCTGGCCTCCTGCAG

After fastq-multx, this read has been correctly assigned and trimmed:

CCTTGGGCATGATGGTGACGCGCTTGGCGTGGATGGCGCACAGGTTGGTGTCCTCGAACAGGCCGACCAGGTAGGCCTCGCTGGCCTCCTGCAG

However, my reverse reads vary. I see one of three cases:

  • No barcode in the reverse read
  • Barcode (reverse complemented) in the 5' part: XXXXATGGCTCGTACCAAGCAGACCGCCCGCAAGT
  • Barcode (reverse complemented) within the reverse read: ATGGCTCGTACCAAGCAGACCXXXXCGGAGGCAAGGCTCCCCGC

I'm not sure what to do with these. Should I keep only the PE reads that don't have a barcode in the reverse read?
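
The three cases listed above can be told apart programmatically. The following is a minimal sketch, not a fastq-multx feature: the function names are hypothetical, and it assumes exact matching (no mismatches) against the reverse complement of the barcode.

```python
# Hypothetical helpers; exact-match only, no mismatch tolerance.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_reverse_read(read: str, barcode: str) -> str:
    """Classify a reverse read by where the reverse-complemented barcode occurs.

    Returns 'none' (not found), 'five_prime' (read starts with it),
    or 'internal' (first occurrence is further inside the read).
    """
    rc_barcode = reverse_complement(barcode)
    pos = read.find(rc_barcode)
    if pos == -1:
        return "none"
    if pos == 0:
        return "five_prime"
    return "internal"
```

Only the first occurrence is considered, so a read that starts with the barcode and also contains it internally is reported as 'five_prime'.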

barcodes • 1.4k views
ADD COMMENTlink modified 2.9 years ago • written 2.9 years ago by Picasa470

How was the data generated? What is the cause that the barcode can end up everywhere (or not) in the reverse read?

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

It's amplicon sequencing with custom barcodes.

I don't know how the barcode ends up in the reverse read.

One clarification I forgot to mention: the barcode in the reverse read is the reverse complement of XXXX.

What is the "normal" case? Should the barcode be found only in the forward read?

ADD REPLYlink modified 2.9 years ago • written 2.9 years ago by Picasa470

That depends on the library prep. How was the library created? When/how were the barcodes attached? Without proper understanding of the experimental procedure we can't get this right.

I assume this is about the same data as in "Confusion about barcodes and removal".

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

Yes this is the same dataset.

The protocol is based on:

https://www.ncbi.nlm.nih.gov/pubmed/20516186

ADD REPLYlink modified 2.9 years ago • written 2.9 years ago by Picasa470

In that protocol I found the following (page 4, figure 1):

Ligation is nondirectional and also produces molecules which have the same adapters attached to both ends (not depicted). Such molecules do not interfere with sequencing and—due to the formation of hairpin structures—amplify very poorly during indexing PCR.

So that explains why you have some fragments with barcodes on both sides. Essentially you should only have a barcode on one end. The question now is how frequently you see the barcode in the reverse read.

Based on your explanation your barcode is only 4 characters long, which means it can also be present in the read by chance. You therefore need to look for it in its expected context: the Illumina P7 sequence.
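
To put a rough number on "present by chance", here is a back-of-the-envelope sketch. It assumes uniformly random, independent bases and treats each start position as independent, which is only an approximation; the function names and the 100 bp read length are illustrative assumptions, not values from the data.

```python
def expected_chance_hits(barcode_len: int, read_len: int) -> float:
    """Expected number of chance occurrences of one fixed k-mer in a random read."""
    positions = max(read_len - barcode_len + 1, 0)
    return positions / 4 ** barcode_len


def prob_at_least_one_hit(barcode_len: int, read_len: int) -> float:
    """Approximate probability that a fixed k-mer occurs at least once by chance.

    Treats the start positions as independent trials, each matching
    with probability 4**-k (an approximation).
    """
    positions = max(read_len - barcode_len + 1, 0)
    p_match = 1 / 4 ** barcode_len
    return 1 - (1 - p_match) ** positions


if __name__ == "__main__":
    for k in (4, 7):
        print(k, prob_at_least_one_hit(k, 100))
```

Under these assumptions a 4-mer turns up by chance in roughly 30% of 100 bp reads, while a 7-mer does so in well under 1%, so the barcode length matters a lot for how to interpret internal matches.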

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

The XXXX was just an example to simplify. In fact, the barcode is 7 bp long.

If I grep for the reverse complement of the barcode in the reverse reads, I find it in 75695/118664 reads, which corresponds to 64%.

Should I perhaps keep the PE reads with:

  • No barcode in the reverse read
  • Barcode (reverse complemented) at the 5' end

and discard those with:

  • Barcode (reverse complemented) within the reverse read?

ADD REPLYlink written 2.9 years ago by Picasa470

Are the barcodes at the beginning of the read in your grep (if that is where they are supposed to be)? As @Wouter already said you should find the barcode only one time but it can be at either end.

ADD REPLYlink written 2.9 years ago by genomax71k

So there are 39066/118664 (33%) reverse reads that have the reverse-complemented barcode at their 5' end.

And 36629/118664 (31%) reverse reads have the reverse-complemented barcode somewhere else in the read.

So if I understand correctly, I should discard all the PE reads that have the reverse-complemented barcode (at the beginning or in the middle) in their reverse read?
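
Whichever rule ends up being right for this library, applying it to read pairs could look like the sketch below. The `filter_pairs` name, the category labels, and the in-memory `(forward, reverse)` tuples are assumptions for illustration; real data would come from a FASTQ parser, and matching here is exact only.

```python
from typing import Iterable, Iterator, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def filter_pairs(
    pairs: Iterable[Tuple[str, str]],
    barcode: str,
    keep: frozenset = frozenset({"none"}),
) -> Iterator[Tuple[str, str]]:
    """Yield (forward, reverse) pairs whose reverse read falls in a kept category.

    Categories: 'none' (no reverse-complemented barcode), 'five_prime'
    (reverse read starts with it), 'internal' (found further inside).
    """
    rc_barcode = barcode.upper().translate(_COMPLEMENT)[::-1]
    for forward, reverse in pairs:
        pos = reverse.upper().find(rc_barcode)
        if pos == -1:
            category = "none"
        elif pos == 0:
            category = "five_prime"
        else:
            category = "internal"
        if category in keep:
            yield forward, reverse
```

The default keeps only pairs with no barcode in the reverse read; passing `keep=frozenset({"none", "five_prime"})` would implement the alternative rule of dropping only internal matches.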

ADD REPLYlink modified 2.9 years ago • written 2.9 years ago by Picasa470

The adapters are ligated using blunt-end ligation, so it's not impossible that fragments end up with two barcodes. However, if I'm not mistaken, these shouldn't get sequenced since they contain the same adapter on both sides and therefore won't be amplified during bridge amplification. The barcode should always be at the P7 side of the amplicon, so I would suggest the OP look for that sequence.

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

I just noticed the protocol you shared doesn't use inline barcodes.

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

It was based on that paper but has been slightly modified.

ADD REPLYlink written 2.9 years ago by Picasa470

Then you might want to @#$%'ing consider telling us what you modified instead of having us guess what you have been doing. Really, provide this information upfront, because this is a waste of time. For the past hour this thread has only been about the experimental procedure and we haven't even started on the barcode processing. You made us look through protocols and now we find out that you modified the protocol, on a vital point apparently. This topic and the previous one are quite a pain in the elbow when it comes to understanding what your question really is about.

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

If it is using inline barcodes then that is not a light modification.

I think you have enough information already to find the right solution.

ADD REPLYlink written 2.9 years ago by genomax71k

For future reference, please do not post links to sites behind a paywall - not everyone has access. It's better to copy/paste the relevant information in your post.

ADD REPLYlink written 2.9 years ago by harold.smith.tarheel4.4k

For those who don't have access, here is a Dropbox link: https://www.dropbox.com/s/v3xrola70fhzwyk/meyer2010.pdf?dl=0 I, ehm, perfectly, ahum, legally obtained that file, cough, and share this totally anonymously.

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

And you expect us to click on a dropbox link for a file that is anonymously shared :)

ADD REPLYlink written 2.9 years ago by genomax71k

I do not necessarily "expect" that, I provide the opportunity. It's up to you to gamble on whether it will be safe or not ;) And there is always http://sci-hub.cc/ for those who want to obtain the paper the same way.

ADD REPLYlink written 2.9 years ago by WouterDeCoster40k

Thanks @WouterDeCoster, but I know how to access the reference. I was trying to encourage better behavior by the OP.

ADD REPLYlink written 2.9 years ago by harold.smith.tarheel4.4k