Sequencing adapter/index compatibility when mixing two library preparation kits
3 months ago
cwwong13

I am doing RNA-seq and would like to pool my library with those of some other lab members. The problem is that we are not using the same library preparation kit (and thus not the same indexes). Are there any compatibility concerns?

I have checked the indexes for these kits, but what should I look for? Here is my checklist:

1. Check whether any indexes are identical (using the index sequences given in both user manuals).
2. Check whether any indexes become identical after trimming one base from either end.
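The two checks above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the example sequences are made up rather than taken from either kit's manual.

```python
def trimmed_variants(index):
    """Return the two versions of an index with one base removed from either end."""
    return {index[1:], index[:-1]}


def find_conflicts(kit_a, kit_b):
    """Return (a, b, reason) for pairs that are identical or match after a 1-base trim."""
    conflicts = []
    for a in kit_a:
        for b in kit_b:
            if a == b:
                conflicts.append((a, b, "identical"))
            elif trimmed_variants(a) & trimmed_variants(b):
                conflicts.append((a, b, "identical after trimming one base"))
    return conflicts


# Made-up placeholder indexes, NOT real kit sequences.
kit_a = ["AACCGGTT", "GGGGAAAA"]
kit_b = ["ACCGGTTA", "GGGGAAAA", "ACGTACGT"]

conflicts = find_conflicts(kit_a, kit_b)
for a, b, reason in conflicts:
    print(a, b, reason)
```

In practice the two lists would be replaced by the full index sets copied from each kit's user manual.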

My question is: do I also need to check the complements of these indexes? And what about the reverses and reverse complements?

One of my kits (the KAPA kit) provides unique dual-indexed adapters: a P7 index sequence and P5 index sequences (1 and 2). My lab member, on the other hand, is using the NuGEN Ovation® SoLo RNA-Seq kit, which provides only one index sequence per well.

Which of these sequences should I use for the comparison?

My final question: I found that one of the "reversed" indexes from the SoLo kit, after trimming its 5' base, matches the first seven bases of one of the KAPA indexes I used.

SoLo original index: ACCATCCT
SoLo reversed index: TCCTACCA
SoLo reversed and trimmed: CCTACCA  # overlaps the first 7 bases of the KAPA index below
KAPA P7 Index sequence: CCTACCAT
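
For anyone who wants to reproduce this comparison, here is a minimal Python sketch that derives the reversed, complemented and reverse-complemented forms of the SoLo index and checks the overlap shown above. The helper names are hypothetical; only the two sequences come from this post.

```python
# Translation table for DNA complementing (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse(seq):
    return seq[::-1]


def complement(seq):
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    return complement(seq)[::-1]


solo = "ACCATCCT"     # SoLo original index (from the post)
kapa_p7 = "CCTACCAT"  # KAPA P7 index (from the post)

solo_reversed = reverse(solo)
solo_reversed_trimmed = solo_reversed[1:]  # drop the 5' base

print("reversed:          ", solo_reversed)
print("complement:        ", complement(solo))
print("reverse complement:", reverse_complement(solo))
print("overlap with KAPA: ", kapa_p7.startswith(solo_reversed_trimmed))
```

Note that the overlap only appears for the plain reversed sequence; whether that orientation is ever relevant to demultiplexing is a separate question.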


I know that the SoLo kit claims to have 8 random bases following the provided index. If a random T happens to follow CCTACCA, will that cause a problem? Or can the demultiplexing algorithm handle this by knowing that the index is actually TCCTACCA T rather than T CCTACCAT? Of course, this is not a problem if I do not need to compare an index with the reversed sequence of its counterpart.

3 months ago
GenoMax

You should only compare the i7 indexes. As long as they are non-overlapping, it would be fine to pool these. The data may need to be demultiplexed separately.

3 months ago

The rev-comps are irrelevant. People tend to run bcl2fastq allowing a mismatch of 1, in which case a difference of exactly one base between one of your indices and one of someone else's will be a problem unless they dial the allowed mismatches down to zero.
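
A quick way to screen for this before pooling is to compute the Hamming distance between every index in one set and every index in the other. The sketch below is plain Python and does not call bcl2fastq; the function names are hypothetical, the `max_distance=1` cutoff is an assumption based on the statement above, and the second index in `theirs` is made up to show a hit.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length indexes differ."""
    if len(a) != len(b):
        raise ValueError("indexes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def close_pairs(mine, theirs, max_distance=1):
    """Return (a, b, distance) for cross-set pairs within max_distance mismatches."""
    return [
        (a, b, hamming(a, b))
        for a, b in product(mine, theirs)
        if hamming(a, b) <= max_distance
    ]


mine = ["CCTACCAT"]                # KAPA P7 index from the question
theirs = ["ACCATCCT", "CCTACCAA"]  # SoLo index from the question + a made-up near-duplicate

flagged = close_pairs(mine, theirs)
print(flagged)
```

Raising `max_distance` gives a more conservative screen if the sequencing facility asks for a larger separation between indexes.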

May I ask what you mean by causing problems? How will bcl2fastq behave during demultiplexing? More specifically, will there be no reads at all for the two (or more) indexes in conflict (differing by one mismatch)? Or will the reads be randomly assigned to each sample? I am wondering whether we can still use 1 mismatch for the first run and dial down to zero if we find that some of the samples have no reads.

If there is one and only one letter difference between two barcodes, bcl2fastq will refuse to run when the allowable number of mismatches is 1. Honestly, this is something the people generating the FASTQs should know how to deal with. Some collaboration with the people running the instrument should have happened before the libraries were prepped, to make sure they had a plan for how to run samples from different kits. This shouldn't be up to the submitters, who really shouldn't be expected to know the nitty-gritty of how demultiplexing works.

I was asking this question because I am worried about ruining my labmate's library if this leads to a demultiplexing failure (in case bcl2fastq does not show a warning that would remind us to use mismatches = 0). Anyway, I have sent out the libraries... and hope it works.

Honestly, this should not be up to the submitter to figure out. It should be up to the people running the instrument. They should understand what people are planning to submit and understand the ramifications of mixing different barcodes. It should be up to them to tell one of you, "Your samples aren't compatible as you made them; one of you will be waiting for a later run," if that's what has to happen. In my group, we did have one barcode from one set that was incompatible with other barcodes, and we just told people who were submitting stuff pre-barcoded: "Don't use barcode #2 from this kit." We would never expect submitters to be responsible for understanding interactions with other people's barcodes (just their own).