Demultiplexing RNA-seq libraries without Illumina indexes on NovaSeq
5.6 years ago
linjc.xmu ▴ 30

Dear all, I made a set of RNA-seq libraries without an embedded Illumina index, but with an inner barcode located right after the read 1 sequence. They have now been sequenced on a NovaSeq together with other libraries. Where can I find my data? In the Undetermined data, or in the data marked with the GGGGGG index? Thanks a lot.

sequencing • 2.5k views

So where exactly is your barcode?

Like this?:

5'--|--Adapter--|--Barcode--|--RNAseq/cDNA--|--Adapter--|--3'

If so, how long are the cDNA fragments, and what was the read length of your run?


My humble recommendation, after several long, unresolved discussions with sequencing staff, is to search for your internal barcode in both "Undetermined" and G8. For 8-base barcodes at a precisely known location, the error probability is low.
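
As a rough illustration of that search, here is a minimal Python sketch that bins reads by a barcode expected at the very start of read 1 and trims it off. The barcode sequences, sample names, and the one-mismatch tolerance are made-up assumptions for illustration, not values from this thread; a dedicated demultiplexing tool would normally be a better choice for a full NovaSeq run.

```python
from collections import defaultdict


def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines."""
    it = iter(lines)
    for header in it:
        seq = next(it).rstrip("\n")
        next(it)  # skip the '+' separator line
        qual = next(it).rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(seq, barcodes, max_mismatches=1):
    """Return the sample whose barcode matches the start of seq.

    Returns None when no barcode matches or when more than one does.
    """
    hits = []
    for name, bc in barcodes.items():
        prefix = seq[:len(bc)]
        if len(prefix) == len(bc) and hamming(prefix, bc) <= max_mismatches:
            hits.append(name)
    return hits[0] if len(hits) == 1 else None


def demultiplex(lines, barcodes, max_mismatches=1):
    """Group reads by inline barcode and trim the barcode from each read."""
    bins = defaultdict(list)
    for header, seq, qual in parse_fastq(lines):
        sample = assign_sample(seq, barcodes, max_mismatches)
        if sample is None:
            bins["unassigned"].append((header, seq, qual))
        else:
            n = len(barcodes[sample])
            bins[sample].append((header, seq[n:], qual[n:]))
    return bins


# Hypothetical barcodes and reads, for illustration only.
example_barcodes = {"s1": "ACGTACGT", "s2": "TTGGCCAA"}
example_reads = [
    "@r1\n", "ACGTACGTAAAA\n", "+\n", "IIIIIIIIIIII\n",
    "@r2\n", "TTGGCCATCCCC\n", "+\n", "IIIIIIIIIIII\n",
    "@r3\n", "GGGGGGGGGGGG\n", "+\n", "IIIIIIIIIIII\n",
]
example_bins = demultiplex(example_reads, example_barcodes)
```

For real data you would pass an open file handle instead of a list, for example `gzip.open(path, "rt")`, once for the Undetermined read 1 file and once for the G8 file, and then compare how many reads each source contributes per barcode.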


Yes, the barcode location is right. My insert size is ~180-375 bp, and the run was PE150. The sequencing facility sent me the G8 data split by my barcode, but the unique mapping rate is low (~47%) and the multi-mapping rate is ~50%. I usually get 90% unique mapping for Arabidopsis samples on the HiSeq 2500. So I am demultiplexing again, this time from the Undetermined data, as Devon said. I am not sure which set should be used. Or should I merge both?

5.6 years ago

If the samples lacked standard Illumina barcodes, they'll mostly be in Undetermined.


Thanks. The sequencing company said the NovaSeq naturally generates a GGGGGG index file (reads). What is this?


Machines with 2-color chemistry (NextSeq and NovaSeq) call a G when they see no signal, so a library without an index can come out with an all-G index read. Unless you put that index in your sample sheet (terrible idea), you'll see those reads in Undetermined.


Thanks. Do you mean it's better to get the data from Undetermined?


I would say there shouldn't be a G8 "sample" to begin with, unless the NovaSeq produces that by default for some reason (it's about the only Illumina machine we don't have, so I can't check). The thing with unbarcoded samples is that signal from neighboring clusters tends to bleed over into them during the index reads, so their index reads will often not be pure no-signal (all G in 2-color chemistry).
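
One way to see how much of this bleed-over is present is to tally the index strings recorded in the FASTQ headers. The sketch below is a rough Python illustration that assumes CASAVA 1.8+ style headers, where the index is the text after the last colon; the `min_g_fraction` cutoff of 0.75 and the example headers are arbitrary assumptions, not recommended values.

```python
from collections import Counter


def index_from_header(header):
    """Return the index string: the text after the last ':' in the header."""
    return header.strip().rsplit(":", 1)[-1]


def g_fraction(index):
    """Fraction of G bases in an index, ignoring the '+' of dual indexes."""
    bases = index.replace("+", "")
    return bases.count("G") / len(bases) if bases else 0.0


def summarize_indexes(headers, min_g_fraction=0.75):
    """Count reads whose index is pure G, mostly G, or anything else."""
    counts = Counter()
    for header in headers:
        frac = g_fraction(index_from_header(header))
        if frac == 1.0:
            counts["pure_G"] += 1
        elif frac >= min_g_fraction:
            counts["mostly_G"] += 1  # likely no-signal plus some bleed-over
        else:
            counts["other"] += 1
    return counts


# Hypothetical headers, for illustration only.
example_headers = [
    "@A00001:1:FC:1:1101:1000:2000 1:N:0:GGGGGGGG",
    "@A00001:1:FC:1:1101:1000:2001 1:N:0:GGGAGGGG",
    "@A00001:1:FC:1:1101:1000:2002 1:N:0:ACGTACGT",
]
example_summary = summarize_indexes(example_headers)
```

If a large share of the reads carrying your inline barcode fall in the "mostly_G" group rather than "pure_G", that would be consistent with the bleed-over described above, and with the advice to look in Undetermined rather than only in a G8 bin.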
