Question: Demultiplexing RNA-seq libraries without Illumina indexes on NovaSeq
linjc.xmu10 wrote:

Dear all, I made a set of RNA-seq libraries without an embedded Illumina index, but with an inner barcode right after the read 1 sequence. They have now been sequenced on a NovaSeq together with other libraries. Where should I look for my data: in the Undetermined reads, or in the reads assigned to the GGGGGG index? Thanks a lot.

modified 5 weeks ago by Devon Ryan85k • written 5 weeks ago by linjc.xmu10

So where exactly is your barcode?

Like this?

5'--|--Adapter--|--Barcode--|--RNAseq/cDNA--|--Adapter--|--3'

If so, how long are the cDNA fragments, and what was the read length of your run?

written 5 weeks ago by ATpoint7.9k

My humble recommendation, after several long, unresolved discussions with sequencing people, is to search for your internal barcode in both "Undetermined" and G8. For 8-base barcodes at a precisely known location, the error probability is low.

written 5 weeks ago by jomo018410
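As an illustration of the barcode search suggested above, here is a minimal Python sketch that bins read 1 records by an inline barcode at a fixed position and trims the barcode off. The barcode sequences, sample names, `offset` and `max_mismatch` values are hypothetical placeholders, not values from this thread, and established demultiplexing tools may well be a better choice for real data.

```python
from itertools import islice


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def read_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        rec = list(islice(it, 4))  # one FASTQ record is 4 lines
        if len(rec) < 4:
            return
        header, seq, _, qual = (s.rstrip("\n") for s in rec)
        yield header, seq, qual


def assign_barcode(seq, barcodes, offset=0, max_mismatch=1):
    """Return the sample whose barcode uniquely matches seq at offset, else None."""
    hits = []
    for name, bc in barcodes.items():
        window = seq[offset:offset + len(bc)]
        if len(window) == len(bc) and hamming(window, bc) <= max_mismatch:
            hits.append(name)
    # Ambiguous reads (more than one match) are left unassigned.
    return hits[0] if len(hits) == 1 else None


def demultiplex(lines, barcodes, offset=0, max_mismatch=1):
    """Group records by sample; the barcode is trimmed from assigned reads."""
    bins = {}
    for header, seq, qual in read_fastq(lines):
        sample = assign_barcode(seq, barcodes, offset, max_mismatch)
        if sample is not None:
            n = offset + len(barcodes[sample])
            seq, qual = seq[n:], qual[n:]
        bins.setdefault(sample or "unassigned", []).append((header, seq, qual))
    return bins
```

For gzipped files, the lines could come from `gzip.open(path, "rt")`. The same function can be run on both the Undetermined and the G8 read 1 files; for paired-end data, read 2 would have to be kept in sync with the read 1 assignment, which this sketch does not do.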

Yes, the barcode location in your diagram is right. My insert size is ~180-375 bp and the run was PE150. The sequencing facility sent me the G8 data split by my barcode, but the unique mapping rate is low (~47%) and the multi-mapping rate is ~50%. I usually get 90% unique mapping for Arabidopsis samples on the HiSeq 2500. So I am now splitting the Undetermined data as well, as Devon said. I am not sure which set should be used. Or should I merge both?

written 5 weeks ago by linjc.xmu10
Devon Ryan85k wrote:

If the samples lacked standard Illumina barcodes, they'll mostly be in Undetermined.

written 5 weeks ago by Devon Ryan85k

Thanks. The sequencing company said the NovaSeq naturally generates a GGGGGG index file (reads). What is this?

written 5 weeks ago by linjc.xmu10

On machines with 2-color chemistry (NextSeq and NovaSeq), a G is the absence of signal, so a cluster with no index signal is called as all Gs. Unless you put that all-G index in your sample sheet (a terrible idea), you'll see those reads in Undetermined.

written 5 weeks ago by Devon Ryan85k

Thanks. Do you mean it's better to take the data from Undetermined?

written 5 weeks ago by linjc.xmu10

I would say there shouldn't be a G8 "sample" to begin with, unless the NovaSeq produces that by default for some reason (it's about the only Illumina machine we don't have, so I can't check). The thing with unbarcoded samples is that signal from neighboring clusters has a way of bleeding over into them during the index reads, so their index reads will often not be pure no-signal (all G in 2-color chemistry).

written 5 weeks ago by Devon Ryan85k
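To make the bleed-over point concrete, here is a small Python sketch that flags reads whose index is mostly, but not necessarily entirely, G. It assumes the FASTQ headers follow the usual Illumina layout in which the index sequence is the last colon-separated field of the comment (for example `1:N:0:GGGGGGGG`); the `min_g_fraction` threshold of 0.75 is an arbitrary illustrative value, not a recommendation.

```python
def index_from_header(header):
    """Return the index field of an Illumina-style FASTQ header, or "" if absent."""
    # Assumed layout: "@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index"
    comment = header.split(" ", 1)[1] if " " in header else ""
    return comment.rsplit(":", 1)[-1] if ":" in comment else ""


def g_fraction(index):
    """Fraction of index bases that are G (separators such as '+' are ignored)."""
    bases = [b for b in index.upper() if b in "ACGTN"]
    return bases.count("G") / len(bases) if bases else 0.0


def looks_unindexed(header, min_g_fraction=0.75):
    """True if the index read is mostly G, i.e. mostly 'no signal' in 2-color chemistry."""
    return g_fraction(index_from_header(header)) >= min_g_fraction
```

Counting how many Undetermined reads pass such a filter, compared with how many carry the inline barcode at all, would give a rough idea of how much data a strict all-G match leaves behind.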