Question

NextSeq/Neoprep Stranded RNA-Seq Count Issue

7.5 years ago

bruce.moran ▴ 960

Hi,

I have a few NextSeq projects back, all with libraries made on the Neoprep by the same technician in the same lab. I used my standard pipeline, which trims adapters and low-quality bases (BBDuk), aligns with STAR and counts with featureCounts. I set s=2 (featureCounts -s 2, 'inversely stranded') because that is what our Illumina libraries have been so far. I used the same setting for the new Neoprep/NextSeq data, and I checked with Illumina tech support after running into the issue below.

My problem: in one project only ~5-10% of total fragments are counted with s=2. However, when I switch to s=1, ~40-60% are counted. Has anyone seen this behaviour before? Any idea what might cause it? Typically, it is the most time-sensitive study, with the most precious samples. The samples are fresh-frozen tumour tissue, so not great quality; the other projects are cell-line and PDX models, so a mix of perfect and very good quality, and their data are well behaved. All libraries have very low rRNA, and more than 80% of reads align to the transcriptome.
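To make the comparison between strandedness settings concrete, here is a minimal sketch of how the assigned percentage could be computed from a featureCounts `.summary` file (tab-separated, one `Status` header row, then one row per category). The function name `assigned_fraction` and the example numbers are hypothetical and for illustration only; they are not taken from the runs described here.

```python
def assigned_fraction(summary_text):
    """Return the fraction of fragments reported as Assigned.

    Assumes the featureCounts .summary layout: a 'Status' header row,
    then 'category<TAB>count' rows. Only the first sample column is read.
    """
    counts = {}
    for line in summary_text.strip().splitlines()[1:]:  # skip the header row
        status, value = line.split("\t")[:2]
        counts[status] = int(value)
    total = sum(counts.values())
    return counts.get("Assigned", 0) / total if total else 0.0


# Made-up numbers for illustration only (not from the data in this post).
example_s2 = (
    "Status\tsample.bam\n"
    "Assigned\t80\n"
    "Unassigned_NoFeatures\t900\n"
    "Unassigned_Ambiguity\t20\n"
)
example_s1 = (
    "Status\tsample.bam\n"
    "Assigned\t500\n"
    "Unassigned_NoFeatures\t450\n"
    "Unassigned_Ambiguity\t50\n"
)

for label, text in (("s=2", example_s2), ("s=1", example_s1)):
    print(f"{label}: {assigned_fraction(text):.1%} assigned")
```

Looking at which unassigned category absorbs the difference between the two settings (for example `Unassigned_NoFeatures`) may help narrow down where the fragments are going.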

Some basic run stats for a single, indicative BAM: flag 83 = 6,392,271 reads and flag 99 = 7,835,476 reads, so reads are not unevenly distributed to one strand (as far as I understand the flags; open to correction).
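For reference, the two flags can be decoded bit by bit using the FLAG definitions in the SAM specification. The sketch below is only an illustration; the helper name `decode_flag` is made up here. Both flags describe read 1 of a properly paired fragment: 83 has read 1 on the reverse strand of the reference, 99 has read 1 on the forward strand. Note that this is orientation relative to the reference, not relative to the annotated gene, so a balanced 83/99 split on its own does not show which strandedness setting is correct.

```python
# FLAG bits as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


for flag in (83, 99):
    print(flag, decode_flag(flag))

# Share of read-1 alignments on the reverse strand, using the counts quoted above.
n_83, n_99 = 6_392_271, 7_835_476
reverse_share = n_83 / (n_83 + n_99)
print(f"flag 83 share: {reverse_share:.1%}")
```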

Summary of count matrix ('forward-stranded' -> s=1):

summary(forward_stranded)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
     7,041  6,980,000 11,800,000 13,690,000 19,360,000 38,700,000

summary(inverse_stranded)
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
       842  1,183,000  1,717,000  2,085,000  2,995,000  5,492,000
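As a quick arithmetic check on the two summaries, the sketch below divides each forward-stranded statistic by its inverse-stranded counterpart. The dictionary keys are just labels chosen here; the values are the ones quoted above.

```python
# Summary statistics copied from the two summary() outputs above.
forward = {
    "min": 7_041, "q1": 6_980_000, "median": 11_800_000,
    "mean": 13_690_000, "q3": 19_360_000, "max": 38_700_000,
}
inverse = {
    "min": 842, "q1": 1_183_000, "median": 1_717_000,
    "mean": 2_085_000, "q3": 2_995_000, "max": 5_492_000,
}

# Fold difference between the s=1 and s=2 counts for each statistic.
ratios = {name: forward[name] / inverse[name] for name in forward}

for name, ratio in ratios.items():
    print(f"{name:>6}: {ratio:.2f}x")
```

Every statistic comes out roughly 6 to 8 times higher with s=1, which is in line with the ~5-10% versus ~40-60% counted fractions.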

Neoprep RNA-Seq library counts • 1.4k views
