Question: Estimated insert size less than read pair sizes (180 bp for 100x100 PE), but SeqPrep merged 0 reads?
Ian Fiddes (Santa Cruz) wrote 8.4 years ago:

I have some RNA-seq data from a one-stop-shop type company: RNA was submitted, and analyzed data as well as FASTQ files were returned. I am trying to run the TopHat/Cufflinks analysis myself to compare. I used Bowtie and Picard tools to empirically determine the insert size, and the result was a mean of 180 bp for both samples. I used a basic mrna.fa file from UCSC as my reference for Bowtie.

However, the sequencing is 100x100 paired end, so I figured for some strange reason they did overlapping reads. So I ran the program SeqPrep to try and merge the reads, and it came up with exactly zero mergeable pairs. Does anyone know why this would be? The histogram generated by CollectInsertSizeMetrics.jar can be seen here:
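
The overlap implied by those numbers can be worked out with a little arithmetic. This is only a sketch, and it assumes the insert size reported by Picard is the full fragment length (both reads included); `expected_overlap` is a hypothetical helper name, not part of Picard or SeqPrep.

```python
def expected_overlap(read_len_1, read_len_2, fragment_len):
    """Return how many bases both mates would cover, or 0 if they cannot overlap.

    Assumes fragment_len is the full fragment length, reads included.
    """
    return max(0, read_len_1 + read_len_2 - fragment_len)


# 100x100 paired-end reads on a fragment of the reported mean size.
overlap = expected_overlap(100, 100, 180)
print(overlap)
```

Under that assumption a pair at the mean would share about 20 bases. Since 180 bp is only a mean, pairs from longer fragments would share fewer bases or none, which may be worth checking against the merger's minimum-overlap setting.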

rna-seq • 3.0k views
Rm (Danville, PA) wrote 8.1 years ago:

Overlapping RNA-seq reads are pretty common. How does the QC look for these reads? If the quality is very bad, you won't expect many overlaps, but I am surprised to see zero merged reads.

One test I suggest: take 100,000 read pairs, convert them to FASTA, and then use megablast to search the read 1 sequences against a database of the read 2 sequences. Then look at the distribution of the percentage of overlaps.
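
The conversion step of that test could look roughly like the following. It is a minimal sketch that assumes plain (uncompressed) FASTQ with exactly four lines per record, and it takes the first reads in the file rather than a random sample; `fastq_to_fasta` is a hypothetical helper name.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle, max_reads=100_000):
    """Copy up to max_reads records from a 4-line FASTQ stream to FASTA.

    Returns the number of records written.
    """
    written = 0
    while written < max_reads:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        seq = fastq_handle.readline().rstrip("\n")
        fastq_handle.readline()  # '+' separator line, ignored
        fastq_handle.readline()  # quality line, ignored
        # Replace the leading '@' of the FASTQ header with '>' for FASTA.
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n" + seq + "\n")
        written += 1
    return written


# Tiny in-memory demonstration with two made-up records.
demo_fastq = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCC\n+\nIIII\n")
demo_fasta = io.StringIO()
n_written = fastq_to_fasta(demo_fastq, demo_fasta, max_reads=1)
print(n_written)
print(demo_fasta.getvalue(), end="")
```

Run once on the read 1 file and once on the read 2 file (opened with `open(...)` instead of `io.StringIO`); the two FASTA files can then serve as the query and the database for the megablast search.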
