Estimated insert size less than read pair sizes (180 bp for 100x100 PE), but SeqPrep merged 0 reads?
12.0 years ago
Ian Fiddes ▴ 70

I have some RNA-seq data that was generated by a one-stop-shop type of company: RNA was submitted, and analyzed data as well as FASTQ files were returned. I am trying to run the TopHat/Cufflinks analysis myself to compare. I used Bowtie and Picard tools to empirically determine the insert size (following http://vinaykmittal.blogspot.com/2012/02/how-to-estimate-insert-size-for-paired.html), and the result was a 180 bp mean for both samples. I used a basic mrna.fa file from UCSC as my reference for Bowtie.

However, the sequencing is 100x100 paired-end, so I figured that for some strange reason they did overlapping reads. So I ran SeqPrep to try to merge the reads, and it came up with exactly zero mergeable pairs. Does anyone know why this would be? The histogram generated by CollectInsertSizeMetrics.jar can be seen here: http://i.imgur.com/UxjoE.png
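
For reference, the overlap these numbers imply can be worked out directly. This is a minimal sketch, assuming the reported insert size is the full fragment length measured between the outer ends of the two mates; the function name `expected_overlap` is only illustrative.

```python
def expected_overlap(read_len_1, read_len_2, fragment_len):
    """Bases shared by the two mates, assuming fragment_len is the
    outer-end-to-outer-end fragment length (an assumption, see above)."""
    return max(0, read_len_1 + read_len_2 - fragment_len)


# 100x100 paired-end reads on a 180 bp fragment
print(expected_overlap(100, 100, 180))  # 20
```

So a 180 bp mean would correspond to only about 20 bp of overlap on average, and fragments longer than 200 bp would not overlap at all.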

rna-seq • 3.6k views
11.7 years ago
Rm 8.3k

Overlapping RNA-seq reads are pretty common. How is the QC for these reads? If the quality is very bad, you won't expect many overlaps, but I am surprised to see zero merged reads.

One test I suggest: take 100,000 read pairs, convert them to FASTA, and then use megablast to search the read 1 sequences against a database of the read 2 sequences. Then look at the distribution of the percentage of overlaps.
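
If megablast is not at hand, a similar check can be sketched in plain Python. Note that this swaps in a simple suffix/prefix comparison instead of megablast: for each pair it looks for the longest stretch where the end of read 1 matches the start of the reverse complement of read 2. All function names, the `min_overlap` and `max_mismatch_frac` defaults, and the file names in the usage note are assumptions for illustration. It also assumes plain four-line FASTQ records, mates in the same order in both files, and fragments at least as long as one read (no adapter read-through).

```python
import random
from collections import Counter
from itertools import islice

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_length(read1, read2, min_overlap=10, max_mismatch_frac=0.1):
    """Longest overlap between the 3' end of read1 and the start of the
    reverse complement of read2; 0 if none reaches min_overlap."""
    rc2 = reverse_complement(read2)
    for length in range(min(len(read1), len(rc2)), min_overlap - 1, -1):
        suffix = read1[-length:]
        prefix = rc2[:length]
        mismatches = sum(a != b for a, b in zip(suffix, prefix))
        if mismatches <= max_mismatch_frac * length:
            return length
    return 0


def read_fastq_sequences(path, limit):
    """Yield the sequence line of the first `limit` FASTQ records."""
    with open(path) as handle:
        for i, line in enumerate(islice(handle, 4 * limit)):
            if i % 4 == 1:
                yield line.strip()


def overlap_histogram(r1_path, r2_path, limit=100_000):
    """Count overlap lengths over the first `limit` read pairs."""
    pairs = zip(read_fastq_sequences(r1_path, limit),
                read_fastq_sequences(r2_path, limit))
    return Counter(overlap_length(r1, r2) for r1, r2 in pairs)


def simulate_pair(fragment_len, read_len, seed=0):
    """Make an error-free read pair from a random fragment (for testing)."""
    rng = random.Random(seed)
    fragment = "".join(rng.choice("ACGT") for _ in range(fragment_len))
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment[-read_len:])
    return read1, read2
```

Calling `overlap_histogram("sample_R1.fastq", "sample_R2.fastq")` (hypothetical file names) gives a count of pairs per overlap length, with key 0 meaning no overlap was found. If nearly everything lands at 0, the reads really do not overlap; if many pairs show overlaps around 20 bp, the problem is more likely on the merging side, for example a minimum-overlap or quality setting.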
