Question: How to find out the insert size in RNA seq data
0
2.9 years ago by
rj.rezwan0
rj.rezwan0 wrote:

Hi,

Could you please tell me whether the value circled in red in the picture below can be taken as the insert size of our RNA-seq data? [1]: https://ibb.co/jf4keF

rna-seq • 4.3k views

As Chris suggested, you need to align your reads to get a SAM/BAM file and then run 'samtools stats' to get information about insert sizes. Its output also includes the insert size distribution, which can be plotted as a graph.

modified 2.8 years ago • written 2.9 years ago by prasundutta87360
4
2.9 years ago by
ivivek_ngs4.9k
Seattle,WA, USA
ivivek_ngs4.9k wrote:

The easiest way to find it programmatically is to convert the SRA file to FASTQ on the fly (if you are unable to get the insert size from their documentation) and proceed with alignment to produce a BAM file. After that, just run bamtools with the command below, and it will report the average and median insert size. Good luck!

bamtools stats -i foo.bam -insert
1

The input argument has changed: it is now -in instead of -i.

Here is an example output:

$ bamtools stats -in aaaaaa.bam -insert

**********************************************
Stats for BAM file(s):
**********************************************

Total reads:       1367054
Mapped reads:      1367054      (100%)
Forward strand:    683527       (50%)
Reverse strand:    683527       (50%)
Failed QC:         0    (0%)
Duplicates:        0    (0%)
Paired-end reads:  1367054      (100%)
'Proper-pairs':    1367054      (100%)
Both pairs mapped: 1367054      (100%)
Read 1:            683527
Read 2:            683527
Singletons:        0    (0%)
Average insert size (absolute value): 104.995
Median insert size (absolute value): 80

$ bamtools --version

bamtools 2.2.2
Part of BamTools API and toolkit
Primary authors: Derek Barnett, Erik Garrison, Michael Stromberg
(c) 2009-2012 Marth Lab, Biology Dept., Boston College
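
If the two insert size values are needed in a script, the report above can be parsed directly. The following is only a minimal sketch in Python: the function name `parse_bamtools_insert_stats` is hypothetical, and it assumes the two lines are formatted exactly as in the example output shown here (other bamtools versions may print them differently).

```python
import re

def parse_bamtools_insert_stats(text):
    """Return (average, median) insert size from `bamtools stats -insert` text.

    Assumes lines shaped like the example output in this thread, e.g.
    "Average insert size (absolute value): 104.995".
    """
    pattern = re.compile(
        r"\s*(Average|Median) insert size \(absolute value\):\s*([\d.]+)"
    )
    values = {}
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            values[match.group(1).lower()] = float(match.group(2))
    # Missing values come back as None rather than raising.
    return values.get("average"), values.get("median")

# The last three lines of the report quoted above.
sample = """Singletons:        0    (0%)
Average insert size (absolute value): 104.995
Median insert size (absolute value): 80
"""
print(parse_bamtools_insert_stats(sample))
```

The text passed in could be captured from the command itself, for example with `subprocess.run([...], capture_output=True, text=True).stdout`.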
written 12 months ago by wm470

Yes, I should have updated it, thanks for putting it in the thread.

written 12 months ago by ivivek_ngs4.9k
1
2.9 years ago by
Chris Fields2.1k
University of Illinois Urbana-Champaign
Chris Fields2.1k wrote:

No, that's the run ID; see the Wikipedia article. If you have paired-end reads, you'll need to align them as pairs for an insert size to be reported.
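
To illustrate why a paired alignment is needed: in SAM output, the value that tools summarize as insert size is the template length (TLEN, column 9), which the aligner only fills in when both mates are aligned together. Below is a rough Python sketch that collects TLEN from properly paired reads. The function name and the SAM records are hypothetical, made up for illustration; real input could be the text output of `samtools view -h foo.bam`.

```python
from statistics import median

def insert_sizes_from_sam(lines):
    """Collect template lengths (TLEN, SAM column 9) from properly paired reads."""
    sizes = []
    for line in lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, tlen = int(fields[1]), int(fields[8])
        # 0x2 = read mapped in a proper pair. Keep only the mate with a
        # positive TLEN so that each fragment is counted once.
        if flag & 0x2 and tlen > 0:
            sizes.append(tlen)
    return sizes

# Hypothetical records (made-up read names, coordinates, and sequences).
sam_lines = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t4M\t=\t226\t130\tACGT\tFFFF",
    "r1\t147\tchr1\t226\t60\t4M\t=\t100\t-130\tACGT\tFFFF",
    "r2\t99\tchr1\t300\t60\t4M\t=\t466\t170\tACGT\tFFFF",
    "r2\t147\tchr1\t466\t60\t4M\t=\t300\t-170\tACGT\tFFFF",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",  # unmapped read, no TLEN
]
sizes = insert_sizes_from_sam(sam_lines)
print(sizes, median(sizes))
```

One caveat for RNA-seq: when reads are aligned to a genome with a splice-aware aligner, TLEN is measured in genomic coordinates, so pairs that span an intron can report a value much larger than the actual fragment.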

1
2.9 years ago by
grant.hovhannisyan2.0k wrote:

If the data come from the SRA database, you can look up that information there. If it is newly sequenced data, then the easiest option may be to ask the sequencing provider.

0
2.3 years ago by
Michi950
Barcelona
Michi950 wrote:

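samtools also has a command very similar to the bamtools one:

samtools stats foo.bam

The insert size average and standard deviation appear in the summary (SN) lines of the output. Note that --insert-size expects an integer (the maximum insert size to consider), so it is not needed just to get the report and cannot be followed directly by the file name.

You can additionally speed up the process by using --threads 10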
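
The `samtools stats` report is tab-separated, and its summary lines start with `SN`, so the insert size values are easy to pull out in a script. Here is a small Python sketch under that assumption; the function name `parse_samtools_insert_stats` is hypothetical, and the sample lines are made-up values in the SN layout, not real results.

```python
def parse_samtools_insert_stats(text):
    """Extract insert size summary values from `samtools stats` SN lines."""
    result = {}
    for line in text.splitlines():
        if not line.startswith("SN\t"):
            continue
        fields = line.split("\t")
        key = fields[1].rstrip(":")  # e.g. "insert size average"
        if key.startswith("insert size"):
            result[key] = float(fields[2])
    return result

# Made-up sample lines, for illustration only.
sample = (
    "SN\traw total sequences:\t1367054\n"
    "SN\tinsert size average:\t105.0\n"
    "SN\tinsert size standard deviation:\t60.2\n"
)
print(parse_samtools_insert_stats(sample))
```

The same report also contains lines starting with `IS`, which hold the insert size histogram and could be parsed the same way if the full distribution is needed.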
