Is there a way to get the average insert size from an SRA file or FASTQ? Sorry for the uninformative question. I downloaded an SRA file from NCBI and used the included sratoolkit to split it into two FASTQ files. I am trying to do a de novo assembly using these paired-end, strand-specific reads, but a required parameter is the average insert size. Does anyone know how to obtain this from an SRA file or FASTQ?
Guessing an insert size, assembling, mapping the reads back to the assembly, and then iterating with the improved insert size estimate (taken from the mappings) is a reasonable choice, and probably about the best you can do. Hopefully you have a rough idea of the size from the library preparation method (the size selection criteria, or whether it is a jumping library).
In fact, Velvet does this automatically (from the 1.1 manual): "If the insert length of a library is unspecified, Velvet will attempt to measure it for you, based on the read-pairs which happen to map onto a common node." As they point out, it's critical to check the reported estimate to make sure it's sane.
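If your assembler does not estimate the insert size itself, the "map back and measure" step can be done by hand. Below is a minimal sketch, assuming you have already mapped the read pairs to a draft assembly with any aligner that writes SAM, and that the aligner sets the "properly paired" flag. The function name `estimate_insert_size` and the `max_tlen` cutoff are my own choices for illustration, not part of any tool. It reads the TLEN column (column 9 of SAM), which is the template length including both reads; check whether your assembler wants that or the inner distance between the reads.

```python
import statistics


def estimate_insert_size(sam_lines, max_tlen=10000):
    """Return (median, mean, stdev) of template lengths from SAM records.

    sam_lines: any iterable of SAM text lines (e.g. an open file handle).
    max_tlen:  hypothetical cutoff used to drop obviously wrong pairs.
    """
    tlens = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        # Keep only properly paired reads (0x2), count each pair once by
        # taking the first mate (0x40), and skip secondary (0x100) and
        # supplementary (0x800) alignments.
        if not flag & 0x2 or not flag & 0x40 or flag & 0x900:
            continue
        tlen = abs(int(fields[8]))        # TLEN is signed; use its magnitude
        if 0 < tlen <= max_tlen:
            tlens.append(tlen)
    if len(tlens) < 2:
        raise ValueError("not enough properly paired reads to estimate")
    return (statistics.median(tlens),
            statistics.mean(tlens),
            statistics.stdev(tlens))


# Usage (the file name is a placeholder):
# with open("reads_vs_draft.sam") as handle:
#     median, mean, stdev = estimate_insert_size(handle)
#     print(median, mean, stdev)
```

The median is usually the safer number to feed back into the assembler, since a few chimeric or mis-mapped pairs can pull the mean around. As with Velvet's own estimate, look at the result and make sure it is plausible for your library before trusting it.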
I'm going to suggest a lazy, imperfect solution. If this is Illumina data (Genome Analyzer, HiSeq, etc.), then the insert size is normally about 300 bp. If your assembler isn't too sensitive to that parameter, try 300 bp as a reasonable guess.