I work with metatranscriptomics data (sequenced using Illumina technology). I did a de novo transcriptome assembly with the SOAPdenovo-Trans assembler, and I am now looking for a tool that reports the total number of overlapping reads that were assembled into each contig, so that I can judge how good or bad the coverage is.

Any suggestions would be helpful.

Thank you in advance.

If you have your contigs and reads in sorted BED format (e.g., myContigs.bed and myReads.bed) and you know your overlap criteria, then you could use BEDOPS bedmap to answer that question:

$ bedmap --echo --count <overlap-options> myContigs.bed myReads.bed > myContigsWithCountOfOverlappingReads.bed

If you leave out <overlap-options>, then the default overlap between read and contig is one base. Otherwise, you can specify the number of bases of overlap between files with --bp-ovr, or require a fraction of contig or read length with --fraction-ref and --fraction-map, respectively. Other overlap options are also available. This is discussed more fully in the BEDOPS documentation.
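To make the overlap criterion concrete, here is a minimal Python sketch of the kind of count that --count with a base-pair threshold gives you. It is only an illustration of the idea, not how bedmap is implemented (bedmap relies on sorted input and is far more efficient). The function names and the toy intervals are made up for this example, and the intervals are assumed to be BED-style (0-based, half-open).

```python
from typing import List, Tuple

# (name, start, end): BED-style, 0-based, half-open coordinates
Interval = Tuple[str, int, int]


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if on different sequences)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def count_overlapping_reads(
    contigs: List[Interval], reads: List[Interval], min_bp: int = 1
) -> List[int]:
    """For each contig, count reads that overlap it by at least min_bp bases.

    Naive all-against-all comparison; fine for a toy example only.
    """
    return [
        sum(1 for read in reads if overlap_bp(contig, read) >= min_bp)
        for contig in contigs
    ]


# Hypothetical toy data, for illustration only
contigs = [("contig1", 0, 100), ("contig2", 0, 50)]
reads = [
    ("contig1", 0, 30),     # 30 bases inside contig1
    ("contig1", 90, 120),   # 10 bases inside contig1
    ("contig1", 100, 130),  # starts where contig1 ends: no overlap
    ("contig2", 10, 40),    # 30 bases inside contig2
]

print(count_overlapping_reads(contigs, reads))             # one-base default
print(count_overlapping_reads(contigs, reads, min_bp=20))  # stricter threshold
```

Raising min_bp plays the same role as asking for more bases of overlap: reads that only touch the edge of a contig stop being counted.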

Thank you for the comment, Alex.

The problem is that my data is in FASTA format. Is there a way to convert a FASTA file into a format that BEDOPS can work with, such as BAM/SAM? Would that work?

It will depend on your FASTA file and whether it already contains coordinate and chromosome information (in the header, for instance). If not, you'll need to align your sequences to a reference genome to produce BAM, SAM or PSL, and then convert that into BED with a conversion script (such as those in BEDOPS).
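If the headers did carry coordinates, pulling them out into BED lines could look roughly like the sketch below. The header layout (">readID chr:start-end") is purely an assumption for illustration, as is the function name; real FASTA headers vary a lot, and the sketch also assumes the coordinates are already 0-based and half-open as BED expects.

```python
import re
from typing import Iterable, Iterator

# Assumed (hypothetical) header layout: ">read1 chr1:100-200"
COORD = re.compile(r"(\S+):(\d+)-(\d+)")


def fasta_headers_to_bed(lines: Iterable[str]) -> Iterator[str]:
    """Yield one tab-separated BED line per header that carries coordinates.

    Sequence lines and headers without a chr:start-end field are skipped.
    """
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line
        header = line[1:].strip()
        match = COORD.search(header)
        if match is None:
            continue  # header has an ID only, nothing to convert
        chrom = match.group(1)
        start, end = int(match.group(2)), int(match.group(3))
        name = header.split()[0]
        yield f"{chrom}\t{start}\t{end}\t{name}"


# Hypothetical toy input, for illustration only
fasta = [">read1 chr1:100-200", "ACGT", ">read2", "GGCC"]
bed_lines = list(fasta_headers_to_bed(fasta))
print(bed_lines)
```

A header with only a sequence ID (like read2 here) produces no BED line, which is the case where alignment is needed first.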

No, the header line contains only the sequence ID, and for most of these organisms no reference genome exists yet, so I guess I can't do that. That is the very reason I did a de novo transcriptome assembly.

