Question: How to calculate the coverage of each contig after assembly
7 weeks ago, singh.jyotika wrote:

I have the read file and assembled contigs (both in .fasta format). I need to calculate the coverage for each single contig across the read file. I tried (1) indexing with bowtie and (2) aligning with both bowtie-align and tophat2, but it gives me the error "Splice sequence indexing failed with err =1". Kindly help me with how to proceed with the coverage calculation for each contig.


Hi Jyotika,

[1] I have the read file and assembled contigs (both in .fasta format)

Reads in fasta? Are you sure they are not fastq?

[2] I need to calculate the coverage for each single contig across the read file

This does not make sense. What do you mean by "across the read file"? Please elaborate.

[3] Splice sequence indexing failed with err =1

You need to tell us the exact commands you ran.

Thanks

Vijay

written 7 weeks ago by lakhujanivijay

The command that I ran was: tophat2 -r 20 454.10species.fasta MC55.MG10.AS1.C1.fasta

Mine is metagenome data. I need to get the coverage/depth of every single contig in my .fasta file. The file MC55.MG10.AS1.C1.fasta contains only one sequence; likewise, I have 10000 contigs for which I need the coverage/depth. 454.10species.fasta is my metagenome file obtained after sequencing.

written 7 weeks ago by singh.jyotika

I am not sure why you are using tophat for this analysis. It may be simpler to use bwa or bbmap (from the BBMap suite) to align your reads against the assembly. You would probably want to place multi-mapping reads at a single random location. Then follow that up with mosdepth (to get single-base-level coverage) or samtools idxstats to get read counts per contig.
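As a rough illustration of the last step, the tab-separated output of samtools idxstats (contig name, contig length, mapped reads, unmapped reads) can be turned into an approximate mean depth per contig as mapped reads × mean read length / contig length. The sketch below is only an estimate under stated assumptions: the contig names, the counts, and the mean read length of 400 bases are made-up example values, not numbers from this thread.

```python
def parse_idxstats(text):
    """Parse `samtools idxstats` output into {contig: (length, mapped_reads)}."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # The final "*" row only counts reads with no reference assigned.
            continue
        stats[name] = (int(length), int(mapped))
    return stats


def estimate_depth(stats, mean_read_length):
    """Approximate mean depth: mapped reads * mean read length / contig length."""
    return {
        name: (mapped * mean_read_length / length if length else 0.0)
        for name, (length, mapped) in stats.items()
    }


# Hypothetical idxstats output for two contigs (values invented for illustration).
example = "contig_1\t1000\t50\t0\ncontig_2\t2000\t10\t0\n*\t0\t0\t5\n"

# Assumed mean read length of 400 bases; replace with the value for your reads.
depths = estimate_depth(parse_idxstats(example), mean_read_length=400)
for name, depth in depths.items():
    print(f"{name}\t{depth:.1f}")
```

Because this uses read counts rather than aligned bases, it ignores clipping and partial alignments; a per-base tool such as mosdepth would give a more exact figure.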

Note: Having a single fasta sequence per file is going to make this ridiculously clumsy. Consider concatenating original reads in a single multi-fasta file. If you have original fastq format reads available then I would rather use those.
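The concatenation suggested in the note can be done with a few lines of Python. This is a minimal sketch that assumes plain-text, uncompressed FASTA files; the file names (contig_1.fasta, all_contigs.fasta) are hypothetical and the demo runs in a temporary directory so it does not touch real data.

```python
import glob
import os
import tempfile


def concatenate_fasta(input_paths, output_path):
    """Copy every record from input_paths into one multi-FASTA file.

    Returns the number of FASTA headers (records) written.
    """
    n_records = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as handle:
                last = "\n"
                for line in handle:
                    if line.startswith(">"):
                        n_records += 1
                    out.write(line)
                    last = line
                # Make sure the next file's header starts on its own line.
                if not last.endswith("\n"):
                    out.write("\n")
    return n_records


# Small self-contained demo with two made-up single-sequence files.
with tempfile.TemporaryDirectory() as tmp:
    for i, seq in enumerate(["ACGT", "GGCC"], start=1):
        with open(os.path.join(tmp, f"contig_{i}.fasta"), "w") as fh:
            fh.write(f">contig_{i}\n{seq}")  # deliberately no trailing newline
    inputs = sorted(glob.glob(os.path.join(tmp, "contig_*.fasta")))
    merged = os.path.join(tmp, "all_contigs.fasta")
    n_records = concatenate_fasta(inputs, merged)
    with open(merged) as fh:
        merged_text = fh.read()

print(n_records)
print(merged_text, end="")
```

Streaming line by line keeps memory use flat even with thousands of input files.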

written 7 weeks ago by genomax

7 weeks ago, Shyam (United States) wrote:

Bowtie and Tophat are short-read aligners. Though tophat supports longer reads such as those from 454, it has a limit of 1024 bases. BWA-MEM is a better option for alignment, and you can get the coverage stats using samtools. I assume 454.10species.fasta is the read file. You need to use a multi-fasta file with all the contigs to make your life easier. You can concatenate all the contig fasta files into one and run the alignment.
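To make the suggested order of steps concrete, here is a small sketch that only assembles the shell command lines for such a workflow; it does not run them. It assumes bwa and samtools are installed, that samtools is version 1.10 or later (for the coverage subcommand), and that all_contigs.fasta and the reads_vs_contigs prefix are hypothetical names chosen for illustration. Default bwa mem settings are used, which may need tuning for 454 reads.

```python
import shlex


def coverage_pipeline(contigs_fasta, reads_fasta, prefix):
    """Return shell command lines for an align-then-summarise workflow.

    The commands are returned as strings for inspection; nothing is executed.
    """
    q = shlex.quote
    bam = f"{prefix}.sorted.bam"
    return [
        # 1. Index the contigs (the reference), not the reads.
        f"bwa index {q(contigs_fasta)}",
        # 2. Align reads to the contigs and sort the alignments.
        f"bwa mem {q(contigs_fasta)} {q(reads_fasta)} | samtools sort -o {q(bam)} -",
        # 3. Index the sorted BAM file.
        f"samtools index {q(bam)}",
        # 4. Per-contig summary, including a mean depth column.
        f"samtools coverage {q(bam)} > {q(prefix + '.coverage.tsv')}",
    ]


cmds = coverage_pipeline("all_contigs.fasta", "454.10species.fasta", "reads_vs_contigs")
for cmd in cmds:
    print(cmd)
```

Note that the contigs are given to bwa index and the reads to bwa mem, which is the opposite of the argument order in the tophat2 command shown earlier in the thread.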

What is your bowtie index name? I think you gave the read file as the index and the contig file as the read file. I also do not understand why you are using the -r option for your data.
