Question: bedcoverage output while looking for missing genes
David wrote (2.4 years ago):

Hi,

I'm trying to identify missing genes in my bacterial assembly compared to my reference genome. I have run coverageBed as follows:

coverageBed  -a  Staph.genomic.gff -b  Staph_assembly.sorted.bam

I'm not sure I understand all the columns, so I was wondering if you could help me identify the column that carries the missing-gene information and explain what the columns correspond to. Thanks!

ACVP01000037.1  Genbank gene    1312    2169    .       +       .       ID=gene1;Name=CORTU0001_0103;gbkey=Gene;gene_biotype=protein_coding;locus_tag=CORTU0001_0103    0       0       858     0.0000000
ACVP01000037.1  Genbank CDS     1312    2169    .       +       0       ID=cds1;Parent=gene1;Dbxref=NCBI_GP:EET76330.1;Name=EET76330.1;gbkey=CDS;product=hypothetical protein;protein_id=EET76330.1;transl_table=11     0       0       858     0.0000000
ACVP01000037.1  Genbank gene    2228    2350    .       -       .       ID=gene2;Name=CORTU0001_0104;gbkey=Gene;gene_biotype=protein_coding;locus_tag=CORTU0001_0104    0       0       123     0.0000000
ACVP01000037.1  Genbank CDS     2228    2350    .       -       0       ID=cds2;Parent=gene2;Dbxref=NCBI_GP:EET76331.1;Name=EET76331.1;Note=identified by glimmer%3B putative;gbkey=CDS;product=hypothetical protein;protein_id=EET76331.1;transl_table=11      0       0       123     0.0000000
ACVP01000037.1  Genbank gene    2321    3673    .       +       .       ID=gene3;Name=CORTU0001_0106;gbkey=Gene;gene_biotype=protein_coding;locus_tag=CORTU0001_0106    0       0       1353    0.0000000

Kevin Blighe wrote (2.4 years ago):

The BEDTools documentation for the coverage tool states the following:

After each interval in A, bedtools coverage will report:

  • The number of features in B that overlapped (by at least one base pair) the A interval.
  • The number of bases in A that had non-zero coverage from features in B.
  • The length of the entry in A.
  • The fraction of bases in A that had non-zero coverage from features in B.

So, you have to look at the right-most part of your output, specifically the final 4 columns. In the small amount of data that you have pasted, every row shows 0 overlapping features, 0 covered bases and a covered fraction of 0.0000000, so no BAM reads overlapped these gene and CDS features from your input GFF.
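
If it helps, here is a minimal Python sketch of how those final 4 columns could be used to list candidate missing genes. It assumes the output is tab-delimited, that the GFF was given as -a (so the first 9 fields are the GFF columns), and that each gene carries a locus_tag attribute as in the pasted lines. The function names parse_coverage_line and uncovered_genes are just illustrative, not part of BEDTools.

```python
def parse_coverage_line(line):
    """Split one line of `bedtools coverage` output where -a was a GFF file."""
    fields = line.rstrip("\n").split("\t")
    # The last 4 fields are appended by bedtools coverage:
    # overlapping features, covered bases, feature length, covered fraction.
    n_reads, covered_bases, length, fraction = fields[-4:]
    # Column 9 of a GFF holds key=value pairs separated by semicolons.
    attributes = dict(
        pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
    )
    return {
        "seqid": fields[0],
        "type": fields[2],
        "attributes": attributes,
        "n_reads": int(n_reads),
        "covered_bases": int(covered_bases),
        "length": int(length),
        "fraction": float(fraction),
    }


def uncovered_genes(lines, max_fraction=0.0):
    """Return locus tags of 'gene' features whose covered fraction is <= max_fraction."""
    missing = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        rec = parse_coverage_line(line)
        if rec["type"] == "gene" and rec["fraction"] <= max_fraction:
            attrs = rec["attributes"]
            missing.append(attrs.get("locus_tag", attrs.get("ID")))
    return missing


# First line of the pasted output, rebuilt with explicit tabs.
example = "\t".join([
    "ACVP01000037.1", "Genbank", "gene", "1312", "2169", ".", "+", ".",
    "ID=gene1;Name=CORTU0001_0103;gbkey=Gene;gene_biotype=protein_coding;"
    "locus_tag=CORTU0001_0103",
    "0", "0", "858", "0.0000000",
])
print(uncovered_genes([example]))  # ['CORTU0001_0103']
```

Raising max_fraction (for example to 0.1) would also flag genes that are only marginally covered; where to set that cut-off is a judgment call rather than anything BEDTools prescribes.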

Note that you may also be interested in the -hist option. Take a look at the documentation for this and more information.
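
As a rough illustration of what -hist gives you: as I understand the documentation, with -hist each feature in A is repeated once per observed depth, followed by 4 columns (depth, number of bases at that depth, feature length, fraction of the feature at that depth), and summary lines starting with "all" are appended at the end. Under that assumption, the sketch below computes a mean depth per feature. The function name mean_depth_from_hist and the toy intervals are made up for the example.

```python
from collections import defaultdict


def mean_depth_from_hist(lines):
    """Mean depth per feature from `bedtools coverage -hist` style lines.

    Assumes the last 4 tab-delimited fields are:
    depth, bases at that depth, feature length, fraction at that depth.
    """
    depth_sum = defaultdict(int)
    sizes = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank/short lines and the "all" summary histogram
        # (assumes no sequence is literally named "all").
        if len(fields) < 5 or fields[0] == "all":
            continue
        depth, n_bases, size = int(fields[-4]), int(fields[-3]), int(fields[-2])
        key = tuple(fields[:-4])  # the original feature columns
        depth_sum[key] += depth * n_bases
        sizes[key] = size
    return {key: depth_sum[key] / sizes[key] for key in sizes}


# Toy input (hypothetical data): a 10 bp feature with 4 bases at depth 0
# and 6 bases at depth 2, plus one summary line.
toy = [
    "chr1\t0\t10\t0\t4\t10\t0.4000000",
    "chr1\t0\t10\t2\t6\t10\t0.6000000",
    "all\t0\t4\t10\t0.4000000",
]
for feature, depth in mean_depth_from_hist(toy).items():
    print(feature, depth)
```

A feature whose only histogram line is depth 0 comes out with a mean depth of 0, which matches the zero-coverage rows in the default output.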

Kevin

David replied (2.4 years ago):

Thanks Kevin, I went through the bedtools documentation but didn't clearly find that it was the last 4 columns. Thanks for letting me know.

Kevin Blighe replied (2.4 years ago):

No problem David - best of luck.