bedcoverage output while looking for missing genes
6.0 years ago
David ▴ 230

Hi,

I'm trying to identify genes that are missing from my bacterial assembly compared to my reference genome. I have run coverageBed as follows:

coverageBed  -a  Staph.genomic.gff -b  Staph_assembly.sorted.bam

I'm not sure I understand all the columns, so I was wondering if you could help me identify which column carries the missing-gene information and what the other columns correspond to. Thanks!

ACVP01000037.1  Genbank gene    1312    2169    .       +       .       ID=gene1;Name=CORTU0001_0103;gbkey=Gene;gene_biotype=protein_coding;locus_tag=CORTU0001_0103    0       0       858     0.0000000
ACVP01000037.1  Genbank CDS     1312    2169    .       +       0       ID=cds1;Parent=gene1;Dbxref=NCBI_GP:EET76330.1;Name=EET76330.1;gbkey=CDS;product=hypothetical protein;protein_id=EET76330.1;transl_table=11     0       0       858     0.0000000
ACVP01000037.1  Genbank gene    2228    2350    .       -       .       ID=gene2;Name=CORTU0001_0104;gbkey=Gene;gene_biotype=protein_coding;locus_tag=CORTU0001_0104    0       0       123     0.0000000
ACVP01000037.1  Genbank CDS     2228    2350    .       -       0       ID=cds2;Parent=gene2;Dbxref=NCBI_GP:EET76331.1;Name=EET76331.1;Note=identified by glimmer%3B putative;gbkey=CDS;product=hypothetical protein;protein_id=EET76331.1;transl_table=11      0       0       123     0.0000000
ACVP01000037.1  Genbank gene    2321    3673    .       +       .       ID=gene3;Name=CORTU0001_0106;gbkey=Gene;gene_biotype=protein_coding;locus_tag=CORTU0001_0106    0       0       1353    0.0000000
bedtools • 1.5k views
6.0 years ago

The BEDTools documentation for the coverage tool states the following:

After each interval in A, bedtools coverage will report:

  • The number of features in B that overlapped (by at least one base pair) the A interval.
  • The number of bases in A that had non-zero coverage from features in B.
  • The length of the entry in A.
  • The fraction of bases in A that had non-zero coverage from features in B.

So, you have to look at the right-most part of your output, specifically the final 4 columns. In the small amount of data that you have pasted, every feature shows 0 overlapping reads and 0 covered bases (a fraction of 0.0000000), so no BAM alignments overlapped these gene and CDS features from your input GFF.
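If you want to pull out the uncovered genes automatically, here is a minimal Python sketch. It assumes the output has exactly the layout shown in the question: the 9 GFF columns followed by the 4 coverage columns, all tab-separated. The function names, the `max_fraction` threshold, and the second example record are my own inventions for illustration; they are not part of BEDTools.

```python
def parse_coverage_line(line):
    """Parse one tab-separated line of coverage output where -a was a GFF file."""
    fields = line.rstrip("\n").split("\t")
    # Column 9 of a GFF line holds key=value attributes separated by ';'.
    attributes = dict(
        pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
    )
    return {
        "feature_type": fields[2],
        "locus_tag": attributes.get("locus_tag"),
        # The last four columns are the ones appended by the coverage tool.
        "n_reads": int(fields[-4]),
        "bases_covered": int(fields[-3]),
        "length": int(fields[-2]),
        "fraction_covered": float(fields[-1]),
    }


def candidate_missing_genes(lines, max_fraction=0.0):
    """Return locus tags of 'gene' features covered at or below max_fraction."""
    missing = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comment/header lines
        record = parse_coverage_line(line)
        if record["feature_type"] == "gene" and record["fraction_covered"] <= max_fraction:
            missing.append(record["locus_tag"])
    return missing


example = [
    # First record copied from the output pasted in the question.
    "\t".join([
        "ACVP01000037.1", "Genbank", "gene", "1312", "2169", ".", "+", ".",
        "ID=gene1;Name=CORTU0001_0103;gbkey=Gene;gene_biotype=protein_coding;locus_tag=CORTU0001_0103",
        "0", "0", "858", "0.0000000",
    ]),
    # Hypothetical fully covered gene, made up here purely for contrast.
    "\t".join([
        "ACVP01000037.1", "Genbank", "gene", "5000", "5099", ".", "+", ".",
        "ID=geneX;locus_tag=HYPOTHETICAL_0001",
        "12", "100", "100", "1.0000000",
    ]),
]

print(candidate_missing_genes(example))
```

To run it on real output, redirect the coverageBed result to a file and pass the open file handle as `lines`. Filtering on the `gene` rows avoids counting each gene twice, since the matching `CDS` rows repeat the same coordinates. A coverage of zero only makes a gene a candidate: it can also mean that the region failed to align, so it is worth checking a few cases by eye.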

Note that you may also be interested in the -hist parameter. Take a look at the documentation for details on this and for more information.

Kevin


Thanks Kevin, I went through the bedtools documentation but didn't find it clearly stated that it was the last 4 columns. Thanks for letting me know.


No problem David - best of luck.
