Question: Annotating and predicting the genomic region of a sequenced phage
2.1 years ago, DanielC wrote:

Dear Friends,

I have FASTQ files (DNA) from a sequenced phage and am trying to predict the phage's tail proteins from its genome. I assembled the reads into contigs with SOAPdenovo, then ran BLAST on these contigs against the nr/nt nucleotide database to detect the tail proteins. However, the BLAST results look like this:

Description: Uncultured bacterium clone PAE-EN23_12 16S ribosomal RNA gene, partial sequence
Max score: 470 | Total score: 470 | Query cover: 99% | E value: 4e-129 | Ident: 99% | Accession: KC238410.1

Could you please give your suggestions on:

a) Since the contigs are from a phage, why am I getting "bacteria" hits? Shouldn't they be phage hits?

b) I am thinking of running blastx on the contigs from the assembly software and then looking up the proteins returned by blastx in the Pfam phage tail family to identify the tail proteins. Is this approach reasonable? I would really appreciate suggestions on how to predict the tail proteins of the phage.

Thanks much, DK
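
The first half of approach (b) could be sketched roughly as below: run blastx with a tabular output that includes the subject title, then keep only hits that pass an E-value cutoff and mention "tail". This is a minimal sketch, not a definitive pipeline. The column order, the `tail_hits` helper, the 1e-5 cutoff, and the example rows are all assumptions made for illustration, and a keyword match on hit titles will miss tail proteins annotated only as "hypothetical protein".

```python
import csv
import io

# Assumed column order, matching a hypothetical run such as:
#   blastx -query contigs.fa -db <protein_db> \
#          -outfmt "6 qseqid sseqid pident evalue bitscore stitle"
COLUMNS = ["qseqid", "sseqid", "pident", "evalue", "bitscore", "stitle"]


def tail_hits(handle, max_evalue=1e-5, keyword="tail"):
    """Return hits whose subject title mentions `keyword` and pass the E-value cutoff."""
    reader = csv.DictReader(
        handle, fieldnames=COLUMNS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    hits = []
    for row in reader:
        evalue = float(row["evalue"])
        if evalue <= max_evalue and keyword in row["stitle"].lower():
            hits.append((row["qseqid"], row["sseqid"], evalue, row["stitle"]))
    return hits


# Made-up rows for illustration only; this is not real BLAST output.
example = io.StringIO(
    "contig_1\tprot_A\t87.5\t1e-60\t210\ttail fiber protein [hypothetical phage]\n"
    "contig_1\tprot_B\t45.0\t0.5\t30\ttail tube protein [hypothetical phage]\n"
    "contig_2\tprot_C\t99.0\t1e-120\t400\tDNA polymerase [hypothetical phage]\n"
)
for hit in tail_hits(example):
    print(hit)
```

Only the first example row survives the filter: the second fails the E-value cutoff and the third does not mention a tail. Candidate contigs found this way would still need to be checked against the Pfam tail families as described in (b).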


Just take your contigs and run them through some annotation software. Prokka would probably do it.
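
As a rough sketch of what that might look like, the snippet below builds a Prokka command line and then scans the resulting protein FASTA headers for tail-related products. The file name `contigs.fasta`, the output directory and prefix, and both helper functions are hypothetical. It assumes Prokka's `--kingdom Viruses`, `--outdir`, and `--prefix` options and that each header in the `.faa` output has the form `>locus_tag product`; check both against the installed version before relying on them.

```python
def prokka_command(contigs, outdir="prokka_out", prefix="phage"):
    """Build a Prokka command line that uses its viral annotation mode."""
    return [
        "prokka",
        "--kingdom", "Viruses",
        "--outdir", outdir,
        "--prefix", prefix,
        contigs,
    ]


def tail_products(faa_lines, keyword="tail"):
    """Yield (locus_tag, product) for FASTA headers whose product mentions `keyword`."""
    for line in faa_lines:
        if line.startswith(">"):
            # Assumed header layout: ">locus_tag product description"
            locus_tag, _, product = line[1:].strip().partition(" ")
            if keyword in product.lower():
                yield locus_tag, product


# Hypothetical input file; the list can be passed to subprocess.run(cmd, check=True).
cmd = prokka_command("contigs.fasta")
print(" ".join(cmd))

# Made-up headers standing in for prokka_out/phage.faa.
sample_faa = [
    ">PHAGE_00001 tail fiber protein\n",
    "MKTAYIAK\n",
    ">PHAGE_00002 hypothetical protein\n",
    "MSLQ\n",
]
print(list(tail_products(sample_faa)))
```

Many phage genes tend to come back as "hypothetical protein", so an empty result from a keyword scan like this does not mean the genome lacks tail genes.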

You’re likely seeing bacterial sequences because this phage is seen as a prophage in those bacterial genomes.

Written 24 months ago by Joe