Question: Annotating and predicting the genomic region of a sequenced phage
13 months ago by DanielC (Canada):

Dear Friends,

I have FASTQ files (DNA) from a sequenced phage and am trying to predict the tail proteins encoded in the phage genome. I assembled the reads with SOAPdenovo and obtained contigs. I then ran BLAST on these contigs against the nr nucleotide database to detect the tail proteins. However, the BLAST results look like this:

Uncultured bacterium clone PAE-EN23_12 16S ribosomal RNA gene, partial sequence
    Max score: 470, Total score: 470, Query cover: 99%, E value: 4e-129, Identity: 99%, Accession: KC238410.1

Could you please give your suggestions on:

a) Since the contigs are from a phage, why am I getting "bacteria" hits? Shouldn't they be phage hits?

b) I am thinking of running blastx on the contigs from the assembler and then looking up the proteins it returns in the Pfam phage tail family (http://pfam.xfam.org/family/PF06995#tabview=tab0) to identify the tail proteins. Is this approach reasonable? I would really appreciate suggestions on how to predict the tail proteins of the phage.

Thanks much, DK


Just take your contigs and run them through some annotation software. Prokka would probably do it.
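
A rough sketch of that suggestion in Python, not a tested pipeline: it builds a Prokka command line for the contigs and then scans Prokka's tab-separated annotation table for products that mention "tail". The file names, the `phage` prefix, and both helper names are placeholders. `--kingdom Viruses` is an assumption about what suits a phage genome, and the `locus_tag`, `ftype` and `product` column names assume a Prokka release that writes a header row in its `.tsv` output.

```python
import csv
import io


def build_prokka_command(contigs_fasta, outdir="prokka_out", prefix="phage"):
    """Return the argument list for a Prokka run (paths are hypothetical)."""
    return [
        "prokka",
        "--kingdom", "Viruses",  # assumption: annotate against viral databases
        "--outdir", outdir,
        "--prefix", prefix,
        contigs_fasta,
    ]


def find_tail_products(tsv_text):
    """Return (locus_tag, product) for CDS rows whose product mentions 'tail'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        product = row.get("product") or ""
        # Keep only coding sequences; a plain keyword match is a crude filter.
        if row.get("ftype") == "CDS" and "tail" in product.lower():
            hits.append((row["locus_tag"], product))
    return hits


# Example use (requires Prokka on the PATH; file names are placeholders):
#   import subprocess
#   subprocess.run(build_prokka_command("contigs.fasta"), check=True)
#   with open("prokka_out/phage.tsv") as handle:
#       for locus_tag, product in find_tail_products(handle.read()):
#           print(locus_tag, product)
```

A keyword filter like this only finds proteins that the annotation already named; anything reported as "hypothetical protein" would still need a closer look.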

You’re likely seeing bacterial sequences because this phage is seen as a prophage in those bacterial genomes.
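
To get a feel for how many contigs this affects, one option is to tally the best hit per contig and label it by its title. A minimal sketch, assuming BLAST was rerun with a custom tabular format such as `-outfmt "6 qseqid sacc evalue bitscore stitle"`; the keyword lists and function names are made up for illustration, and matching on hit titles is only a crude heuristic.

```python
# Illustrative keyword lists; adjust them to the hits you actually see.
PHAGE_WORDS = ("phage", "virus")
BACTERIAL_WORDS = ("bacterium", "16s ribosomal", "chromosome")


def classify_title(title):
    """Label a hit title as 'phage', 'bacterial' or 'other' by keyword."""
    lowered = title.lower()
    if any(word in lowered for word in PHAGE_WORDS):
        return "phage"
    if any(word in lowered for word in BACTERIAL_WORDS):
        return "bacterial"
    return "other"


def best_hit_per_contig(blast_tsv_text):
    """Map each contig to (accession, label) for its highest-bitscore hit.

    Expects five tab-separated columns per line:
    qseqid, sacc, evalue, bitscore, stitle.
    """
    best = {}
    for line in blast_tsv_text.splitlines():
        if not line.strip():
            continue
        qseqid, sacc, _evalue, bitscore, stitle = line.split("\t", 4)
        score = float(bitscore)
        if qseqid not in best or score > best[qseqid][0]:
            best[qseqid] = (score, sacc, stitle)
    return {
        contig: (sacc, classify_title(title))
        for contig, (_score, sacc, title) in best.items()
    }
```

A "bacterial" label here does not rule out phage DNA: a hit inside a bacterial genome record may fall in an integrated prophage, and the title alone cannot tell the two apart.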

written 12 months ago by jrj.healey