Question: Predicted Proteins: Mapping Reads to ORFs
6 weeks ago, Longshotx wrote:

Hi All - I have soil metagenomes from a number of samples. I assembled the metagenomes and predicted ORFs using Prodigal, then used hmmscan and some additional tools to scan the ORFs against a custom protein database and look for antibiotic-like ORFs. I then ran a blastp search of these antibiotic-like ORFs against the nr database using DIAMOND and annotated the hits with MEGAN6 to get the bacterial taxonomies.

Question - I would like to determine which bacterial hosts contain these antibiotic-like ORFs, and how many sequencing reads map to each antibiotic-like ORF (i.e. a coverage matrix of samples by predicted proteins). I thought I could map the original sequencing reads to the annotated antibiotic-like ORFs using bowtie2, but realized that it aligns nucleotide reads to nucleotide references and cannot map against protein sequences.

I'm looking for suggestions for the best approach to my particular study. Thanks for your input!

5 weeks ago, colindaven (Hannover Medical School) wrote:

There are amino acid to DNA mappers out there (e.g. protein2genome, plus one or two others).

However, why not just pull out the ORFs that have BLAST hits in their original DNA form (did you keep the same headers?) and use these as reference sequences for read mapping? You could also re-run the search using blastx vs nr instead of blastp.
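A minimal sketch of the "pull out the DNA ORFs" step, in case it helps. It assumes you have a nucleotide FASTA of the predicted genes (Prodigal can write one with its `-d` option, using the same headers as the protein output) and a set of IDs for the antibiotic-like ORFs. The function names `read_fasta` and `subset_fasta` and the example IDs are made up for illustration.

```python
import io
from typing import Iterable, Iterator, Set, Tuple


def read_fasta(handle: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def subset_fasta(handle: Iterable[str], wanted_ids: Set[str]) -> Iterator[Tuple[str, str]]:
    """Keep records whose ID (first whitespace-delimited token) is wanted."""
    for header, seq in read_fasta(handle):
        tokens = header.split()
        seq_id = tokens[0] if tokens else ""
        if seq_id in wanted_ids:
            yield header, seq


# Hypothetical example: two nucleotide ORFs, one of which had a hit.
demo = io.StringIO(
    ">k1_1 # 1 # 9 # 1\nATGAAA\nTAA\n"
    ">k1_2 # 10 # 18 # -1\nATGCCCTGA\n"
)
for header, seq in subset_fasta(demo, {"k1_2"}):
    print(f">{header}\n{seq}")
```

The resulting nucleotide FASTA can then serve as the bowtie2 reference. Matching on the first token of the header only works if the protein and nucleotide files really do share IDs, which is the "same headers" caveat above.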

You'll have fun with antibiotic protein families that are misassembled or co-assembled, though.
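For the coverage matrix asked about in the question, here is a rough sketch. It assumes each sample's reads were mapped to the nucleotide ORFs and that per-reference counts were exported with `samtools idxstats`, which prints tab-separated reference name, length, mapped and unmapped counts. The sample names and function names are hypothetical, and the values are raw mapped-read counts, not normalized by ORF length or sequencing depth.

```python
from typing import Dict, List


def parse_idxstats(text: str) -> Dict[str, int]:
    """Map reference name -> mapped read count from idxstats text."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, _unmapped = line.split("\t")[:4]
        if name == "*":  # summary line for reads with no reference
            continue
        counts[name] = int(mapped)
    return counts


def build_matrix(per_sample: Dict[str, Dict[str, int]]) -> List[List[str]]:
    """Return rows of an ORF-by-sample count table, header row first."""
    samples = sorted(per_sample)
    orfs = sorted({orf for counts in per_sample.values() for orf in counts})
    rows = [["orf"] + samples]
    for orf in orfs:
        rows.append([orf] + [str(per_sample[s].get(orf, 0)) for s in samples])
    return rows


# Hypothetical idxstats output for two samples.
sample_a = "orf1\t900\t12\t0\norf2\t300\t0\t0\n*\t0\t0\t5\n"
sample_b = "orf1\t900\t3\t0\norf2\t300\t7\t0\n*\t0\t0\t2\n"

matrix = build_matrix({
    "A": parse_idxstats(sample_a),
    "B": parse_idxstats(sample_b),
})
for row in matrix:
    print("\t".join(row))
```

Reads from closely related family members may map ambiguously between ORFs, so counts for such families are best treated with caution.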


Thank you! That was very helpful.

Reply written 27 days ago by Longshotx