Predicted Proteins: Mapping Reads to ORFs
4.7 years ago
Longshotx ▴ 70

Hi all, I have soil metagenomes from a number of samples. I assembled the metagenomes and predicted ORFs using Prodigal, then used hmmscan and some additional tools to scan the ORFs against a custom protein database and look for antibiotic-like ORFs. I ran a blastp search against the nr database using DIAMOND and annotated these antibiotic-like ORFs using MEGAN 6 to get the bacterial taxonomies.

Question: I would like to determine which bacterial hosts contain these antibiotic-like ORFs, and to count the sequencing reads that map to each antibiotic-like ORF (something like a coverage matrix of samples by predicted proteins). I thought I could map the original sequencing reads to the annotated antibiotic-like ORFs using bowtie2, but realized that it maps nucleotide reads to nucleotide references and cannot map against protein sequences.

I'm looking for suggestions for the best approach to my particular study. Thanks for your input!

prodigal mapping coverage counts Assembly • 1.1k views
4.7 years ago

There are amino-acid-to-DNA mappers out there (e.g. protein2genome, plus one or two others).

However, why not just fish out the nucleotide sequences of the ORFs that have BLAST hits (the ORFs were DNA originally; did you keep the same headers?) and use those as the reference sequences for read mapping? You could also re-run the search with blastx against nr instead of blastp.
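A rough sketch of the header-matching step, in case it helps. It assumes the nucleotide ORFs were written by Prodigal with `-d` and the proteins with `-a`, so both files share the same sequence IDs (the first word of each header). The function names `read_fasta`, `select_orfs` and `format_fasta` are made up for illustration, and the list of wanted IDs would come from whatever table holds your antibiotic-like hits.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def select_orfs(nucleotide_lines: Iterable[str], wanted_ids: Set[str]) -> Dict[str, str]:
    """Keep nucleotide ORFs whose ID (first word of the header) is wanted."""
    selected = {}
    for header, seq in read_fasta(nucleotide_lines):
        orf_id = header.split()[0]
        if orf_id in wanted_ids:
            selected[orf_id] = seq
    return selected


def format_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Render records as FASTA text, wrapping sequences at `width` bases."""
    out = []
    for orf_id, seq in records.items():
        out.append(">" + orf_id)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical file names; replace with your own Prodigal -d output
    # and a text file holding one antibiotic-like ORF ID per line.
    with open("antibiotic_orf_ids.txt") as handle:
        wanted = {line.strip() for line in handle if line.strip()}
    with open("orfs_nucleotide.fna") as handle:
        hits = select_orfs(handle, wanted)
    with open("antibiotic_orfs.fna", "w") as handle:
        handle.write(format_fasta(hits))
```

The resulting nucleotide FASTA can then be indexed and used as a bowtie2 reference, with one mapping run per sample to fill in the coverage matrix.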

You'll have fun with antibiotic protein families that are mis-assembled or co-assembled, though.


Thank you! That was very helpful.
