Question: How to identify the strain in a reads file if the genus and species are known?
19 days ago, muhammad_elhossary wrote:

Hi folks,

I am relatively new to bioinformatics. I have some read files (FASTQ) to analyze. They come from a short-read sequencing run of a sample containing a pool of multiple bacterial species (roughly 10) that all went to the sequencer together. The species all belong to the same biological family, and I have the list of genus and species names that were in the sample, but I don't know the specific strain of each.

I need to accurately identify the strain/substrain of each species, regardless of the computational power needed. For example, I know that there is an E. coli in the sample, but I don't know which strain: is it K12, O157, or something else?

How can I do that? Which approach should I take, and with which tools?

Thanks, best regards

modified 19 days ago • written 19 days ago by muhammad_elhossary

If you know the 10 genomes you are working with, you can try bbsplit.sh to bin your reads into genome-specific pools. bbsplit.sh is part of the BBMap suite.

written 19 days ago by genomax
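
For readers who want to try the binning step suggested above, here is a minimal sketch of how the bbsplit.sh call could be assembled from Python. The file names are hypothetical placeholders. The flags (in=, in2=, ref=, basename=, outu=, refstats=, ambiguous2=) are written as I understand them from the BBMap documentation, so please check them against the usage text of your installed version before relying on this.

```python
import subprocess


def build_bbsplit_command(reads, references, reads_2=None,
                          refstats="refstats.txt", ambiguous2="best"):
    """Assemble a bbsplit.sh argument list that bins reads by reference genome.

    All file names passed in are placeholders chosen by the caller.
    """
    cmd = ["bbsplit.sh", f"in={reads}"]
    if reads_2 is not None:
        # Second file of a paired-end run, if there is one.
        cmd.append(f"in2={reads_2}")
    cmd += [
        # Comma-separated list of candidate reference FASTA files.
        "ref=" + ",".join(references),
        # '%' is replaced by the reference name, giving one output file per genome.
        "basename=bin_%.fq",
        # Reads that match none of the references.
        "outu=unmatched.fq",
        # Per-reference summary of how many reads were assigned.
        f"refstats={refstats}",
        # What to do with reads that map equally well to several references.
        f"ambiguous2={ambiguous2}",
    ]
    return cmd


if __name__ == "__main__":
    command = build_bbsplit_command(
        "pool.fq", ["ecoli_K12.fa", "ecoli_O157.fa"]  # hypothetical inputs
    )
    print(" ".join(command))
    # Uncomment to actually run it (requires BBMap on the PATH):
    # subprocess.run(command, check=True)
```

Building the command as a list keeps it easy to inspect before running, and the refstats file gives a per-reference read count without opening the binned FASTQ files.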

Thanks for your fast reply. As I see it (correct me if I am wrong), this solution could be a second step, to split the read pool by species. If I ran it as a first step against all the strains of each species in the sample, I would end up with numerous FASTQ files and still not know which strain is the best match.

For example: I know that there is an E. coli in the sample, but I don't know which strain: is it K12, O157, or something else? And that is only one species. Thanks again.

modified 19 days ago • written 19 days ago by muhammad_elhossary

Strain-level identification with short reads such as Illumina is always going to be a challenge. If you had started this experiment with a defined set of genomes, it would be fine. If you did not know whether a particular strain was K12 or O157 to begin with, then the best you may be able to do is to say it is E. coli.

modified 19 days ago • written 19 days ago by genomax
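
To make the point above concrete, here is a small sketch of a conservative decision rule: name a strain only when one candidate clearly dominates the unambiguously assigned reads, and otherwise fall back to the species. The function name, the thresholds (min_share, min_reads), and the example counts are all made up for illustration and are not taken from any tool or from this thread.

```python
def call_strain(unambiguous_reads, min_share=0.8, min_reads=1000):
    """Return the dominant strain name, or None if no strain clearly wins.

    unambiguous_reads maps a candidate strain name to the number of reads
    assigned to it unambiguously. Both thresholds are arbitrary assumptions.
    """
    total = sum(unambiguous_reads.values())
    # Too little evidence overall: do not call a strain.
    if not unambiguous_reads or total == 0 or total < min_reads:
        return None
    best, count = max(unambiguous_reads.items(), key=lambda kv: kv[1])
    # Call the strain only if it holds a clear majority of the evidence.
    return best if count / total >= min_share else None


if __name__ == "__main__":
    # Hypothetical counts for two candidate E. coli strains.
    for counts in ({"K12": 9000, "O157": 500}, {"K12": 5200, "O157": 4800}):
        strain = call_strain(counts)
        print(counts, "->", strain if strain else "E. coli (strain unresolved)")
```

Closely related strains share most of their sequence, so many short reads will map to several candidates equally well. That is why the rule only looks at unambiguous assignments and refuses to call a strain when they are few or evenly split.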