3.3 years ago
bioAddict
Hi everyone,
I want to perform a viral metagenomics study on transcriptomics data, so I need to extract the unmapped reads and assemble them. My questions are:
1- Does this samtools command suit my purpose?
samtools view -b -@ 8 -f12 sample.bam > sample_unmapped.bam
2- Do you have a suggestion for the assembly, please?
3- For the annotation, is a BLAST database suitable?
Thanks in advance
For de novo assembly, MEGAHIT and metaviralSPAdes are good options. I am not sure what you mean by annotation here. If it is simply to identify what the contigs match, then DIAMOND (in BLASTX mode) or HHblits are recommended, as RNA viruses are likely to be very divergent at the nucleotide level but should still be identifiable in protein (amino acid) level comparisons.
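A rough sketch of that assemble-then-search route, in case it helps. It assumes the unmapped reads have already been converted to paired FASTQ files; every file name, the thread count, and the protein database `viral_proteins.dmnd` are placeholders, not anything from this thread. The script only prints the commands unless `DRY_RUN=0` is set, so it can be checked before anything is launched.

```shell
#!/usr/bin/env bash
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (requires megahit and diamond on PATH).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Placeholder names (assumptions for illustration only).
R1=unmapped_R1.fastq
R2=unmapped_R2.fastq
ASM_DIR=megahit_out
PROTEIN_DB=viral_proteins.dmnd

assemble_and_search() {
    # De novo assembly of the paired unmapped reads with MEGAHIT.
    run megahit -1 "$R1" -2 "$R2" -o "$ASM_DIR" -t 8

    # Translated (BLASTX-style) search of the contigs against a protein database.
    # MEGAHIT writes its contigs to final.contigs.fa inside the output directory.
    run diamond blastx --db "$PROTEIN_DB" --query "$ASM_DIR/final.contigs.fa" \
        --out contigs_vs_proteins.tsv --outfmt 6 --threads 8
}

assemble_and_search
```

The protein database would have to be built beforehand with `diamond makedb`; which reference proteins to include is a separate decision.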
Thank you for your reply. My main issue is the extraction of the unmapped reads: I need to be sure that I retrieve all of them. So do you think that my samtools command is suitable? For de novo assembly, what do you think about Trinity?
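On the samtools question, a small sketch assuming paired-end data: `-f 12` keeps a read only when both the "read unmapped" bit (4) and the "mate unmapped" bit (8) are set in its FLAG, so an unmapped read whose mate did map is left out. `-f 4` alone keeps every unmapped read. The bash arithmetic below only mimics that `-f` test; the helper name `flag_has_all` and the three FLAG values are illustrative.

```shell
#!/usr/bin/env bash
# Mimics the test applied by `samtools view -f INT`:
# a read is kept only if every bit of INT is set in its FLAG.
flag_has_all() {
    local flag=$1 required=$2
    (( (flag & required) == required ))
}

# Example FLAG values:
#   77 = paired (1) + read unmapped (4) + mate unmapped (8) + first in pair (64)
#   69 = paired (1) + read unmapped (4) + first in pair (64); mate is mapped
#   73 = paired (1) + mate unmapped (8) + first in pair (64); read itself is mapped
for flag in 77 69 73; do
    for required in 4 12; do
        if flag_has_all "$flag" "$required"; then
            verdict=kept
        else
            verdict=dropped
        fi
        echo "FLAG $flag with -f $required: $verdict"
    done
done

# To keep every unmapped read, whatever the status of its mate
# (output file name is a placeholder):
#   samtools view -b -@ 8 -f 4 sample.bam > sample_unmapped_all.bam
```

FLAG 69 is the case that matters here: it is kept with `-f 4` but dropped with `-f 12`. If the assembler needs intact pairs, the mapped mates of such reads have to be pulled out as well (bit 8 set, bit 4 not set), and the pairs re-synchronised before converting to FASTQ.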
IMO, metagenomic assemblers typically perform better on metaviromic data, but it's a personal choice.
You can have a look at this paper, which has plenty of information: https://www.nature.com/articles/srep23774