paired end metagenomics analysis
16 months ago
serene.s • 0

Hello, I am Saraswati and I am new to the field of metagenomics. I need to do taxonomic classification (Archaea, Bacteria, Eukaryotes and Viruses) of a shotgun sequencing dataset, 3.5 GB in size, using the tools DIAMOND and MEGAN6.

I downloaded the .sra file from NCBI and split it into two files, forward and reverse reads in .fastq format, using the SRA Toolkit. Now I need to run blastx on the reads with DIAMOND, but I would like to know whether I should merge the forward and reverse reads into a single file or run blastx on the forward reads only. If I should merge them, which tool should I use for that?

I want to run blastx against the nr database. I tried it with the forward-read .fasta file, but it is taking a long time (more than 7 days and still running). How can I make it faster? Also, if there are other tools or strategies that do the same taxonomic classification faster and better, please suggest them.

sequence genome alignment metagenome shotgun • 1.0k views

Yes, you should merge the paired-end reads (assuming they overlap) with a paired-end read merger such as PEAR (just one of the many tools available) before trying to annotate them. There is also software written specifically for taxonomic annotation of metagenomes, such as kraken2 and centrifuge, among others. Depending on your objective, you can also assemble the metagenomes into genomes (the so-called MAGs) and annotate those afterwards.
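As a rough sketch of the two routes above (merging overlapping pairs with PEAR, or classifying the pairs directly with kraken2), the command lines can be built and launched from Python. The file names, output prefixes, database path and thread counts are placeholders, not values from this thread, and the helper functions are made up for illustration. The flags are the ones I know from the PEAR and Kraken 2 command-line help, so please check them against your installed versions.

```python
import shlex
import subprocess


def pear_command(forward, reverse, out_prefix, threads=4):
    """Build a PEAR call that merges overlapping read pairs."""
    return [
        "pear",
        "-f", forward,       # forward reads (FASTQ)
        "-r", reverse,       # reverse reads (FASTQ)
        "-o", out_prefix,    # prefix for the output files
        "-j", str(threads),  # number of threads
    ]


def kraken2_command(db, forward, reverse, out_prefix, threads=4):
    """Build a Kraken 2 call that classifies the read pairs directly."""
    return [
        "kraken2",
        "--db", db,                               # placeholder database path
        "--threads", str(threads),
        "--paired",                               # the two inputs are mates
        "--report", out_prefix + ".report.txt",   # per-taxon summary
        "--output", out_prefix + ".kraken.txt",   # per-read assignments
        forward, reverse,
    ]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder file names; dry run only, so nothing is executed here.
    run(pear_command("sample_1.fastq", "sample_2.fastq", "sample_merged"))
    run(kraken2_command("/path/to/kraken2_db",
                        "sample_1.fastq", "sample_2.fastq", "sample"))
```

With PEAR the merged reads should end up in `<prefix>.assembled.fastq`, and that file is what you would pass on to the annotation step. Pairs that do not overlap are written to separate "unassembled" files.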

If you're blasting locally, you can provide -num_threads (set according to the number of threads your computer/server has) to parallelize the work. If you're blasting remotely, this is not possible as far as I know. Blasting entire fastq files without any kind of clustering etc. will take some time, even if each file has only a few million reads.
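Since the question is about DIAMOND rather than NCBI BLAST, note that the thread option there is `--threads` (short form `-p`), not `-num_threads`. Below is a minimal sketch of building a DIAMOND blastx call that writes DAA output, the format MEGAN6 reads. The file names are placeholders, the helper function is made up for illustration, and `nr.dmnd` assumes the nr FASTA has already been formatted with `diamond makedb`.

```python
import os
import shlex


def diamond_blastx_command(query, db, out_daa, threads=None, block_size=None):
    """Build a DIAMOND blastx call that writes a DAA file for MEGAN."""
    if threads is None:
        # Fall back to all CPUs visible to Python (an assumption; adjust
        # this on a shared server).
        threads = os.cpu_count() or 1
    cmd = [
        "diamond", "blastx",
        "--db", db,                 # DIAMOND-formatted database
        "--query", query,           # reads to translate and search
        "--out", out_daa,
        "--outfmt", "100",          # 100 = DAA, the format MEGAN imports
        "--threads", str(threads),
    ]
    if block_size is not None:
        # Larger blocks are faster but need more memory.
        cmd += ["--block-size", str(block_size)]
    return cmd


if __name__ == "__main__":
    # Placeholder names; this only prints the command line.
    print(shlex.join(diamond_blastx_command(
        "sample_merged.assembled.fastq", "nr.dmnd", "sample.daa",
        threads=16, block_size=4.0)))
```

If the machine has enough memory, raising `--block-size` is usually the first thing to try for speed. How much it helps depends on your hardware, so treat the values here as examples only.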


IDseq is also a good option for taxonomic assignment.


Among the several issues with uploading and analysing data on the IDseq server, this passage from the IDseq privacy statement on the metadata of user-uploaded data should be enough to scare people off (copy/pasted from https://chanzuckerberg.zendesk.com/hc/en-us/articles/360058195412-Preview-of-IDseq-s-New-Privacy-Policy-Terms-of-Service-Effective-April-1-2021-):

This data is also shared with technical partners (Chan Zuckerberg Initiative, LLC - CZI LLC) and Service Providers (ex: AWS) that help operate and secure IDseq.

Use the ASaiM protocol, which uses open-source tools like MetaPhlAn 3. MEGAN 6 is dual licensed (AFAIK).

