Question: Blast many read sequences
0
17 days ago by
zoegward40
zoegward40 wrote:

Hi, I have a FASTQ file that I think contains sequences from different organisms. Is there a way I can BLAST all of the sequences in the FASTQ file to find out what these organisms are?

Thanks in advance.

modified 16 days ago by Vijay Lakhujani2.5k • written 17 days ago by zoegward40

I don't fully understand what you are trying to do because the question isn't explained clearly, but here are the steps you can follow.

Convert .fastq to .fasta

sed -n '1~4s/^@/>/p;2~4p' file.fq > file.fa

Run the BLAST command-line applications on the FASTA file (see the BLAST manual).
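The `first~step` addresses in the sed one-liner are a GNU extension, so it may not work with the sed shipped on macOS or BSD. A portable alternative can be sketched with awk. This is only an illustration: `demo.fq` and its two reads are made up, and like the sed version it assumes strict four-line FASTQ records (no wrapped sequence lines). The commented-out `blastn` call assumes BLAST+ is installed and that a local nucleotide database named `nt` exists.

```shell
# Tiny two-read FASTQ so the sketch runs on its own (made-up reads).
printf '@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nTTGGCCAA\n+\nIIIIIIII\n' > demo.fq

# Line 1 of every 4-line record is the header, line 2 is the sequence.
# Replace the leading "@" with ">" and drop the "+" and quality lines.
awk 'NR % 4 == 1 { print ">" substr($0, 2) }
     NR % 4 == 2 { print }' demo.fq > demo.fa

cat demo.fa

# Hypothetical next step (assumes BLAST+ and a local database called "nt"):
# blastn -query demo.fa -db nt -outfmt 6 -max_target_seqs 5 -out demo_hits.tsv
```

Tabular output (`-outfmt 6`) gives one line per hit, which is easier to summarise per read than the default pairwise report.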

modified 16 days ago by Vijay Lakhujani2.5k • written 17 days ago by MSM5570

I have added/removed tags to keep the post relevant

written 16 days ago by Vijay Lakhujani2.5k
3
16 days ago by
Vijay Lakhujani2.5k
India
Vijay Lakhujani2.5k wrote:

BLASTing reads sounds like a bad idea considering their short lengths and the number of reads. Maybe you can shuffle a few thousand reads (seqkit?) and then try it; however, I would suggest using fastq-screen to map the reads against the genomes of the organisms that you suspect are present in your raw data.
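For the subsampling step, seqkit has its own `sample` subcommand. If it is not installed, the same idea can be sketched with standard tools. This is a rough illustration, not a replacement for seqkit: it assumes GNU coreutils (`shuf`), strict four-line FASTQ records, and no tab characters in the read headers. The file names and the ten demo reads are made up.

```shell
# Made-up demo input: 10 four-line FASTQ records.
for i in 1 2 3 4 5 6 7 8 9 10; do
  printf '@read%s\nACGTACGTAC\n+\nIIIIIIIIII\n' "$i"
done > demo_reads.fq

# Number of reads to keep (use a few thousand on real data).
N=3

# Join each 4-line record into one tab-separated line, pick N lines at
# random, then split every picked line back into its 4 original lines.
paste - - - - < demo_reads.fq | shuf -n "$N" | tr '\t' '\n' > demo_subsample.fq

# Report how many reads were kept.
grep -c '^@read' demo_subsample.fq
```

The subsample can then be converted to FASTA and BLASTed, which keeps the run time manageable compared with searching every read.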

written 16 days ago by Vijay Lakhujani2.5k

+1 for "fastq-screen"

written 16 days ago by mbk0asis350
3
16 days ago by
shenwei3563.8k
China
shenwei3563.8k wrote:

You need taxonomic profiling software, like Kraken or Kaiju. BLAST is the slowest option for this kind of task.

written 16 days ago by shenwei3563.8k

+1 for that. That'll surely help the OP.

written 16 days ago by Vijay Lakhujani2.5k