Help with homology searching against nr or nt databases using blat
8.8 years ago
seta ★ 1.9k

Hi all,

I'm working on a plant RNA-seq analysis and plan to check my assembly against the whole nr or nt database to detect common contamination, such as Homo sapiens and Escherichia coli DNA, mitochondrial and chloroplast sequences, and rRNA. As you know, running BLAST against nr or nt takes a very long time, so I would prefer to use BLAT. Please share your experience with using BLAT for this purpose, as I could not find much information on it. I know about UCSC, but I'm looking for the commands you use to build the database and run the search.

Thanks

Assembly alignment sequencing RNA-Seq • 1.6k views

Why don't you make a database that contains only common contaminants? You could also make a custom database from the plant genome or related plant genomes, and then BLAST only the remaining contigs against nt.
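
A minimal sketch of how such a search might be launched with BLAT, assuming the contaminants have already been collected into a single FASTA file. The file names and the `blat_command` helper are hypothetical and only for illustration; `-out=blast8` and `-minIdentity` are standard BLAT options, but the threshold shown is an arbitrary starting point, not a recommendation from this thread.

```python
import os
import shutil
import subprocess


def blat_command(database_fasta, query_fasta, output_path, min_identity=90):
    """Build a BLAT command line that writes BLAST-style tabular output."""
    return [
        "blat",
        database_fasta,                   # target: the small contaminant database
        query_fasta,                      # query: the assembled contigs
        "-out=blast8",                    # 12-column tabular output instead of PSL
        f"-minIdentity={min_identity}",   # minimum percent identity to report
        output_path,
    ]


if __name__ == "__main__":
    # Placeholder file names (assumptions, not files from the thread).
    cmd = blat_command("contaminants.fa", "contigs.fa", "contigs_vs_contaminants.tsv")
    print(" ".join(cmd))
    # Only run when BLAT is installed and both input files exist.
    if shutil.which("blat") and all(os.path.exists(p) for p in cmd[1:3]):
        subprocess.run(cmd, check=True)
```

Because the contaminant database is small, BLAT can read the FASTA file directly, so no separate indexing step is needed for this sketch.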

How do you propose collecting the common contaminants mentioned above to build the database?

Normally more than 90-95% of my reads map to the reference, in which I include the mitochondrial genome, so I don't bother much. If there is contamination, it is mostly from the host genome (as expected), so I check only the remaining reads against the salmon genome, which normally gives good coverage. Sometimes the samples also contain a small percentage of phage sequences, used for assessing sequencing error. In your case I would also include the chloroplast genome; I don't consider plastid genomes contamination. I haven't checked for human sequences, but I assume our lab works under sterile conditions.
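
The "check only the remainder" step could be sketched roughly as follows, assuming the hits are in 12-column BLAST tabular format (query ID, subject ID, percent identity, alignment length, ...). The function names, thresholds, and toy records are made up for illustration and are not data from any real run.

```python
def flagged_queries(tabular_lines, min_identity=90.0, min_length=100):
    """Return IDs of queries with at least one hit passing both thresholds."""
    flagged = set()
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, identity, length = fields[0], float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            flagged.add(query_id)
    return flagged


def split_fasta(fasta_lines, flagged):
    """Split FASTA records into (flagged, remainder) lists of lines."""
    hit, rest = [], []
    target = rest
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""  # ID is the first word of the header
            target = hit if seq_id in flagged else rest
        target.append(line.rstrip("\n"))
    return hit, rest


# Toy input, invented for this example.
hits = [
    "contig1\tphage\t99.5\t250\t1\t0\t1\t250\t10\t259\t1e-100\t480",
    "contig2\thost\t85.0\t300\t45\t0\t1\t300\t5\t304\t1e-20\t100",
]
fasta = [">contig1 len=250", "ACGT", ">contig2", "GGCC", ">contig3", "TTAA"]

flagged = flagged_queries(hits)
contaminated, remainder = split_fasta(fasta, flagged)
print(sorted(flagged))
print(remainder)
```

The remainder can then be written to a new FASTA file and searched against the next, larger database, so the expensive search only sees sequences that are still unexplained.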
