How to obtain read count in the output file in ncbi-blast+ standalone application
8 weeks ago
khq5801 ▴ 10

I used the command below with ncbi-blast+ to find conserved miRNAs. I created a local miRBase database and ran the command against it. I need the read count and the description of each aligned sequence in the output file. I would really appreciate any suggestions.

blastn -db <db_source> -query <query_source> -out <outfile> -outfmt "6 qseqid sseqid slen qstart qend length mismatch gapopen gaps sseq" -word_size 4 -perc_identity 100

miRBase NGS miRNA ncbi-blast • 414 views

BLAST is not the right application for finding miRNAs. You are probably working at the limit of its sensitivity. You may be much better off using an NGS aligner like bowtie v.1.x and/or a proper miRNA analysis pipeline.


Hi, I used bowtie to align my NGS sample to the reference genome, then separated the mapped sequences and ran ncbi-blast+ against the miRBase database of mature miRNAs to detect conserved miRNAs. I created the miRBase database locally.


I am not sure why you switched to BLAST. You can stay with bowtie and align your data to miRBase after creating a proper index; you will need to substitute U with T in the sequences before building the index.
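To illustrate the U-to-T substitution, here is a minimal sketch. The file names (`mature.fa`, `mature_dna.fa`) and the index name `mirbase_mature` are assumptions, and the two-record FASTA is only a small stand-in for the real miRBase download. The substitution is restricted to sequence lines so that header text is never altered.

```shell
# Small stand-in for miRBase's mature.fa (file name is an assumption).
cat > mature.fa <<'EOF'
>hsa-let-7a-5p MIMAT0000062 Homo sapiens let-7a-5p
UGAGGUAGUAGGUUGUAUAGUU
>hsa-miR-21-5p MIMAT0000076 Homo sapiens miR-21-5p
UAGCUUAUCAGACUGAUGUUGA
EOF

# Replace U with T on sequence lines only; lines starting with ">" are headers and are left untouched.
sed '/^>/!s/U/T/g' mature.fa > mature_dna.fa

# Build a bowtie (v1) index from the converted FASTA; skipped when bowtie is not installed.
if command -v bowtie-build >/dev/null 2>&1; then
    bowtie-build mature_dna.fa mirbase_mature
fi
```

The reads could then be aligned against `mirbase_mature` with bowtie in the usual way.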

8 weeks ago

Not sure what you are asking; you already have the query ids and subject ids. So the answer seems to be all there.

If each alignment represents one aligned read, then count how many times each subject id appears:

cat blast.results | cut -f 2 | sort | uniq -c > readcounts.txt


This reports how many times each subject appears, which seems to be what you want.
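One caveat: a single read can produce more than one alignment line against the same subject, in which case counting lines overcounts reads. A hedged variant that counts each (query, subject) pair only once might look like the sketch below. The file names and the four-line table are toy stand-ins for real tabular output (column 1 = qseqid, column 2 = sseqid), not actual results.

```shell
# Toy tabular BLAST output: read1 hits the same subject twice.
printf 'read1\thsa-let-7a-5p\nread1\thsa-let-7a-5p\nread2\thsa-let-7a-5p\nread3\thsa-miR-21-5p\n' > toy_blast.tsv

# Keep the query and subject columns, drop duplicate pairs,
# then count how many distinct reads hit each subject.
cut -f 1,2 toy_blast.tsv \
    | sort -u \
    | cut -f 2 \
    | sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }' > toy_readcounts.tsv   # subject id, then read count

cat toy_readcounts.tsv
```

If the reads were collapsed before the search (one sequence standing for many identical reads), the per-sequence counts would have to be summed instead; that case is not handled here.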

To associate a subject id with a name, run blastdbcmd and format the output with "%a %t", if I recall correctly.
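A rough sketch of joining the counts with the titles follows. It assumes the title table has the subject id as its first whitespace-separated field, which depends on how the database was built, so check a few lines of the real output first. The database name `mirbase_db` and all `toy_*` files are placeholders. Another option worth trying is adding `stitle` to the `-outfmt` list of the blastn command, which should put the subject title directly in the table.

```shell
# With BLAST+ installed, the id/title table could be dumped like this
# (mirbase_db is a placeholder database name):
#   blastdbcmd -db mirbase_db -entry all -outfmt "%a %t" > titles.txt

# Toy stand-ins: counts in `uniq -c` layout, titles in an assumed "id title" layout.
printf '      2 hsa-let-7a-5p\n      1 hsa-miR-21-5p\n' > toy_readcounts.txt
printf 'hsa-let-7a-5p MIMAT0000062 Homo sapiens let-7a-5p\nhsa-miR-21-5p MIMAT0000076 Homo sapiens miR-21-5p\n' > toy_titles.txt

# First file: remember the title for each id. Second file: print id, count, title ("NA" if missing).
awk 'NR == FNR { id = $1; sub(/^[^ ]+ +/, ""); title[id] = $0; next }
     { print $2 "\t" $1 "\t" (($2 in title) ? title[$2] : "NA") }' \
    toy_titles.txt toy_readcounts.txt > toy_annotated.tsv

cat toy_annotated.tsv
```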