Using max_target_seqs 1 and max_hsps 1 for blastn, and subsequent compilation of UTR files
6.3 years ago

1) I ran blastn for a list of mouse genes against the gene database and obtained an output consisting of several thousand hits. I needed only the best hit, which was the first one in the tabular output.

So I tried restricting the results using -max_target_seqs 1 and -max_hsps 1.

I got a single-hit result. But when I compared it with the unrestricted output, I found that the reported hit has different values in columns such as alignment length and % identity, although the transcript itself is the same.

Can anyone suggest a reason for this, and any solutions?

2) I used the original (unrestricted) output to get the first hits for the above.
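For reference, taking the first reported hit per query from the unrestricted output can be sketched roughly like this. It assumes a tab-separated table (for example from -outfmt 6) in which column 1 is the query ID and the rows for each query are already ordered best-first; the function name and file names are only placeholders.

```shell
# first_hits FILE: print the first row seen for each query ID (column 1)
# of a tab-separated BLAST table.
# Assumption: rows for each query are already ordered best-first,
# so the first row per query is the one to keep.
first_hits() {
    awk -F'\t' '!seen[$1]++' "$1"
}

# Hypothetical usage (file names are placeholders):
#   first_hits all_hits.tsv > first_hits.tsv
```

This keeps every column of the original row, so the alignment length and % identity stay exactly as they were in the unrestricted run.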

So I have a list of transcripts obtained from the blastn run, and I have to pull out the corresponding sequences from a source file of 3' UTR/5' UTR sequences. But there are a few hundred transcripts, and doing it manually is time-consuming and error-prone.

Can anyone suggest a way, or a command line, to do this from the terminal in Linux?

In summary: a method to selectively pull out certain sequences from the gene database file, using the Ensembl transcript ID as the identifier.

blast utr max_target_seq max_hsps • 2.6k views
6.3 years ago
mxs ▴ 530

Hi,

Well, the reason you get different alignment length and % identity values is that you have restricted the number of HSPs per hit to 1. Whether you really want to do this or not depends on what you are doing afterwards with your results. As for the second question, a quick solution would be something like this:

cat blastout | perl -lne '/^(.*?)\t/ and $hash{$1} = 1 }{ open(IN, "<", "seqfile.fa") or die $!; while (<IN>) { chomp; if (/^>(.*)/) { $l = $hash{$1} ? 1 : 0 } print if $l }' > output.fa

Here blastout is the tabular output of your blastn search (the IDs are taken from its first column) and seqfile.fa is your FASTA-formatted sequence file, with >ID being the header. But as I said, it would be better if you could provide a small example or some sample files.
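If you prefer awk, the same idea can be sketched as a small shell function. This is only a sketch under a few assumptions: the ID list has one ID per line and is not empty, and the ID of each FASTA record is the text between ">" and the first whitespace in its header. If your UTR headers look different (for example several IDs joined by "|", or version suffixes such as ".4"), the matching would have to be adapted. The function name and file names are made up for illustration.

```shell
# extract_by_id IDS FASTA: print the FASTA records whose ID is listed in IDS.
# IDS has one ID per line; a record ID is taken as the text between ">"
# and the first whitespace of the header line (an assumption about the file).
extract_by_id() {
    awk '
        # First file: remember every non-empty ID.
        NR == FNR { if ($1 != "") wanted[$1] = 1; next }
        # Header line of the FASTA file: decide whether to keep this record.
        /^>/      { id = substr($1, 2); keep = (id in wanted) }
        # Print header and sequence lines while inside a wanted record.
        keep
    ' "$1" "$2"
}

# Hypothetical usage (file names are placeholders); use column 1 or 2 of the
# BLAST table depending on which one holds the transcript IDs:
#   cut -f2 blastout | sort -u > ids.txt
#   extract_by_id ids.txt seqfile.fa > output.fa
```

Multi-line sequences are handled because every line after a wanted header is printed until the next header is reached.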

m