Is there any way to extract longest ORF from blastx output?
5.7 years ago
sohra ▴ 40

Hi all,

I read in a paper that the longest ORF in the reading frame indicated by the blastx analysis was determined, the resulting CDS was then extracted, and the UTR regions were removed. Could anybody please let me know how to determine the longest ORF using blastx results, and how to find the CDS and UTRs from it?

Thanks

sequence alignment blast CDS UTR • 1.7k views

What is the reference for that paper?

5.6 years ago
x.jack.min ▴ 20

OrfPredictor will do the work for you:

http://proteomics.ysu.edu/tools/OrfPredictor.html

5.6 years ago
5heikki 9.8k

I don't think it's possible to detect the longest possible ORF from blastx output alone, only the longest aligned region (although in most cases the latter is probably part of the former). The command below assumes tabular output, where column 1 is the query ID and column 4 is the alignment length. It 1) sorts by query ID, 2) sorts by alignment length in descending order within each query, and 3) keeps only the first line per query. Note that only the longest hit per contig is kept, so this strategy is not sensible for all data (e.g. contigs that are expected to include introns and/or intergenic regions between CDSs). If you're fine with this, you can output the translated region into a column (check blastx -help) and then parse it from there.

export LC_ALL=C; sort -k1,1 -k4,4gr tabularBlastxOutput | sort -u -k1,1 --merge > longestAlignedRegions
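
If you do want the longest ORF in the frame reported by blastx (as the question describes), that has to be computed from the contig sequence itself rather than from the blastx table. Here is a minimal Python sketch of one way to do it, not necessarily what the paper did. Assumptions: the standard genetic code stop codons, an ORF defined as ATG up to and including the first in-frame stop, and the blastx frame convention (1, 2, 3 forward; -1, -2, -3 on the reverse complement). The function names are made up for illustration.

```python
# Assumption: standard genetic code (translation table 1) stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string (other letters are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_in_frame(seq, frame):
    """Return (start, end, orf) for the longest ATG..stop ORF in one frame.

    frame uses the blastx convention: 1, 2, 3 for the forward strand and
    -1, -2, -3 for the reverse complement. start and end are 0-based,
    end-exclusive offsets on the strand that was scanned (so for negative
    frames they refer to the reverse complement). Returns None when the
    frame contains no complete ORF; ORFs lacking a stop codon are ignored.
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    strand = seq.upper()
    if frame < 0:
        strand = reverse_complement(strand)

    best = None
    orf_start = None
    # Walk the chosen frame codon by codon.
    for i in range(abs(frame) - 1, len(strand) - 2, 3):
        codon = strand[i:i + 3]
        if orf_start is None:
            if codon == "ATG":
                orf_start = i  # first start codon after the previous stop
        elif codon in STOP_CODONS:
            end = i + 3  # include the stop codon in the ORF
            if best is None or end - orf_start > best[1] - best[0]:
                best = (orf_start, end, strand[orf_start:end])
            orf_start = None
    return best


if __name__ == "__main__":
    contig = "CCATGAAATTTTAGGG"
    print(longest_orf_in_frame(contig, 3))
```

With the coordinates of the ORF in hand, everything before `start` can be treated as putative 5' UTR and everything from `end` onwards as putative 3' UTR, with the same caveat as above: this only makes sense for contigs expected to hold a single unspliced CDS.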