Determining Reading Frame from Nucleotide Blast tabular results
8.8 years ago
zgayk ▴ 90

Hello,

I have a database of BLAST matches to assembly scaffolds in the form of a 24-column tabular file. The results include columns for query and subject frames. However, every reading frame came back as +1, even though this does not seem to be the case when I selectively BLAST particular results from the file against NCBI BLAST. Does anyone know why the reading frame columns are uninformative here?

In addition, I would like to know whether each match was + or -, as it seems that a number of query sequences are template strands and I may need to take the reverse complement to be able to use the BLAST information in analyses.

The data file is very large, but I would be happy to send it to anyone who wanted to see.

Thanks,
Zach

blast • 3.2k views

You should include a few lines of your tabular file (or the formatting string). The way you format the BLAST output affects what gets reported.


Here are the first 36 blast results from the table, not including the actual aligned sequences, which would be too long. The query frame is at the very right of the output.

8.8 years ago

Well, it is not clear how you got that file, as it does not appear to be standard BLAST output. Hence there is not much advice we can give on why the information you seek is missing. It looks like the product of a custom script.

In general, to get strand information as a column you would need to add the sstrand field to the tabular output format. From your file you may still be able to recover it by comparing the start and end coordinates: when start > end, it (probably) means that the alignment is on the reverse strand.
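As a rough illustration of the coordinate check, here is a minimal Python sketch. It assumes the first 12 columns follow the standard tabular order (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); if your file is laid out differently, the column indices need adjusting. The function names and the example rows are made up for illustration, and the reverse_complement helper only handles plain A/C/G/T.

```python
import csv
import io

# Assumed 0-based positions of the coordinate columns, based on the
# standard 12-column tabular layout. Adjust these if your file differs.
QSTART, QEND, SSTART, SEND = 6, 7, 8, 9


def hit_strand(start, end):
    """Return 'minus' when the coordinates run backwards, else 'plus'."""
    return "minus" if start > end else "plus"


def annotate_strands(handle):
    """Yield (query id, subject id, query strand, subject strand) per hit."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment/header lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        q_strand = hit_strand(int(row[QSTART]), int(row[QEND]))
        s_strand = hit_strand(int(row[SSTART]), int(row[SEND]))
        yield row[0], row[1], q_strand, s_strand


# Translation table for plain A/C/G/T only; other symbols pass through as-is.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Two made-up hits: the first has sstart > send, the second does not.
example = (
    "q1\tscaf1\t98.0\t100\t2\t0\t1\t100\t500\t401\t1e-40\t180\n"
    "q2\tscaf2\t99.0\t50\t0\t0\t10\t59\t200\t249\t1e-20\t95\n"
)

for hit in annotate_strands(io.StringIO(example)):
    print(hit)
```

To run it on a real file, pass an open file handle instead of the io.StringIO object, e.g. `with open("hits.tsv") as fh:` where hits.tsv is a placeholder name for your own table.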

As for frames, these only matter for translated searches such as blastx and tblastx, where the alignment is made on translated sequence. Your example seems to be a nucleotide-level alignment, where the frame will always be reported as +1.


The output was created using Galaxy, as extended 24-column Blast tabular data. Thank you very much for the information about the start and end coordinates. From that I was able to identify all the reverse strands.

I understand that a nucleotide alignment will always be reported in the +1 frame, but I was hoping to get information directly on the amino acid reading frame, assuming all nucleotide results in my file are protein-coding.

Thanks very much,
Zach Gayk
