retrieve amplicon sequences from fna file
16 months ago
Lila M ★ 1.2k

Hi all,

I would like to get the amplicon sequence from an indexed fna file. I have the start/end coordinates for the amplicons and also the primer sequences. I've been reading about samtools faidx, but it is not very clear to me how to do it. Does anyone here have experience with this particular issue?

Thank you!

amplicon fna primers • 1.4k views

samtools faidx simply fetches sequences using coordinates. It can't find sequences in a file that match a certain sequence. You will need to first align the data and get the coordinates you need before fetching them.

If I am mistaken then perhaps provide some additional detail about what you have.

16 months ago

You say you have the coordinates, so you need to run samtools faidx <file> <CONTIG_NAME>:<start>-<end>, replacing each < > block with the information you have. The coordinates are 1-based and inclusive.
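As a rough sketch of the command above in script form: the contig name and coordinates here are made-up placeholders, `make_region` is a hypothetical helper, and `hg38.fna` is assumed to sit in the current directory with its `.fai` index next to it. The samtools call is skipped when samtools or the file is missing.

```shell
#!/bin/sh
# Placeholder values: replace with your own reference, contig and coordinates.
ref="hg38.fna"
contig="chr1"
start=1000
end=1250

# Hypothetical helper: build a samtools region string (1-based, inclusive).
make_region() {
  printf '%s:%d-%d\n' "$1" "$2" "$3"
}

region=$(make_region "$contig" "$start" "$end")

if command -v samtools >/dev/null 2>&1 && [ -f "$ref" ]; then
  # Writes a single FASTA record named after the region.
  samtools faidx "$ref" "$region" > amplicon.fa
else
  echo "would run: samtools faidx $ref $region"
fi
```

If the coordinates come from a BED file, the start is 0-based there, so add 1 to it before building the region string.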

Next time you want to ask a question try to identify what it is that you don't understand. This will help you find better results and it will help potential answerers give you precise solutions.

Two extra tips:

  • Find code examples on GitHub: with advanced search you can filter for shell scripts that contain your command of interest. See https://github.com/search?l=Shell&q=%22samtools+faidx%22&type=Code for examples relevant to your inquiry.
  • I'm not sure how you obtained the coordinates for your amplicon, so I may be wrong to suggest additional software; check the program you are already using first, since most software that can find the locations can also extract the amplicon. Otherwise, consider using something like seqkit amplicon.

Thank you so much for your feedback.

My main issue is that I am not sure whether this tool is the right one for my problem, or whether there is a better approach.

As I said, I have the human fna and fna.fai files. There is a variant in there whose location I know, and I have the primers and the start/end coordinates for the specific amplicon. What I need to do is extract the FASTA sequence for that amplicon using the information I have.

I'm not sure if this is more clear now (I hope!)

Thank you again


I have the human fna and fna.fai files. There is a variant in there whose location I know, and I have the primers and the start/end coordinates for the specific amplicon. What I need to do is extract the FASTA sequence for that amplicon using the information I have.

If you only have a FASTA file, then use the seqkit amplicon (LINK) tool recommended by @Juan above. Once you confirm that it does what you need, I can move @Juan's comment to an answer.

samtools faidx is not the correct tool.


Hello, well samtools seems to do the job!

samtools faidx hg38.fna chrN:amplicon_start-amplicon_end

Why do you think it is not correct?

I am not able to use seqkit as it is not installed in the environment. Thank you for your help :)


If you know the exact coordinates of the amplicon, then samtools faidx absolutely will work.

Since you mentioned primers, we thought that the exact location was in doubt. In that case, seqkit is the right tool.
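For the case where only the primers are known, a minimal sketch with seqkit amplicon might look like the following. The primer sequences are invented placeholders, `amplicon_cmd` is a hypothetical dry-run helper, and the allowed mismatch count of 1 is an arbitrary choice; `-F`, `-R`, `-m` and `--bed` are seqkit amplicon options.

```shell
#!/bin/sh
# Placeholder values: replace with your own reference and primers (5'->3').
ref="hg38.fna"
fwd="ACGTACGTACGTACGTAC"
rev="TGCATGCATGCATGCATG"
mismatches=1

# Hypothetical helper: print the command so it can be reviewed before running.
amplicon_cmd() {
  printf 'seqkit amplicon -F %s -R %s -m %s %s\n' "$1" "$2" "$3" "$4"
}

amplicon_cmd "$fwd" "$rev" "$mismatches" "$ref"

if command -v seqkit >/dev/null 2>&1 && [ -f "$ref" ]; then
  # Amplicon sequence(s) as FASTA.
  seqkit amplicon -F "$fwd" -R "$rev" -m "$mismatches" "$ref" > amplicon.fa
  # Same search, reported as BED so the coordinates can be checked.
  seqkit amplicon -F "$fwd" -R "$rev" -m "$mismatches" --bed "$ref" > amplicon.bed
fi
```

The BED output gives a way to compare the primer-derived location against coordinates you already have before trusting either.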


This is great, thank you so much for the clarification!


I am not able to use seqkit as it is not installed in the environment ...

Then just install it via conda install -c bioconda seqkit, or download the binary from the GitHub release page.
