Question: extract list of positions from fasta file biopython
2.8 years ago, s.i.lipworth wrote:

I have a list of positions of interest eg:


I want to extract the base call at these positions from a fasta file using biopython. This is what I have tried:

    query_dic = {}
    with open(line) as pos_file:
        for x in pos_file:
            for seq_record in SeqIO.parse(query_file, "fasta"):
                nuc = seq_record[x]

The error message says 'invalid index' - what is wrong?


  1. Read the positions into a list.
  2. Iterate over the FASTA records:

    from Bio import SeqIO

    for seq_record in SeqIO.parse(query_file, "fasta"):
        for x in positions:
            # get the base at position x (x must be a 0-based integer index)
            base = seq_record.seq[x]
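
The two steps above could be sketched as follows. This is only a minimal sketch under assumptions: the positions file is assumed to hold one integer per line, positions are assumed to be 1-based, and `read_positions`, `bases_at`, and `bases_per_record` are hypothetical helper names. Values read from a text file are strings (with a trailing newline), so they have to be converted with `int()` before they can index a sequence.

```python
def read_positions(lines):
    """Return integer positions from an iterable of text lines, skipping blanks."""
    return [int(line) for line in map(str.strip, lines) if line]


def bases_at(sequence, positions):
    """Map each 1-based position (an assumption) to the base found there."""
    for pos in positions:
        if not 1 <= pos <= len(sequence):
            raise IndexError(f"position {pos} is outside 1..{len(sequence)}")
    return {pos: sequence[pos - 1] for pos in positions}


def bases_per_record(fasta_path, positions):
    """Apply bases_at to every record in a FASTA file, parsing the file once."""
    from Bio import SeqIO  # Biopython; imported lazily so the helpers above work without it

    return {rec.id: bases_at(rec.seq, positions)
            for rec in SeqIO.parse(fasta_path, "fasta")}
```

With a positions file open as `pos_file`, `bases_per_record(query_file, read_positions(pos_file))` would give one dictionary of bases per record ID, and the FASTA file is parsed only once rather than once per position.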
written 2.8 years ago by shenwei356

First, select the right chromosome (the matching FASTA record); then extract the base from that record's sequence.

written 2.8 years ago by Ben

Does your FASTA file have one sequence in it, or many?

If one, you only need to open the FASTA file once, and you should use for that.
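
For the single-sequence case, one option is Biopython's `SeqIO.read`, which returns the one record in a file and raises `ValueError` if the file holds zero or several records. Whether that is the function meant above is an assumption. A minimal sketch, assuming 1-based integer positions; `bases_from_single_fasta` is a hypothetical helper name:

```python
def bases_from_single_fasta(fasta_path, positions):
    """Read the only record in a FASTA file and return the bases at 1-based positions."""
    from Bio import SeqIO  # Biopython, imported lazily

    record = SeqIO.read(fasta_path, "fasta")  # ValueError unless exactly one record
    return [record.seq[pos - 1] for pos in positions]
```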

If many, you need to know which sequence each of the values x refers to. Perhaps SeqIO.index would be useful here for loading the relevant record from a multiple sequence FASTA file?
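
For the multi-sequence case, here is a sketch using `SeqIO.index`, which gives dictionary-like access to records by ID without loading the whole file into memory. It assumes each position comes paired with a sequence ID and is 1-based; the `(seq_id, pos)` input format and the name `bases_from_indexed_fasta` are hypothetical:

```python
def bases_from_indexed_fasta(fasta_path, targets):
    """Return {(seq_id, pos): base} for (seq_id, 1-based pos) pairs."""
    from Bio import SeqIO  # Biopython, imported lazily

    records = SeqIO.index(fasta_path, "fasta")  # dict-like, keyed by record ID
    try:
        return {(seq_id, pos): records[seq_id].seq[pos - 1]
                for seq_id, pos in targets}
    finally:
        records.close()  # release the underlying file handle
```

A record ID that is missing from the file raises `KeyError`, which makes a mismatch between the positions list and the FASTA file easy to spot.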

written 2.6 years ago by Peter