Question: (Closed) IndexError: list index out of range
flogin • 250 wrote:
Hello guys, I have this file:
Input Data File: GSTE1_EXON.AB.02.fas
Variable (polymorphic) sites: 0 (Total number of mutations: 0)
Input Data File: GSTE1_EXON.AB.02.fas
Total number of mutations, Eta: 0
Theta (per site) from S, Theta-W: 0,0000000000
Variance of theta (no recombination): 0,0000000
Variance of theta (free recombination): 0,0000000
Average number of nucleotide differences, k: 0,000
Theta (per sequence) from S, Theta-W: 0,000
Variance of theta (no recombination): 0,000
Variance of theta (free recombination): 0,000
Input Data File: GSTE1_EXON.AB.02.fas
Number of pairwise comparisons: 0
Number of significant pairwise comparisons by Fisher's exact test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Number of significant pairwise comparisons by chi-square test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Fu and Li's test
Input Data File: GSTE1_EXON.AB.02.fas
Total number of mutations, Eta: 0
Fu and Li's D* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Fu and Li's F* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Input Data File: GSTE1_EXON.AB.02.fas
Total number of mutations, Eta: 0
Input Data File: GSTE1_EXON.AR.01.fas
Variable (polymorphic) sites: 0 (Total number of mutations: 0)
Input Data File: GSTE1_EXON.AR.01.fas
Total number of mutations, Eta: 0
Theta (per site) from S, Theta-W: 0,0000000000
Variance of theta (no recombination): 0,0000000
Variance of theta (free recombination): 0,0000000
Average number of nucleotide differences, k: 0,000
Theta (per sequence) from S, Theta-W: 0,000
Variance of theta (no recombination): 0,000
Variance of theta (free recombination): 0,000
Input Data File: GSTE1_EXON.AR.01.fas
Number of pairwise comparisons: 0
Number of significant pairwise comparisons by Fisher's exact test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Number of significant pairwise comparisons by chi-square test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Fu and Li's
Input Data File: GSTE1_EXON.AR.01.fas
Total number of mutations, Eta: 0
Fu and Li's D* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Fu and Li's F* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Input Data File: GSTE1_EXON.AR.01.fas
Total number of mutations, Eta: 0
Input Data File: GSTE1_EXON.CA.02.fas
Variable (polymorphic) sites: 0 (Total number of mutations: 0)
Input Data File: GSTE1_EXON.CA.02.fas
Total number of mutations, Eta: 0
Theta (per site) from S, Theta-W: 0,0000000000
Variance of theta (no recombination): 0,0000000
Variance of theta (free recombination): 0,0000000
Average number of nucleotide differences, k: 0,000
Theta (per sequence) from S, Theta-W: 0,000
Variance of theta (no recombination): 0,000
Variance of theta (free recombination): 0,000
Input Data File: GSTE1_EXON.CA.02.fas
Number of pairwise comparisons: 0
Number of significant pairwise comparisons by Fisher's exact test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Number of significant pairwise comparisons by chi-square test: 0
Number of significant comparisons using the Bonferroni procedure: 0
Input Data File: GSTE1_EXON.CA.02.fas
Total number of mutations, Eta: 0
Fu and Li's D* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Fu and Li's F* test statistic: 0,00000
Statistical significance: Not significant, P > 0.10
Input Data File: GSTE1_EXON.CA.02.fas
Total number of mutations, Eta: 0
And I created this script:
# -*- coding: utf-8 -*-
import pandas as pd # to convert the list of dictionaries to a data frame
file = open("GSTE1.txt.add","r")
# creating a list of terms that I want
keys_order = ["Input Data File","Average number of nucleotide differences, k","Total number of mutations, Eta","Theta (per sequence) from S, Theta-W","Theta (per site) from S, Theta-W","Variance of theta (no recombination)","Variance of theta (free recombination)","Theta (per sequence) from S, Theta-W","Fu and Li's D* test statistic, FLD*","Fu and Li's F* test statistic, FLF*","Fu and Li's D* test statistic","Fu and Li's F* test statistic","Number of pairwise comparisons","Number of significant pairwise comparisons by Fisher's exact test","Number of significant pairwise comparisons by chi-square test","Number of significant comparisons using the Bonferroni procedure"]
dic = {}  # dictionary used to build the first list
dictio = {}  # dictionary used to build the second list
big_list = []  # list of dictionaries from the first pass
list_dictio = []  # list of dictionaries from the second pass
aux = ""
for line in file:
    if line.strip().split(":")[0] == "Input Data File":
        atrib = line.strip().split(":")[1]
        if atrib == aux:
            dic[line.strip().split(":")[0]] = line.strip().split(":")[1].strip()
        else:
            big_list.append(dic)
            aux = atrib
            dic = {}
for fasta in big_list:
    for i in keys_order:
        if i in fasta:
            dictio[i] = fasta[i]
        else:
            dictio[i] = '-'
    list_dictio.append(dictio)
    dictio = dictio.fromkeys(dictio, 0)
table = pd.DataFrame.from_records(list_dictio)
export_csv = table.to_csv(r'/home/user/Dropbox/jupyter/output.csv', index = None, header=True)
The output:
Traceback (most recent call last):
  File "DNAsp2.py", line 16, in <module>
    dic[line.strip().split(":")[0]] = line.strip().split(":")[1].strip()
IndexError: list index out of range
What doesn't make sense to me is that I ran this script on other files and everything ran fine.
Can anyone help me?
Just debug it.
Print the line and check how many ":" characters it contains.
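One way to do that check without reading all the output by eye is to list only the lines that contain no ":" at all. This is just a minimal sketch, not part of the original script: the helper name lines_without_colon is made up for illustration, and the sample lines are copied from the data posted in the question.

```python
def lines_without_colon(lines):
    """Return (line_number, text) pairs for non-empty lines that contain no ':'."""
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if line.strip() and ":" not in line
    ]


# Three lines taken from the posted data; the second one has no ':'.
sample = [
    "Input Data File: GSTE1_EXON.AB.02.fas",
    "Fu and Li's test",
    "Total number of mutations, Eta: 0",
]

for number, text in lines_without_colon(sample):
    print(number, repr(text))
```

To try it on the real input, an open file object can be passed instead of sample, since iterating over a file also yields its lines one by one.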
Many lines have more than one ":". The problem is that I ran that script on eight other files with the same structure, and everything ran OK.
Yes, so that means that in this one file the structure is slightly different. To find out where the difference is, print the line just before the statement that fails. The error is caused by this line:
dic[line.strip().split(":")[0]] = line.strip().split(":")[1].strip()
The index is out of range because the number of pieces returned by split(":") depends on how many ":" characters the line contains; a line with no ":" gives a single piece, so index 1 does not exist. So add
print(line)
just before that line. If you run the script you will see a lot of output; these are the lines of the input file. When you get the error, check the last printed line. You can also compare it with the second-to-last printed line.
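Once the odd line is found, the parsing step can be made tolerant of it. The sketch below assumes that a line without ":" is only a heading and can safely be skipped, which may or may not be what you want for your data. parse_line is a hypothetical helper, not something from the original script. It uses str.partition, which splits on the first ":" only, so values that themselves contain ":" stay in one piece.

```python
def parse_line(line):
    """Split 'key: value' on the first ':'; return None when the line has no ':'."""
    key, separator, value = line.strip().partition(":")
    if not separator:
        # No ':' found: treat the line as a heading and let the caller skip it.
        return None
    return key.strip(), value.strip()


# Lines copied from the posted data: one with two ':' and one with none.
examples = [
    "Variable (polymorphic) sites: 0 (Total number of mutations: 0)",
    "Total number of mutations, Eta: 0",
    "Fu and Li's test",
]

for text in examples:
    print(repr(text), "->", parse_line(text))
```

In the loop over the file, a result of None would then mean "skip this line" instead of raising IndexError.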
The output is:
Hello flogin!
We believe that this post does not fit the main topic of this site.
gb help and the script runs well
For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.
If you disagree please tell us why in a reply below, we'll be happy to talk about it.
Cheers!