IndexError in Python
17 months ago

Dear all,

When I run a Python script, it reports the following error:

Traceback (most recent call last):
  File "replace_gff_gene_id.py", line 7, in <module>
    ab[lst1[0]] = lst1[1]
IndexError: list index out of range

The full command I ran is:

python replace_gff_gene_id.py Prunella_fulvescens_BGI_coress.seqid Prunella_fulvescens.homolog.gff > Prunella_fulvescens.modify.gff3

Here is the script, replace_gff_gene_id.py:

import sys

ab = {}
with open(sys.argv[1]) as rename_file:
    for row in rename_file:
        lst1 = row.strip().split('\t')
        ab[lst1[0]] = lst1[1]
#print(ab)

with open(sys.argv[2]) as gff_file:
    for row in gff_file:
        for key,value in ab.items():
            row = row.replace(key,value).strip()
        print(row)

Here is an excerpt of input file 1, Prunella_fulvescens_BGI_coress.seqid:

PRUFUL_R14685   Prunella_himalayana_BGI_1
PRUFUL_R05501   Prunella_himalayana_BGI_2
PRUFUL_R10205   Prunella_himalayana_BGI_3
PRUFUL_R07295   Prunella_himalayana_BGI_4
PRUFUL_R07296   Prunella_himalayana_BGI_5
PRUFUL_R10726   Prunella_himalayana_BGI_6
PRUFUL_R13095   Prunella_himalayana_BGI_7
PRUFUL_R13096   Prunella_himalayana_BGI_8
PRUFUL_R14411   Prunella_himalayana_BGI_9
PRUFUL_R07297   Prunella_himalayana_BGI_10

Here is an excerpt of input file 2, Prunella_fulvescens.homolog.gff:

scaffold9610    GeneWise    mRNA    732 962 54.88   -   .   ID=PRUFUL_R00001;Source=ENSTGUP00000017881-D17;Shift=0;MidStop=0;
scaffold9610    GeneWise    CDS 732 962 .   -   0   Parent=PRUFUL_R00001;Source=ENSTGUP00000017881-D17;
scaffold9610    GeneWise    mRNA    2503    2764    74.71   -   .   ID=PRUFUL_R00002;Source=ENSTGUP00000018017-D16;Shift=1;2653-2656;MidStop=0;
scaffold9610    GeneWise    CDS 2503    2652    .   -   0   Parent=PRUFUL_R00002;Source=ENSTGUP00000018017-D16;
scaffold9610    GeneWise    CDS 2657    2764    .   -   0   Parent=PRUFUL_R00002;Source=ENSTGUP00000018017-D16;
scaffold9610    GeneWise    mRNA    2081    2496    63.16   -   .   ID=PRUFUL_R00003;Source=ENSTGUP00000018035-D83;Shift=0;MidStop=0;
scaffold9610    GeneWise    CDS 2081    2195    .   -   1   Parent=PRUFUL_R00003;Source=ENSTGUP00000018035-D83;
scaffold9610    GeneWise    CDS 2246    2496    .   -   0   Parent=PRUFUL_R00003;Source=ENSTGUP00000018035-D83;

I would greatly appreciate any help fixing this issue.

Python • 633 views

This is impossible to answer unless you post at least an example of the input file.


Thanks for the reply! I have updated the post with examples of both input files.


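Double-check that a tab is actually the field separator in the first input file. The index goes out of range because splitting a line on '\t' yields a single field (only index 0), so lst1[1] does not exist. This happens when the columns are separated by spaces instead of a tab, and also on a blank line. Apart from that, the code you posted works as written.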
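
If it helps, here is a minimal sketch of a more tolerant way to read the rename file. It assumes the IDs themselves never contain whitespace, so splitting on any run of whitespace is safe. The helper name load_id_map and the in-memory sample are only for illustration and are not part of the original script.

```python
import io


def load_id_map(lines):
    """Build {old_id: new_id} from two-column lines separated by tabs or spaces."""
    id_map = {}
    for line_no, row in enumerate(lines, start=1):
        # split() with no argument splits on any run of whitespace
        # and drops leading/trailing whitespace, including the newline.
        fields = row.split()
        if not fields:
            continue  # skip blank lines instead of failing on them
        if len(fields) != 2:
            # Report the offending line rather than raising a bare IndexError.
            raise ValueError(
                f"line {line_no}: expected 2 fields, got {len(fields)}: {row!r}"
            )
        id_map[fields[0]] = fields[1]
    return id_map


# Illustrative sample: one space-separated line, one blank line, one tab-separated line.
sample = io.StringIO(
    "PRUFUL_R14685   Prunella_himalayana_BGI_1\n"
    "\n"
    "PRUFUL_R05501\tPrunella_himalayana_BGI_2\n"
)
id_map = load_id_map(sample)
print(id_map)
```

In the posted script, the first loop could then be replaced with ab = load_id_map(rename_file) inside the existing with block. To see what the separator really is, printing repr(row) for the first few lines will show tabs as \t.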
