Error identifying alternative splicing using splAdder
4.8 years ago

Hello. I am trying to identify alternative splicing using splAdder. I have an annotation GTF file and three barley replicates each for the control and treated conditions, i.e. c1, c2, c3 and t1, t2, t3. I have generated sorted BAM and .bai files and put them in a working directory. When I ran splAdder,

python2.7 spladder.py -a genome_annotationfile.gtf -b c1Aligned.sorted,c2Aligned.sorted,c3Aligned.sorted,t1Aligned.sorted,t2Aligned.sorted,t3Aligned.sorted -o splAdder_result


it shows the following output:

/home/aasim/.local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.

from ._conv import register_converters as _register_converters
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000010.1 - information has been inferred from tags
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000020.1 - information has been inferred from tags
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000060.1 - information has been inferred from tags
WARNING: barley_genome_annotationfile.gtf does not have gene level information for transcript HORVU1Hr1G000080.1 - information has been inferred from tags
WARNING: too many warnings for inferred tags

WARNING: a total of 39734 cases had no gene level information annotated - information has been inferred from tags
WARNING: removing 6046 genes from given annotation that overlap to each other:
list of excluded genes written to: barley_genome_annotationfile.gtf.genes_excluded_gene_overlap
WARNING: removing 2 genes from given annotation that share exact exon coordines:
list of excluded exons written to: barley_genome_annotationfile.gtf.genes_excluded_exon_shared
Augmenting splice graphs.
=========================
Generating splice graph ...
...done.

Traceback (most recent call last):
File "spladder.py", line 322, in <module>
genes = gen_graphs(genes, CFG)
File "/home/aasim/diksha/sam_files_cd_rt_/modules/core/gen_graphs.py", line 83, in gen_graphs
introns = get_intron_list(genes, CFG)
File "/home/aasim/diksha/sam_files_cd_rt_/modules/reads.py", line 431, in get_intron_list
(introns, spliced_coverage) = get_all_data(blocks[b], filenames, mapped=False, filter=filter, var_aware=var_aware, primary_only=primary_only, no_mm=no_mm)
File "/home/aasim/diksha/sam_files_cd_rt_/modules/reads.py", line 314, in get_all_data
(coverage_tmp, introns_tmp) = get_reads(fname, contig_name, block.start, block.stop, strand, filter, mapped, spliced, var_aware, collapse, primary_only, no_mm)
return filter['mismatch'] < tags['NM']
KeyError: 'NM'


Where is the problem?


Whenever you see an error like this, a good first step (if you are not very familiar with Python) is to google "Python KeyError"; that should give you some idea of where to look.

PS: There are a couple of warnings as well. GTF files are boring to work with ;)

4.8 years ago
blawney ▴ 10

Note that I have not used this particular software, but at a quick glance, it appears your BAM files may not have the "NM" tag (edit distance to the reference, according to SAM spec) in the optional alignment fields (right-most column).
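One way to check this without touching splAdder is to look at the optional fields of a few alignment records directly. Below is a minimal sketch in plain Python, assuming the records are available as SAM text lines (for example from `samtools view`). The helper name `optional_tags` and the two example records are made up for illustration; they are not taken from your files.

```python
def optional_tags(sam_line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM record as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns of a SAM record are mandatory;
    # everything after them is an optional TAG:TYPE:VALUE field.
    for field in fields[11:]:
        tag, _value_type, value = field.split(":", 2)
        tags[tag] = value
    return tags


# Two made-up records for illustration only: one with NM, one without.
with_nm = "r1\t0\tchr1H\t100\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:1\tNM:i:0"
without_nm = "r2\t0\tchr1H\t200\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNH:i:1"

print("NM" in optional_tags(with_nm))
print("NM" in optional_tags(without_nm))
```

If real records from your BAM files look like the second example, the aligner did not write the NM tag, which would be consistent with the traceback.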

The KeyError exception is raised when you try to access a missing key in a Python dictionary, so it seems likely that the parser (pysam?) is not finding that tag in the alignments.
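To make that concrete, here is a small sketch that reproduces the same kind of failure with an ordinary dictionary. The function `exceeds_mismatch_limit` is a hypothetical stand-in modelled on the last line of the traceback; it is not splAdder's actual code, and the tag values are invented.

```python
def exceeds_mismatch_limit(tags, max_mismatches):
    """Hypothetical stand-in for the failing comparison in the traceback."""
    # Indexing with tags["NM"] raises KeyError when the key is absent.
    return max_mismatches < tags["NM"]


tags_with_nm = {"NH": 1, "NM": 3}   # invented values
tags_without_nm = {"NH": 1}         # no NM entry, like a read lacking the tag

print(exceeds_mismatch_limit(tags_with_nm, 2))

try:
    exceeds_mismatch_limit(tags_without_nm, 2)
except KeyError as err:
    print("KeyError:", err)

# A membership test shows whether the key exists without raising.
print("NM" in tags_without_nm)
```

The second call fails in the same way as the run above, because the dictionary built from the read has no "NM" entry.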


I believe you are right.