Bug in combine.py for RNA-Seq analysis

3.2 years ago

rvjensen ▴ 10

Strange bug in the combine.py utility for combining kallisto output from multiple samples:

It works well for combining the data from multiple samples, but the target_id values in the first column of the output count.txt file are not in the same order as the target_id values in the input abundance.tsv file.

Any suggestions would be appreciated.

Example:

count.txt

target_id length eff_length X1 X2 X3 BW12616D

chr12:121593689-121594058 369 170 0 0 0 188

chr1:84971674-84972004 330 131 0 0 0 101

chr1:226074938-226075857 919 720 0 0 0 688.182

chr10:55664705-55665110 405 206 0 0 0 148 ...

abundance.tsv

target_id length eff_length est_counts tpm

chr1:65409-65725 316 117 75.5 4.08319

chr1:65731-66073 342 143 104.764 4.6357

chr1:69381-69700 319 120 322.447 17.0026 ...

software error • 483 views

score 1 · Answer 1 · 2021-02-02

3.2 years ago

rvjensen ▴ 10

No bug! It turns out that dictionaries preserve insertion order in Python 3 (guaranteed by the language since Python 3.7), but not in Python 2.7. I was using the old Mac default Python 2.7 (sigh).
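For anyone who is stuck on an older interpreter, here is a minimal sketch of how the row order can be kept independent of the Python version. This is not the actual combine.py code: the function name `combine_counts`, the sample names `sample_a` and `sample_b`, and the values for `sample_b` are all made up for illustration. The only assumption about the input is the five-column abundance.tsv layout shown in the question. The idea is to use `collections.OrderedDict`, which remembers insertion order on both Python 2.7 and Python 3.

```python
from collections import OrderedDict


def combine_counts(samples):
    """Merge est_counts from several abundance.tsv-style tables.

    `samples` is a list of (sample_name, tsv_text) pairs. This is a
    hypothetical helper for illustration, not the real combine.py.
    """
    # OrderedDict remembers insertion order on Python 2.7 and 3.x,
    # so rows come out in the order they were first seen.
    combined = OrderedDict()
    names = [name for name, _ in samples]
    for name, text in samples:
        for line in text.strip().splitlines()[1:]:  # skip the header row
            target_id, length, eff_length, est_counts, _tpm = line.split("\t")
            row = combined.setdefault(
                target_id,
                {"length": length, "eff_length": eff_length, "counts": {}},
            )
            row["counts"][name] = est_counts

    out = ["\t".join(["target_id", "length", "eff_length"] + names)]
    for target_id, row in combined.items():
        # Targets missing from a sample are reported as 0 (an assumption).
        counts = [row["counts"].get(n, "0") for n in names]
        out.append("\t".join([target_id, row["length"], row["eff_length"]] + counts))
    return out


HEADER = "target_id\tlength\teff_length\test_counts\ttpm"

# Rows copied from the abundance.tsv excerpt in the question.
sample_a = "\n".join([
    HEADER,
    "chr1:65409-65725\t316\t117\t75.5\t4.08319",
    "chr1:65731-66073\t342\t143\t104.764\t4.6357",
    "chr1:69381-69700\t319\t120\t322.447\t17.0026",
])

# Made-up second sample, only here to show the merge.
sample_b = "\n".join([
    HEADER,
    "chr1:65409-65725\t316\t117\t0\t0",
    "chr1:65731-66073\t342\t143\t0\t0",
    "chr1:69381-69700\t319\t120\t0\t0",
])

rows = combine_counts([("sample_a", sample_a), ("sample_b", sample_b)])
for r in rows:
    print(r)
```

On Python 3.7 or later a plain `dict` would behave the same way here; `OrderedDict` just makes the ordering explicit and keeps the script safe on Python 2.7.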
