Question

import-rna assertion error

4.3 years ago

d.w.karjosukarso • 0

Hi, I would like to do CNV analysis on bulk RNA-seq data from cancer cell lines. After preparing the correlation file, I ran the following command:

/home/karjosukarso/anaconda3/envs/CNV/bin/cnvkit.py import-rna -f counts -g /home/karjosukarso/CNVkit/ensembl-gene-info.hg38.tsv -c /home/karjosukarso/DK-53/CNV/tcga-hnsc.cnv-expr-corr.tsv -o output.txt *.txt

But, I got the following error:

Dropping 369445 / 369445 rarely expressed genes from input samples
Loading gene metadata and TCGA gene expression/CNV profiles
Loaded /home/karjosukarso/CNVkit/ensembl-gene-info.hg38.tsv with shape: (221323, 9)
Loaded /home/karjosukarso/DK-53/CNV/tcga-hnsc.cnv-expr-corr.tsv with shape: (20309, 4)
Resetting 2844 ambiguous genes' correlation coefficients to default 0.100000
Trimmed gene info table to shape: (63966, 13)
Aligning gene info to sample gene counts
Weighting genes with below-average read counts
/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/rna.py:269: FutureWarning: clip_upper(threshold) is deprecated, use clip(upper=threshold) instead
  weights = [np.sqrt((gene_counts / gene_counts.quantile(.75)).clip_upper(1))]
Calculating normalized gene read depths
Traceback (most recent call last):
  File "/home/karjosukarso/anaconda3/envs/CNV/bin/cnvkit.py", line 13, in <module>
    args.func(args)
  File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/commands.py", line 1462, in _cmd_import_rna
    args.normal, args.do_gc, args.do_txlen, args.max_log2)
  File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/import_rna.py", line 39, in do_import_rna
    gene_info, sample_counts, tx_lengths, normal_ids)
  File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/rna.py", line 274, in align_gene_info_to_samples
    normal_ids)
  File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/rna.py", line 310, in normalize_read_depths
    assert sample_depths.values.sum() > 0
AssertionError

Could anyone help me figure out what went wrong? Thank you.

Greetings, Dyah

rna-seq • 1.1k views

ADD COMMENT • link updated 4.3 years ago by GenoMax 141k • written 4.3 years ago by d.w.karjosukarso • 0

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

ADD REPLY • link 4.3 years ago by GenoMax 141k

Dropping 369445 / 369445 rarely expressed genes from input samples

Is the software dropping all genes from input samples?

ADD REPLY • link 4.3 years ago by GenoMax 141k
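
The link between the "Dropping 369445 / 369445" line and the final AssertionError in the original post can be illustrated with a small sketch. The median-based filter and its threshold below are assumptions made for illustration and are not confirmed anywhere in this thread; the function names and toy counts are hypothetical. The only point is that once every gene is dropped, the remaining table sums to 0, which is exactly the condition that `assert sample_depths.values.sum() > 0` rejects.

```python
from statistics import median


def drop_rarely_expressed(counts, min_median=1.0):
    """Keep genes whose median count across samples reaches min_median.

    The median rule and the threshold are assumptions for illustration only;
    the real filter in CNVkit may differ.
    """
    return {gene: values for gene, values in counts.items()
            if median(values) >= min_median}


def total_depth(counts):
    """Sum every count in the table; an empty table sums to 0."""
    return sum(sum(values) for values in counts.values())


# Toy table: gene ID -> counts in three samples (made-up numbers).
counts = {
    "ENSG00000000003": [0, 0, 64],
    "ENSG00000000005": [0, 0, 0],
}

kept = drop_rarely_expressed(counts)
print(f"Dropping {len(counts) - len(kept)} / {len(counts)} genes")
print("Total depth after filtering:", total_depth(kept))
```

In this toy case both genes are dropped and the total depth is 0, so a check equivalent to the assertion in the traceback would fail.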

I am having a similar issue to the OP, and it does indeed seem to be dropping all genes from the input samples. Since you asked whether this was the case, do you have any suggestions for how to get the input files read correctly?

My input text files are simple two-column, tab-separated ("\t") files with lines such as "ENSG00000000003 64"

Thank you!

EDIT: I converted my files to the RSEM format and now it works. It seems there are still issues with simple gene-count input.

ADD REPLY • link 4.2 years ago by terkild • 0
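
Before running import-rna on plain two-column count files like the ones described above, it can help to check what the files actually contain. The sketch below is not part of CNVkit; the function names are hypothetical and it uses only the Python standard library. It reports, per file, how many IDs were read, the total count, how many rows could not be parsed (for example a header line), and how many IDs carry a version suffix such as ".14", then returns the IDs shared by all files. Whether any of these explains the error in this thread is an open question; the script only makes the input easier to inspect.

```python
import csv


def summarize_counts(path):
    """Return (gene_ids, total_count, bad_rows) for a two-column TSV file."""
    gene_ids, total, bad_rows = set(), 0.0, 0
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            if len(row) != 2:
                bad_rows += 1
                continue
            try:
                total += float(row[1])
            except ValueError:
                # Non-numeric second column, e.g. a header line
                bad_rows += 1
                continue
            gene_ids.add(row[0])
    return gene_ids, total, bad_rows


def check_files(paths):
    """Print a per-file summary and return the IDs shared by all files."""
    shared = None
    for path in paths:
        ids, total, bad = summarize_counts(path)
        versioned = sum(1 for gene in ids if "." in gene)
        print(f"{path}: {len(ids)} IDs, total count {total:g}, "
              f"{bad} unparsable rows, {versioned} IDs with a version suffix")
        shared = ids if shared is None else shared & ids
    return shared or set()
```

A call such as `check_files(["sample1.txt", "sample2.txt"])` (hypothetical file names) prints one summary line per file. A total count of 0, many unparsable rows, or an empty set of shared IDs would each be worth resolving before handing the files to import-rna.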

Hi, did you find an answer to your problem? I have tried cnvkit import-rna and hit a similar problem with my kallisto 2-column files. The example TCGA file runs without any problem, but with my own kallisto input files I get the error message below. Can anyone please give me some suggestions? Thanks.

HY

My code:

cnvkit.py import-rna --gene-resource /usr/local/apps/cnvkit/0.9.7.b1/data/ensembl-gene-info.hg38.tsv \
    --correlations /usr/local/apps/cnvkit/0.9.7.b1/data/tcga-skcm.cnv-expr-corr.tsv \
    --output out-summary.tsv --output-dir /data/$USER/out/ *.txt

Error message:

Dropping 76138 / 190522 rarely expressed genes from input samples
Loading gene metadata and TCGA gene expression/CNV profiles
Loaded /usr/local/apps/cnvkit/0.9.7.b1/data/ensembl-gene-info.hg38.tsv with shape: (221323, 9)
Loaded /usr/local/apps/cnvkit/0.9.7.b1/data/tcga-skcm.cnv-expr-corr.tsv with shape: (19177, 4)
Resetting 2846 ambiguous genes' correlation coefficients to default 0.100000
Trimmed gene info table to shape: (63966, 13)
Aligning gene info to sample gene counts
Weighting genes with below-average read counts
/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/rna.py:267: FutureWarning: clip_upper(threshold) is deprecated, use clip(upper=threshold) instead
  weights = [np.sqrt((gene_counts / gene_counts.quantile(.75)).clip_upper(1))]
Calculating normalized gene read depths
Traceback (most recent call last):
  File "/usr/local/apps/cnvkit/0.9.7.b1/bin/cnvkit.py", line 9, in <module>
    args.func(args)
  File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/commands.py", line 1535, in _cmd_import_rna
    args.normal, args.do_gc, args.do_txlen, args.max_log2)
  File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/import_rna.py", line 39, in do_import_rna
    gene_info, sample_counts, tx_lengths, normal_ids)
  File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/rna.py", line 272, in align_gene_info_to_samples
    normal_ids)
  File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/rna.py", line 308, in normalize_read_depths
    assert sample_depths.values.sum() > 0
AssertionError

ADD REPLY • link 3.7 years ago by lhaiyan3 ▴ 80