4.3 years ago
d.w.karjosukarso
Hi, I would like to do CNV analysis on bulk RNA-seq data from cancer cell lines. I made the correlation file and then used the following command:
/home/karjosukarso/anaconda3/envs/CNV/bin/cnvkit.py import-rna -f counts -g /home/karjosukarso/CNVkit/ensembl-gene-info.hg38.tsv -c /home/karjosukarso/DK-53/CNV/tcga-hnsc.cnv-expr-corr.tsv -o output.txt *.txt
But I got the following error:
Dropping 369445 / 369445 rarely expressed genes from input samples
Loading gene metadata and TCGA gene expression/CNV profiles
Loaded /home/karjosukarso/CNVkit/ensembl-gene-info.hg38.tsv with shape: (221323, 9)
Loaded /home/karjosukarso/DK-53/CNV/tcga-hnsc.cnv-expr-corr.tsv with shape: (20309, 4)
Resetting 2844 ambiguous genes' correlation coefficients to default 0.100000
Trimmed gene info table to shape: (63966, 13)
Aligning gene info to sample gene counts
Weighting genes with below-average read counts
/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/rna.py:269: FutureWarning: clip_upper(threshold) is deprecated, use clip(upper=threshold) instead
weights = [np.sqrt((gene_counts / gene_counts.quantile(.75)).clip_upper(1))]
Calculating normalized gene read depths
Traceback (most recent call last):
File "/home/karjosukarso/anaconda3/envs/CNV/bin/cnvkit.py", line 13, in <module>
args.func(args)
File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/commands.py", line 1462, in _cmd_import_rna
args.normal, args.do_gc, args.do_txlen, args.max_log2)
File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/import_rna.py", line 39, in do_import_rna
gene_info, sample_counts, tx_lengths, normal_ids)
File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/rna.py", line 274, in align_gene_info_to_samples
normal_ids)
File "/home/karjosukarso/anaconda3/envs/CNV/lib/python2.7/site-packages/cnvlib/rna.py", line 310, in normalize_read_depths
assert sample_depths.values.sum() > 0
AssertionError
Could anyone help me figure out what could have gone wrong? Thank you.
Greetings, Dyah
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Is the software dropping all genes from input samples?
I am having a similar issue to the OP, and indeed it seems that it is dropping all genes from the input samples. Since you asked whether this was the case, do you have suggestions for how to get the input files read correctly?
My input text files are simple two-column, tab-separated ("\t") files, with lines such as "ENSG00000000003 64".
Thank you!
EDIT: I converted my input to the RSEM format and now it works. It seems there are still issues with simple gene-count input.
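For anyone hitting the same thing with plain count files, a quick sanity check of the input before running import-rna might help narrow it down. Below is a minimal sketch in Python (standard library only). The function name `summarize_counts` and the file name `sample1.txt` are made up for illustration, and it assumes only the two-column, tab-separated layout described above. It does not reproduce CNVkit's own parsing or filtering.

```python
import csv
from collections import Counter


def summarize_counts(path):
    """Summarize a two-column, tab-separated count file (ID <tab> count).

    Hypothetical helper for illustration; not part of CNVkit.
    """
    rows = 0           # lines that parsed as (id, numeric count)
    nonzero = 0        # parsed lines with a count above zero
    unparsed = 0       # header lines, wrong column count, non-numeric counts
    versioned_ids = 0  # IDs carrying a version suffix such as ".12"
    prefixes = Counter()

    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            if len(row) != 2:
                unparsed += 1
                continue
            feature_id, value = row
            try:
                count = float(value)
            except ValueError:
                unparsed += 1
                continue
            rows += 1
            if count > 0:
                nonzero += 1
            if "." in feature_id:
                versioned_ids += 1
            # First four characters distinguish e.g. ENSG (gene) from ENST (transcript)
            prefixes[feature_id[:4]] += 1

    return {
        "rows": rows,
        "nonzero": nonzero,
        "unparsed": unparsed,
        "versioned_ids": versioned_ids,
        "id_prefixes": dict(prefixes),
    }
```

Calling `summarize_counts("sample1.txt")` on each input file shows at a glance whether the counts are all zero, whether a header or space-separated lines failed to parse, and whether the IDs look like genes (ENSG) or transcripts (ENST), with or without version suffixes.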
Hi, did you find an answer to your problem? I tried cnvkit import-rna and ran into a similar problem with a kallisto two-column file. The example TCGA file runs without problems, but with my own kallisto input file I get the error message below. Can anyone please give me some suggestions? Thanks.
HY
My code:
cnvkit.py import-rna --gene-resource /usr/local/apps/cnvkit/0.9.7.b1/data/ensembl-gene-info.hg38.tsv \
--correlations /usr/local/apps/cnvkit/0.9.7.b1/data/tcga-skcm.cnv-expr-corr.tsv \
--output out-summary.tsv --output-dir /data/$USER/out/ *.txt
Error message:
Dropping 76138 / 190522 rarely expressed genes from input samples
Loading gene metadata and TCGA gene expression/CNV profiles
Loaded /usr/local/apps/cnvkit/0.9.7.b1/data/ensembl-gene-info.hg38.tsv with shape: (221323, 9)
Loaded /usr/local/apps/cnvkit/0.9.7.b1/data/tcga-skcm.cnv-expr-corr.tsv with shape: (19177, 4)
Resetting 2846 ambiguous genes' correlation coefficients to default 0.100000
Trimmed gene info table to shape: (63966, 13)
Aligning gene info to sample gene counts
Weighting genes with below-average read counts
/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/rna.py:267: FutureWarning: clip_upper(threshold) is deprecated, use clip(upper=threshold) instead
weights = [np.sqrt((gene_counts / gene_counts.quantile(.75)).clip_upper(1))]
Calculating normalized gene read depths
Traceback (most recent call last):
File "/usr/local/apps/cnvkit/0.9.7.b1/bin/cnvkit.py", line 9, in <module>
args.func(args)
File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/commands.py", line 1535, in _cmd_import_rna
args.normal, args.do_gc, args.do_txlen, args.max_log2)
File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/import_rna.py", line 39, in do_import_rna
gene_info, sample_counts, tx_lengths, normal_ids)
File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/rna.py", line 272, in align_gene_info_to_samples
normal_ids)
File "/usr/local/Anaconda/envs_app/cnvkit/0.9.7.b1/lib/python3.6/site-packages/cnvlib/rna.py", line 308, in normalize_read_depths
assert sample_depths.values.sum() > 0
AssertionError
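One thing that may be worth ruling out here: 190522 input rows is closer to a transcript count than to the 63966 genes in the trimmed gene info table, and kallisto quantifies transcripts. If the files contain transcript IDs, none of them would match the gene IDs, which could leave the depths summing to zero. This is a guess from the log, not a confirmed diagnosis. A minimal Python sketch of summing transcript-level values to gene level is below. The function name `sum_to_gene_level` and the `tx2gene` mapping are hypothetical, and the mapping would have to come from your own annotation.

```python
from collections import defaultdict


def sum_to_gene_level(transcript_rows, tx2gene):
    """Sum per-transcript counts into per-gene counts.

    transcript_rows: iterable of (transcript_id, count) pairs.
    tx2gene: dict mapping unversioned transcript IDs to gene IDs
             (assumed to be built from your own annotation).
    Returns (gene_counts, n_unmapped).
    """
    gene_counts = defaultdict(float)
    n_unmapped = 0
    for transcript_id, count in transcript_rows:
        # Strip a version suffix such as ".9" before the lookup
        base_id = transcript_id.split(".")[0]
        gene_id = tx2gene.get(base_id)
        if gene_id is None:
            n_unmapped += 1
            continue
        gene_counts[gene_id] += count
    return dict(gene_counts), n_unmapped
```

A large `n_unmapped` relative to the number of rows would suggest that the mapping and the quantification index come from different annotation releases.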