How to subset a UCSC table
6.9 years ago
ddzhangzz ▴ 90

I downloaded the ensGene.txt file from UCSC and want to create a custom GFF annotation for MISO. I can generate the GFF annotation from the entire ensGene.txt file, but I am only interested in a small subset, so I created one with grep like this:

$ grep ENSG00000141510 ensGene.txt > ensGens_sub.txt

But I got an error when I reran the rnaseqlib program:

Making GFF alternative events annotation...
  - UCSC tables read from: /data/projects/JingChen-TCGA/Alternative_splicing/Data/UCSC/local
  - Output dir: /data/projects/JingChen-TCGA/Alternative_splicing/Data/miso_annotation/tp53
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/rnaseqlib/gff/gff_make_annotation.py", line 65, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/rnaseqlib/gff/gff_make_annotation.py", line 61, in main
    make_annotation(args)
  File "/usr/local/lib/python2.7/dist-packages/rnaseqlib/gff/gff_make_annotation.py", line 29, in make_annotation
    raise Exception, "No UCSC tables found in %s." %(tables_dir)
Exception: No UCSC tables found in /data/projects/JingChen-TCGA/Alternative_splicing/Data/UCSC/local.

So it reports that no UCSC tables were found in that directory, even though my subset file is there.

Does anybody know why this happens, and how to correctly subset a UCSC table such as ensGene.txt?

Assembly • 1.8k views

Did you make sure that the table you are using as input for your program is present in /data/projects/JingChen-TCGA/Alternative_splicing/Data/UCSC/local?

Run the following and check:

ls /data/projects/JingChen-TCGA/Alternative_splicing/Data/UCSC/local
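
The same check can be scripted. This is only a minimal sketch: the helper `missing_tables` and the `EXPECTED_TABLES` tuple are made up for illustration, and the idea that the tool looks for a file named exactly `ensGene.txt` is an assumption, not something taken from the rnaseqlib source.

```python
from pathlib import Path

# Assumption: the annotation script finds its tables by fixed file names.
# Only ensGene.txt is mentioned in this thread; extend the tuple if needed.
EXPECTED_TABLES = ("ensGene.txt",)


def missing_tables(tables_dir, expected=EXPECTED_TABLES):
    """Return the expected file names that are absent from tables_dir."""
    present = {p.name for p in Path(tables_dir).iterdir() if p.is_file()}
    return [name for name in expected if name not in present]
```

An empty list means every expected name is present; anything else is a list of the names that would have to be added or renamed.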
$ ll /data/projects/JingChen-TCGA/Alternative_splicing/Data/UCSC/local
total 8
-rw-r--r-- 1 3849 Jun  8 14:20 TP53-ensGene.txt
-rw-r--r-- 1 4015 Jun  8 11:21 TP53-Homo_sapiens_Transcript_Summary.txt
6.9 years ago
venu 7.1k

Rename TP53-ensGene.txt to ensGene.txt and try again. As I understand from the MISO link you provided, the program seems to look for a file named ensGene.txt by default.
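
If you want to script the subsetting so the output already has the expected name, something like the following might work. It is a rough sketch under a few assumptions: `subset_table` is a hypothetical helper, the table is assumed to be tab-separated with the gene ID (`name2`) in the 13th column as in UCSC's genePred-extended layout with a leading `bin` column, and the default output name `ensGene.txt` is only what the tool appears to expect. Matching on the column instead of grepping the whole line avoids picking up rows where the ID merely occurs as a substring.

```python
from pathlib import Path

# Assumption: gene ID (name2) is the 13th tab-separated field (index 12).
GENE_ID_COLUMN = 12


def subset_table(src, out_dir, gene_id, table_name="ensGene.txt",
                 column=GENE_ID_COLUMN):
    """Copy rows whose gene ID equals gene_id into out_dir/table_name.

    Returns the output path and the number of rows kept.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / table_name

    # Writing to the source file would truncate it before it is read.
    if Path(src).resolve() == out_path.resolve():
        raise ValueError("output would overwrite the source table")

    kept = 0
    with open(src) as fin, open(out_path, "w") as fout:
        for line in fin:
            fields = line.rstrip("\n").split("\t")
            if len(fields) > column and fields[column] == gene_id:
                fout.write(line)
                kept += 1
    return out_path, kept
```

For example, `subset_table("ensGene.txt", "local", "ENSG00000141510")` would write the matching rows to `local/ensGene.txt`, which can then be passed as the tables directory. Keeping the subset in its own directory means the full table is never overwritten.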


Thanks! It works. I never thought it was caused by the file name... :(


Great. I'm moving my comment to an answer. Please feel free to mark it as accepted so the question is closed and future users find the correct answer instead of searching through the comments.
