I have a GFF file containing MCF-7 cell transcript data. I also have the corresponding FASTQ file, in case that is helpful.
chr1 PacBio transcript 27567 29338 . - . gene_id "PB2015.1"; transcript_id "PB2015.1.1";
chr1 PacBio exon 27567 29338 . - . gene_id "PB2015.1"; transcript_id "PB2015.1.1";
These are the first two lines.
You can see that it identifies both transcripts and exons. I want to filter out any transcript that contains fewer than two exons.
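The filtering described above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not a tested solution for the full file: it assumes the file is tab-delimited with nine columns, that the feature type is in column 3, and that every record carries a GTF-style `transcript_id "..."` attribute like the two lines shown above. The function name `filter_multi_exon` and its `min_exons` parameter are made up for illustration.

```python
import re
from collections import Counter

# Matches the GTF-style attribute seen in the file: transcript_id "PB2015.1.1";
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def filter_multi_exon(lines, min_exons=2):
    """Return the records whose transcript has at least `min_exons` exon rows.

    Assumes tab-delimited, nine-column GFF/GTF records (an assumption about
    the input, not something verified against the real file).
    """
    lines = list(lines)
    exon_counts = Counter()

    # First pass: count exon rows per transcript_id.
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(fields) < 9:
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match and fields[2] == "exon":
            exon_counts[match.group(1)] += 1

    # Second pass: keep comment lines, plus every record (transcript and exon
    # rows alike) that belongs to a transcript with enough exons.
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        match = TRANSCRIPT_ID.search(fields[8])
        if match and exon_counts[match.group(1)] >= min_exons:
            kept.append(line)
    return kept
```

Passing an open file handle to the function and writing the returned lines out with `writelines` would give the filtered file. The two-pass design holds all lines in memory, which keeps the sketch short but may not suit a very large file.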
I have tried doing this with the UCSC Table Browser and with Galaxy, but neither worked.
Galaxy Error: An error occurred with this dataset:
Traceback (most recent call last):
  File "/cvmfs/main.galaxyproject.org/galaxy/tools/filters/gff/gff_filter_by_feature_count.py", line 182, in <module>
    __main__()
  File "/cvmfs/main.galaxyproject.org/galaxy/tools/filters/gff/gff_filter_by_feature_c
  File "/cvmfs/main.galaxyproject.org/galaxy/lib/galaxy/datatypes/util/gff_util.py", line 191, in __next__
    self.seed_interval = GenomicIntervalReader.next(self)
RuntimeError: maximum recursion depth exceeded while calling a Python object
Filter 18: MCF7 hg19.gff
Using feature name exon
With following condition >1
The UCSC Table Browser doesn't list exon as a possible filter option when I upload this dataset as a custom track.
I am very new to this. Does anyone have any suggestions for me? I am fairly fluent in R and have a little Python experience. I also have bedtools installed, but I don't know how to use it very well.
Please point me in the right direction!