When I run Cufflinks with the following command:
cufflinks --GTF /.../B0510_manual_reindexed_v2.gff --min-isoform-fraction 0.5 --pre-mrna-fraction 0.05 --max-intron-length 2000 --small-anchor-fraction 0.06 --min-intron-length 30 --overlap-radius 1 --3-overhang-tolerance 0 --intron-overhang-tolerance 0 --no-faux-reads -p 8 -o /.../cufflinks_out_V3/Apo12B/ /media/cinerea/BGI_RNAseq_V2/.../Apo12B/accepted_hits.bam
Cufflinks just skips a huge part (about 3.4 Mb) of a scaffold at the following step:
You are using Cufflinks v2.1.1, which is the most recent release.
[14:00:50] Loading reference annotation.
[14:00:50] Inspecting reads and determining fragment length distribution.
Processing Locus B0510_5C01:490546-492362 [ ] 0%
I tried raising and lowering --max-bundle-frags, but it makes no difference. In isoforms.fpkm_tracking, the affected transcripts are marked HIDATA. The reads at this locus look fine.
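To list exactly which transcripts were flagged, a small shell sketch along these lines can help. It assumes isoforms.fpkm_tracking is tab-delimited with a header row that contains tracking_id, locus and FPKM_status columns; list_hidata is a hypothetical helper name, not part of Cufflinks.

```shell
# Hypothetical helper (not a Cufflinks tool): print tracking_id, locus and
# status for every transcript whose FPKM_status is HIDATA.
# Assumption: the file is tab-delimited and has a header row.
# Columns are looked up by header name, so their order does not matter.
list_hidata() {
    awk -F'\t' '
        NR == 1 {
            # Map each header name to its column index.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $(col["FPKM_status"]) == "HIDATA" {
            print $(col["tracking_id"]) "\t" $(col["locus"]) "\t" $(col["FPKM_status"])
        }
    ' "$1"
}
```

Called as list_hidata followed by the path to isoforms.fpkm_tracking, it prints one line per HIDATA transcript, which makes it easy to check whether they all fall inside the skipped region.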
What is wrong? Any ideas?
EDIT: I inspected the --verbose logs, and I see that exactly the part being skipped is treated by Cufflinks as one big bundle with 1M reads on it. I lowered --max-bundle-length, but this does not seem to have any effect at all.
EDIT2: Cufflinks filters out the large bundle after the processing step, so there is no output at all for the genes in that locus. Where does Cufflinks get its bundle sizes from? Can I adjust this?