Hi all,
I have stranded (fr-firststrand) paired-end RNA-seq data. I mapped it with both tophat and STAR. Since STAR maps more of the data (93%) than tophat (86%), I want to use the STAR-generated BAM file with cufflinks to find novel and antisense genes.
Commands used:
STAR (options for generating the XS:A field were set):
STAR --genomeDir Mapping_tools/STAR/v9.0/ --readFilesIn R1.fq.gz R2.fq.gz --readFilesCommand zcat --sjdbGTFfile Genome/v9.0/pf9.0.gtf --sjdbGTFfeatureExon exon --sjdbGTFtagExonParentTranscript gene_id --outFilterMismatchNmax 5 --runThreadN 8 --outSAMattributes All --outSAMstrandField intronMotif --outFileNamePrefix Ring1_STAR --outSAMmode Full --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts TranscriptomeSAM --outSAMunmapped Within
tophat -r 273 -p 16 --no-discordant --no-mixed --library-type fr-firststrand -G Genome/v9.0/pf9.0.gtf -o Ring1_tophat Genome/index/Plasmodium/Pf3d7_v9_bow R1.fq.gz R2.fq.gz
cufflinks -o tophat_cufflinks -g Genome/v9.0/pf9.0.gtf --library-type fr-firststrand -p 16 accepted_hits.bam
cufflinks -o STAR_cufflinks -g Genome/v9.0/pf9.0.gtf --library-type fr-firststrand -p 16 Ring1_STARAligned.sortedByCoord.out.bam
When I use the STAR BAM file, cufflinks cannot detect the paired-end reads and raises this warning:
Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges. It is recommended that correct parameters (--frag-len-mean and --frag-len-std-dev) be provided.
The resulting genes.fpkm_tracking file also contains fewer genes.
When I use the tophat BAM file, cufflinks raises no such warning, deduces the fragment length from the data, and the output genes.fpkm_tracking file lists all my genes.
Has anyone faced this issue? What can I do to let cufflinks detect the paired-end data?
Thanks,
Aarthi
Could you paste an alignment record from both BAM files so we can see exactly how they differ? Ideally a pair that maps exactly the same in STAR and tophat, but if that's too much of a pain to wrangle out of the data, just one record from each should be enough to cross off a bunch of potential reasons.
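If it helps, here is a rough sketch of one way to pull the first few records from each file and see what the pairing bits in the FLAG column say. It assumes samtools is installed and that the two BAM files sit in the current directory under the names used in the cufflinks commands above (adjust the paths if not). decode_flag is just a made-up helper name; the bit values are the ones defined in the SAM specification.

```shell
#!/bin/sh
# decode_flag: hypothetical helper that prints the pairing-related bits
# set in a SAM FLAG value (column 2 of an alignment record).
decode_flag() {
    flag=$1
    out=""
    [ $((flag & 1)) -ne 0 ]   && out="$out paired"          # 0x1
    [ $((flag & 2)) -ne 0 ]   && out="$out proper_pair"     # 0x2
    [ $((flag & 8)) -ne 0 ]   && out="$out mate_unmapped"   # 0x8
    [ $((flag & 64)) -ne 0 ]  && out="$out first_in_pair"   # 0x40
    [ $((flag & 128)) -ne 0 ] && out="$out second_in_pair"  # 0x80
    echo "${out# }"
}

# File names are taken from the post; they are assumptions about your layout.
for bam in accepted_hits.bam Ring1_STARAligned.sortedByCoord.out.bam; do
    # Skip quietly if samtools or the file is not available here.
    command -v samtools >/dev/null 2>&1 && [ -f "$bam" ] || continue
    echo "== $bam"
    # Print read name, FLAG, and the decoded pairing bits for 4 records.
    samtools view "$bam" | head -n 4 | cut -f 1,2 | while read -r name flag; do
        printf '%s\t%s\t%s\n' "$name" "$flag" "$(decode_flag "$flag")"
    done
done
```

If the STAR records come out without "paired" or "proper_pair" while the tophat ones have them, that would narrow things down a lot. Pasting the full records (including the tags at the end of each line) would still be the most useful.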