2.7 years ago
wangdp123 ▴ 250

Hi there,

I am using the STAR + Cufflinks combination to process unstranded paired-end RNA-Seq datasets.

1. I wonder if there is a set of typical parameters for both STAR and Cufflinks?

For STAR:

STAR --runThreadN 1 --runMode alignReads --genomeDir index --readFilesIn sample_r1.fq sample_r2.fq --outFileNamePrefix sample_ --outSAMtype BAM SortedByCoordinate --outSAMattributes All --outSAMstrandField intronMotif


For Cufflinks:

cufflinks -p 1 -G sample.gtf -o sample_clout sample_Aligned.sortedByCoord.out.bam


By running the above two command lines, I encountered a warning message:

"Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges. It is recommended that correct parameters (--frag-len-mean and --frag-len-std-dev) be provided."

Does this warning matter? Has anything gone wrong?

2. I noticed from the STAR manual that for unstranded RNA-Seq data, we should pass the parameter "--outSAMstrandField intronMotif" to STAR. I tested the following three scenarios:

(1) without --outSAMattributes and without --outSAMstrandField: Cufflinks gives the error message "BAM record error: found spliced alignment without XS attribute".
(2) --outSAMattributes All --outSAMstrandField intronMotif: no error message, and I can see the XS attribute in the BAM file.
(3) --outSAMattributes Standard --outSAMstrandField intronMotif: no error message, but I can see that there is NO XS attribute in the BAM file.

Does this mean that using either "--outSAMattributes All" or "--outSAMattributes Standard" leads to the same result? Does Cufflinks treat them in the same way?

Tom

RNA-Seq STAR Cufflinks • 3.0k views
2.7 years ago

The warning has nothing to do with the strandedness of the data.

It is about the estimated fragment sizes for the read pairs. The 9th column of each BAM record contains the TLEN field (template length). The --frag-len-mean and --frag-len-std-dev options ask for the mean value of TLEN and its standard deviation. I am not sure why there seems to be insufficient data there.

Either ignore the warning or figure out the mean and stdev for your data from that column.
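
If you want to estimate those two values yourself, here is a rough sketch rather than a definitive recipe. The function name tlen_stats and the 1000 bp cap are my own assumptions, not anything from STAR or Cufflinks. The cap is there because, for spliced RNA-Seq alignments, TLEN spans any intron between the mates and can greatly overstate the real fragment length; pick a cap that suits your library. Only positive TLEN values are used, so each pair is counted once, and the standard deviation is the population form, which makes no practical difference at typical read counts.

```shell
# Hypothetical helper: reads SAM records on stdin and prints the count,
# mean and standard deviation of TLEN (column 9).
# $1 is an assumed upper cap on TLEN (default 1000) used to skip pairs
# whose template length is inflated by an intron.
tlen_stats() {
  max_tlen="${1:-1000}"
  awk -F'\t' -v max="$max_tlen" '
    /^@/ { next }                 # skip SAM header lines
    $9 > 0 && $9 <= max {         # leftmost mate carries the positive TLEN
      n++; sum += $9; sumsq += $9 * $9
    }
    END {
      if (n == 0) { print "no usable pairs"; exit 1 }
      mean = sum / n
      var = sumsq / n - mean * mean
      if (var < 0) var = 0        # guard against rounding below zero
      printf "n=%d mean=%.1f sd=%.1f\n", n, mean, sqrt(var)
    }'
}

# Example use (assumes samtools is installed; -f 2 keeps properly paired reads):
#   samtools view -f 2 sample_Aligned.sortedByCoord.out.bam | tlen_stats 1000
# The printed mean and sd could then be passed to cufflinks through
# --frag-len-mean and --frag-len-std-dev.
```

If the numbers look implausible for your library prep, it is probably safer to ignore the warning than to feed Cufflinks a bad estimate.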