Question: Need help, please: RNA-seq, STAR + RSEM error
dq18 wrote (4 weeks ago):

I use STAR + RSEM to align and quantify my raw RNA-seq data, but I got the following after running rsem-calculate-expression:


/gpfs/share/home/1701110265/soft/RSEM-1.3.1/rsem-calculate-expression --no-bam-output \
--alignments -p 5 \
-q /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam \
/gpfs/share/home/1701110265/morus/04rsem/rsem \
/gpfs/share/home/1701110265/morus/04rsem/test1

rsem-parse-alignments /gpfs/share/home/1701110265/morus/04rsem/rsem /gpfs/share/home/1701110265/morus/04rsem/test1.temp/test1 /gpfs/share/home/1701110265/morus/04rsem/test1.stat/test1 /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam 1 -tag XM -q
The SAM/BAM file declares more reference sequences (31302) than RSEM knows (29334)!
"rsem-parse-alignments /gpfs/share/home/1701110265/morus/04rsem/rsem /gpfs/share/home/1701110265/morus/04rsem/test1.temp/test1 /gpfs/share/home/1701110265/morus/04rsem/test1.stat/test1 /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam 1 -tag XM -q" failed! Plase check if you provide correct parameters/options for the pipeline!

I cannot find what is wrong with my commands. Here are the details: I am using STAR-2.7.0e and RSEM-1.3.1. The genome I use has 30301 scaffolds, and the annotation is a GFF3 file; I downloaded both from NCBI. The raw data were sequenced on an Illumina platform and are single-end (SE) 75 bp reads.


The STAR commands are:

/gpfs/share/home/1701110265/soft/STAR-2.7.0e/bin/Linux_x86_64/STAR --runThreadN 6 --runMode genomeGenerate \
--genomeDir /gpfs/share/home/1701110265/morus/00ref/star.genome.test02/ \
--genomeFastaFiles /gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.fa \
--genomeChrBinNbits 15 \
--sjdbGTFtagExonParentTranscript Parent \
--sjdbGTFfile /gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.gff \
--sjdbOverhang 74


 /gpfs/share/home/1701110265/soft/STAR-2.7.0e/bin/Linux_x86_64/STAR --runThreadN 5 --genomeDir /gpfs/share/home/1701110265/morus/00ref/star.genome.index \
--readFilesIn /gpfs/share/home/1701110265/morus/02clean_data/test/1h_AGTCAA_L002_R1_001.clean.fastq \
--outFileNamePrefix /gpfs/share/home/1701110265/morus/03align_out/test4/1h_ \
--outSAMtype BAM SortedByCoordinate \
--outBAMsortingThreadN 5 \
--quantMode TranscriptomeSAM GeneCounts

The RSEM commands are:

/gpfs/share/home/1701110265/soft/RSEM-1.3.1/rsem-prepare-reference --gff3 /gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.gff \
/gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.fa \
/gpfs/share/home/1701110265/morus/04rsem/rsem

/gpfs/share/home/1701110265/soft/RSEM-1.3.1/convert-sam-for-rsem /gpfs/share/home/1701110265/morus/03align_out/test3/1h_Aligned.toTranscriptome.out.bam /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem

/gpfs/share/home/1701110265/soft/RSEM-1.3.1/rsem-calculate-expression --no-bam-output \
--alignments -p 5 \
-q /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam \
/gpfs/share/home/1701110265/morus/04rsem/rsem \
/gpfs/share/home/1701110265/morus/04rsem/test1

Please help me.


rna-seq • 102 views

It seems RSEM is filtering the GFF3 file differently than STAR does before building the transcriptome FASTA file. You could build a transcriptome FASTA file yourself and use it with both STAR and RSEM, so that you know the two references match.
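
To see where the two references diverge, one option is to compare the sequences declared in the BAM header with the transcripts RSEM actually knows about. The sketch below is only an illustration, not a tested fix for this dataset: the helper names (count_sq, count_fasta, sq_names, fasta_names) are made up for this example, it assumes samtools is installed, and it assumes rsem-prepare-reference wrote its transcript sequences to rsem.transcripts.fa next to the reference name used in the question. Note also that the posted commands refer to different directories (star.genome.test02 vs star.genome.index, and test4 vs test3), so it may be worth confirming that the BAM really came from the index built with this GFF.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing a BAM header with an RSEM reference.
# All of them read from stdin, so they can be fed by samtools or a file.

# Count the reference sequences (@SQ lines) declared in a SAM header.
count_sq() {
  grep -c '^@SQ' || true   # grep -c still prints 0 when nothing matches
}

# Count the records in a FASTA file.
count_fasta() {
  grep -c '^>' || true
}

# Print the sorted sequence names (SN: fields) from a SAM header.
sq_names() {
  awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' | sort
}

# Print the sorted record names (first word after ">") from a FASTA file.
fasta_names() {
  awk '/^>/ { sub(/^>/, "", $1); print $1 }' | sort
}

# Usage (paths shortened, adjust to your own layout):
#   samtools view -H input_for_rsem.bam | count_sq    # 31302 according to the error
#   count_fasta < rsem.transcripts.fa                 # 29334 according to the error
#   # names present in the BAM header but missing from the RSEM reference:
#   comm -23 <(samtools view -H input_for_rsem.bam | sq_names) \
#            <(fasta_names < rsem.transcripts.fa)
```

If the names that appear only in the BAM header share a feature type in the GFF3, that would point to the annotation features one tool keeps and the other drops.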

written 4 weeks ago by Devon Ryan