Need help, please: RNA-seq, STAR + RSEM error
5.1 years ago
dq18 • 0

I use STAR + RSEM to align and quantify my raw RNA-seq data, but I got the following error after running rsem-calculate-expression:


/gpfs/share/home/1701110265/soft/RSEM-1.3.1/rsem-calculate-expression --no-bam-output \
--alignments -p 5 \
-q /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam \
/gpfs/share/home/1701110265/morus/04rsem/rsem \
/gpfs/share/home/1701110265/morus/04rsem/test1

rsem-parse-alignments /gpfs/share/home/1701110265/morus/04rsem/rsem /gpfs/share/home/1701110265/morus/04rsem/test1.temp/test1 /gpfs/share/home/1701110265/morus/04rsem/test1.stat/test1 /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam 1 -tag XM -q
The SAM/BAM file declares more reference sequences (31302) than RSEM knows (29334)!
"rsem-parse-alignments /gpfs/share/home/1701110265/morus/04rsem/rsem /gpfs/share/home/1701110265/morus/04rsem/test1.temp/test1 /gpfs/share/home/1701110265/morus/04rsem/test1.stat/test1 /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam 1 -tag XM -q" failed! Plase check if you provide correct parameters/options for the pipeline!

I cannot find what is wrong with my commands. Here are the details: I am using STAR 2.7.0e and RSEM 1.3.1. The genome I use has 30301 scaffolds, and the annotation is a GFF3 file; I downloaded both from NCBI. The raw data were sequenced on an Illumina platform and are single-end (SE) 75 bp reads.


The STAR commands are:

/gpfs/share/home/1701110265/soft/STAR-2.7.0e/bin/Linux_x86_64/STAR --runThreadN 6 --runMode genomeGenerate \
--genomeDir /gpfs/share/home/1701110265/morus/00ref/star.genome.test02/ \
--genomeFastaFiles /gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.fa \
--genomeChrBinNbits 15 \
--sjdbGTFtagExonParentTranscript Parent \
--sjdbGTFfile /gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.gff \
--sjdbOverhang 74


 /gpfs/share/home/1701110265/soft/STAR-2.7.0e/bin/Linux_x86_64/STAR --runThreadN 5 --genomeDir /gpfs/share/home/1701110265/morus/00ref/star.genome.index \
--readFilesIn /gpfs/share/home/1701110265/morus/02clean_data/test/1h_AGTCAA_L002_R1_001.clean.fastq \
--outFileNamePrefix /gpfs/share/home/1701110265/morus/03align_out/test4/1h_ \
--outSAMtype BAM SortedByCoordinate \
--outBAMsortingThreadN 5 \
--quantMode TranscriptomeSAM GeneCounts

The RSEM commands are:

/gpfs/share/home/1701110265/soft/RSEM-1.3.1/rsem-prepare-reference --gff3 /gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.gff \
/gpfs/share/home/1701110265/morus/00ref/Morus.notabilis.genome.fa \
/gpfs/share/home/1701110265/morus/04rsem/rsem

/gpfs/share/home/1701110265/soft/RSEM-1.3.1/convert-sam-for-rsem /gpfs/share/home/1701110265/morus/03align_out/test3/1h_Aligned.toTranscriptome.out.bam /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem

/gpfs/share/home/1701110265/soft/RSEM-1.3.1/rsem-calculate-expression --no-bam-output \
--alignments -p 5 \
-q /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam \
/gpfs/share/home/1701110265/morus/04rsem/rsem \
/gpfs/share/home/1701110265/morus/04rsem/test1

Please help me.


RNA-Seq • 2.1k views

It seems RSEM is performing different filtering on the GFF3 file than STAR before making the transcriptome fasta file. You might make a transcriptome fasta file yourself and use that with STAR/RSEM, since then you know it will match.
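
One way to check this is to list which reference names the BAM header declares that the RSEM reference does not contain. The sketch below is only a suggestion: the helper function names and the `bam.names` / `rsem.names` files are made up for illustration, and it assumes `samtools` is installed and that `rsem-prepare-reference` wrote `rsem.transcripts.fa` next to the `rsem` prefix. Note also that, as posted, the index is generated in `star.genome.test02` but the alignment uses `star.genome.index`, and STAR writes to `03align_out/test4` while `convert-sam-for-rsem` reads from `03align_out/test3`; if those are not copy-paste slips, the BAM may simply come from a different index or annotation than the RSEM reference.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing the reference names in a BAM header
# with the sequence names in an RSEM transcript FASTA.

# Read SAM header text on stdin and print the name from every @SQ line.
names_from_sam_header() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort -u
}

# Print the first whitespace-delimited token of every FASTA header line.
names_from_fasta() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u
}

# Print the lines that occur in the first sorted file but not in the second.
only_in_first() {
    comm -23 "$1" "$2"
}

# Intended usage (paths taken from the question; requires samtools):
#   samtools view -H /gpfs/share/home/1701110265/morus/04rsem/input_for_rsem.bam \
#       | names_from_sam_header > bam.names
#   names_from_fasta /gpfs/share/home/1701110265/morus/04rsem/rsem.transcripts.fa > rsem.names
#   wc -l bam.names rsem.names            # compare with the counts in the error message
#   only_in_first bam.names rsem.names | head
```

If the last command prints transcript IDs, those are the transcripts STAR kept from the GFF3 and RSEM dropped (or vice versa if you swap the arguments), which should show what kind of annotation entries the two tools treat differently.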
