Hi,
I want to get TPM values from alignments generated with STAR, and from what I have read the easiest way is to use RSEM or Salmon.
My code for the alignment is:
/Users/c/STAR/bin/MacOSX_x86_64/STAR --runThreadN 4 --genomeDir /Users/c/Desktop/Human_genome_index --readFilesIn /Users/c/Desktop/test/C1D20_R1_001_paired.fastq /Users/c/Desktop/test/C1D20_R2_001_paired.fastq --quantMode TranscriptomeSAM GeneCounts --outFileNamePrefix C1D20 --outSAMtype BAM SortedByCoordinate
To my understanding, I can use the Aligned.toTranscriptome.out.bam file to get TPM values from RSEM, so I created the RSEM reference with:
/Users/c/Downloads/RSEM-1.2.25/rsem-prepare-reference /Users/c/Desktop/test human_genomeRSEM
Now I am confused about the parameters for RSEM. I know the usage is: rsem-calculate-expression [options] --alignments [--paired-end] input reference_name sample_name
but it doesn't run if I write:
/Users/c/Downloads/RSEM-1.2.25/rsem-calculate-expression --/Users/c/Desktop/aligned_file_transcriptome/C1D20Aligned.toTranscriptome.out.bam /Users/c/Desktop/rsem/human_genomeRSEM.transcripts.fa CD120
What is wrong with my command? I have seen in the examples in the manual that it is possible to have RSEM run the STAR alignment itself, but I don't want to re-run the alignment if I can avoid it.
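For comparison, here is a minimal sketch of how the usage line quoted above would map onto these files. It is not a confirmed fix. It assumes the RSEM reference is built from the genome FASTA plus the same GTF that was used for the STAR index (the GTF path below is a hypothetical placeholder), that the reads are paired-end, and that reference_name is the prefix passed to rsem-prepare-reference rather than the .transcripts.fa file. The --alignments option name is taken from the usage line; if the installed version rejects it, rsem-calculate-expression --help lists the accepted options. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. GTF is a hypothetical path: substitute the annotation used for the STAR index.
# GENOME_FA is assumed to be the genome the STAR index was built from.
RSEM_DIR=/Users/c/Downloads/RSEM-1.2.25
GENOME_FA=/Users/c/Desktop/test/GRCh38.primary_assembly.genome.fa
GTF=/path/to/annotation.gtf
REF_PREFIX=/Users/c/Desktop/rsem/human_genomeRSEM
BAM=/Users/c/Desktop/aligned_file_transcriptome/C1D20Aligned.toTranscriptome.out.bam
SAMPLE=C1D20

# Print each command by default; run with DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rsem_commands() {
    # Build the reference from FASTA + GTF; the last argument is the reference name (a prefix).
    run "$RSEM_DIR/rsem-prepare-reference" --gtf "$GTF" "$GENOME_FA" "$REF_PREFIX"
    # Quantify from the existing transcriptome BAM: input, reference_name, sample_name.
    run "$RSEM_DIR/rsem-calculate-expression" --paired-end --alignments -p 4 \
        "$BAM" "$REF_PREFIX" "$SAMPLE"
}

rsem_commands
```

With this layout the TPM values would be expected in the sample's .genes.results and .isoforms.results files, which RSEM names after sample_name.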
Edit: I also tried Salmon. The documented usage is: salmon quant -t transcripts.fa -l <LIBTYPE> -a aln.bam -o salmon_quant
and I ran it as:
salmon quant -t /Users/c/Desktop/test/GRCh38.primary_assembly.genome.fa -l A -a /Users/c/Desktop/aligned_file_transcriptome/C1D20Aligned.toTranscriptome.out.bam -o salmon_quant
and it gives me this error:
[2023-09-30 09:48:16.228] [jointLog] [critical] Transcript ENST00000504270.3 appeared in the BAM header, but was not in the provided FASTA file
[2023-09-30 09:48:16.228] [jointLog] [critical] Please provide a reference FASTA file that includes all targets present in the BAM header
If you have access to the genome FASTA and GTF used for alignment
consider generating a transcriptome fasta using a command like:
gffread -w output.fa -g genome.fa genome.gtf
you can find the gffread utility at (http://ccb.jhu.edu/software/stringtie/gff.shtml)
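Reading the message, the -t argument seems to need a transcript FASTA rather than the genome FASTA. A minimal sketch of what the hint suggests is below. It assumes the GTF is the one used to build the STAR index (the path is a hypothetical placeholder, as is the output file name transcripts_from_gtf.fa), and it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. GTF and TRANSCRIPTS_FA are hypothetical names chosen for illustration.
GENOME_FA=/Users/c/Desktop/test/GRCh38.primary_assembly.genome.fa
GTF=/path/to/annotation.gtf
TRANSCRIPTS_FA=transcripts_from_gtf.fa
BAM=/Users/c/Desktop/aligned_file_transcriptome/C1D20Aligned.toTranscriptome.out.bam

# Print each command by default; run with DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

salmon_commands() {
    # Extract transcript sequences from the genome, as suggested in the error message.
    run gffread -w "$TRANSCRIPTS_FA" -g "$GENOME_FA" "$GTF"
    # Alignment-based quantification against the transcript FASTA instead of the genome.
    run salmon quant -t "$TRANSCRIPTS_FA" -l A -a "$BAM" -o salmon_quant
}

salmon_commands
```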
Thanks
Camilla