Question

Salmon in alignment mode: what does the documentation mean when it says to use transcripts.fa?

2.2 years ago

Shraddha ▴ 90

Hello all,

I have a dataset with 40 samples, which I aligned to a reference transcriptome using HISAT2. I then quantified them with salmon in alignment mode, using a command like this:

salmon quant -t referencetranscriptome.fa -l U -a alignment.bam -o path/to/out/dir/

Here referencetranscriptome.fa is the same file for all 40 samples, and it is the same reference I indexed for use with HISAT2. Upon rereading the documentation, I suspect that this is wrong.
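As a sketch, the setup described here (one shared reference FASTA, one BAM per sample) corresponds to a loop like the one below. The directory layout (a single folder holding one `<sample>.bam` per sample) and the function name `print_salmon_cmds` are assumptions made for illustration. The function only prints the commands so they can be inspected before anything is run.

```shell
# Print (do not run) one salmon command per BAM file in a directory.
# Assumed layout: bam_dir contains files named <sample>.bam.
# The same reference FASTA is passed to every run.
print_salmon_cmds() (
  ref="$1"
  bam_dir="$2"
  out_root="$3"
  for bam in "$bam_dir"/*.bam; do
    [ -e "$bam" ] || continue          # glob matched nothing
    sample="$(basename "$bam" .bam)"   # e.g. 1A.bam -> 1A
    echo "salmon quant -t $ref -l U -a $bam -o $out_root/$sample"
  done
)

# Example call (placeholder paths):
# print_salmon_cmds referencetranscriptome.fa bams quant_out
```

Each sample gets its own output directory here, so that results from one run do not overwrite another.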

From the salmon documentation: "Say that you’ve prepared your alignments using your favorite aligner and the results are in the file aln.bam, and assume that the sequence of the transcriptome you want to quantify is in the file transcripts.fa."

This makes it sound as though each run of salmon in alignment mode needs its own transcriptome file. For instance, something like this:

salmon quant -t transcriptome1A.fa -l U -a 1A.bam -o /path/to/out/dir

So question 1: Is this the right approach?

When I noticed this, I tried to run the above command, which gave me an error like this:

Please provide a reference FASTA file that includes all targets present in the BAM header
If you have access to the genome FASTA and GTF used for alignment consider generating a transcriptome fasta using a command like:
gffread -w output.fa -g genome.fa genome.gtf

A previous question (here) with the same error turned out to be caused by using different references (genome and transcriptome), which is not the case for me.
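Since the error is about targets in the BAM header that are absent from the FASTA, one way to see which names are affected is to compare the two lists directly. The sketch below works on the header saved as text (for example via `samtools view -H`, assuming samtools is installed). The function names and the file names in the example are hypothetical.

```shell
# Reference names in a SAM/BAM header read from stdin:
# the SN: field of every @SQ line.
header_names() {
  awk -F'\t' '$1 == "@SQ" {
    for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
  }' | sort -u
}

# Sequence names in a FASTA file: text after ">" up to the first whitespace.
fasta_names() {
  awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u
}

# Names that appear in the header file ($1) but not in the FASTA ($2).
missing_from_fasta() (
  names="$(mktemp)"
  fasta_names "$2" > "$names"
  # grep exits 1 when nothing is missing; that is not an error here.
  header_names < "$1" | grep -vxF -f "$names" || true
  rm -f "$names"
)

# Example (hypothetical file names):
# samtools view -H 1A.bam > 1A.header.txt
# missing_from_fasta 1A.header.txt transcriptome1A.fa
```

An empty result would mean every name in the header has a matching FASTA record. Any names printed are the ones the error message refers to.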

I also tried running gffread as the salmon error suggests, but it does not work at all: it seems to look for gene names (?) in my GFF file, whereas the first column of my file holds chromosome numbers and the gene names are in the 9th column. The command was:

gffread -w transcript1A.fa -g referencetranscriptome.fa referenceannotation.gff
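As far as I understand it, gffread looks up each sequence named in the first column of the annotation inside the FASTA given to `-g`, so those two sets of names have to agree. A minimal check, under that assumption, is to list the column-1 IDs that have no FASTA record. The function names below are hypothetical.

```shell
# Sequence IDs used in column 1 of a GFF/GTF file (comment lines skipped).
gff_seqids() {
  awk -F'\t' '!/^#/ && NF >= 9 { print $1 }' "$1" | sort -u
}

# Sequence names in a FASTA file: text after ">" up to the first whitespace.
fasta_ids() {
  awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u
}

# Column-1 IDs of the GFF ($1) that have no record in the FASTA ($2).
gff_ids_not_in_fasta() (
  ids="$(mktemp)"
  fasta_ids "$2" > "$ids"
  # grep exits 1 when every ID is found; that is not an error here.
  gff_seqids "$1" | grep -vxF -f "$ids" || true
  rm -f "$ids"
)

# Example, using the files from the command above:
# gff_ids_not_in_fasta referenceannotation.gff referencetranscriptome.fa
```

If this prints chromosome numbers, the FASTA passed to `-g` does not contain the sequences the annotation refers to.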

Thus question 2: Is gffread the right step here? If so, how should I invoke it so that it runs properly?

And question 3: Is there better documentation or a tutorial on salmon alignment mode? I find the official documentation somewhat confusing and would appreciate another resource.

Thanks in advance!

alignment salmon transcriptome
