Question: HISAT2 index generation
5 months ago, deshpande.neha2 wrote:

I made a FASTA file combining the genome annotations from two organisms (S. cerevisiae and S. pombe). I want to use this file as a reference to align my RNA-seq reads. How do I generate HISAT2 indexes for this file? I read the HISAT2 manual and looked at a few blogs online, but nothing seems to work. Where do I generate the indexes? Is it possible to do it on a computer cluster like ada?

sequencing rna-seq alignment • 396 views

@genomax We used S. pombe as a spike-in control while making library preps for S. cerevisiae samples. I need a composite index to align my reads. As I wrote before, I have read the manual multiple times and find the instructions quite vague and nonspecific. That being said, I'm still a novice trying to learn things as I go along.

written 5 months ago by deshpande.neha2

@neha: Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep threads logically organized.

Using S. pombe as a spike-in is a rather odd choice, since those two yeasts are relatively similar. You are likely to run into many multi-mapping reads (reads that map to both genomes). For RNA-seq data, such reads are typically not counted by default.

To build the genome.fa file, concatenate the chromosome sequences of both yeasts. Make sure the FASTA headers contain something to distinguish S. pombe from S. cerevisiae: both can't have chr1 in the header, so rename them to something like chr1_pombe and chr1_cere.

cat chr1_pombe chr2_pombe ... chrN_pombe chr1_cere chr2_cere ... chrN_cere > genome.fa
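If the downloaded chromosome files still share names such as chr1, the headers need to be tagged before concatenating. A minimal sketch of one way to do that, assuming one multi-FASTA file per species; the function name tag_fasta_headers and the file names in the usage comments are hypothetical and not part of HISAT2:

```shell
# Append a species tag to the sequence name in every FASTA header.
# Usage: tag_fasta_headers TAG INPUT.fa > OUTPUT.fa
tag_fasta_headers() {
    tag="$1"
    fasta="$2"
    awk -v tag="$tag" '
        # Header lines start with ">"; the first field is the sequence name.
        /^>/ { $1 = $1 "_" tag; print; next }
        # Sequence lines are passed through unchanged.
        { print }
    ' "$fasta"
}

# Example usage (hypothetical file names):
#   tag_fasta_headers pombe pombe.fa      > pombe_tagged.fa
#   tag_fasta_headers cere  cerevisiae.fa > cere_tagged.fa
#   cat pombe_tagged.fa cere_tagged.fa > genome.fa
# List any sequence names that still occur more than once (should print nothing):
#   grep '^>' genome.fa | cut -d' ' -f1 | sort | uniq -d
```

Only the first whitespace-delimited token of each header is changed, so any description after the sequence name is kept. Remember that an annotation file (GTF/GFF) used later for counting would need the same renamed chromosome names.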

You can then use the command below to create the genome index

hisat2-build genome.fa cere_pombe

Then use cere_pombe as the index base name in your alignments (the value passed to hisat2 with -x).
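To tie the build and alignment steps together, here is a rough sketch. It assumes hisat2-build was run in the current directory and produced the usual eight small-index files (cere_pombe.1.ht2 through cere_pombe.8.ht2; very large genomes get .ht2l files instead, which should not apply to two yeast genomes). The helper check_hisat2_index and the read and output file names are hypothetical:

```shell
# Build step (run once, wherever genome.fa lives; a cluster node works as well):
#   hisat2-build genome.fa cere_pombe
# Alignment step for paired-end reads (hypothetical file names):
#   hisat2 -x cere_pombe -1 reads_1.fastq -2 reads_2.fastq -S aligned.sam

# Return 0 if all eight index files for the given base name exist, 1 otherwise.
check_hisat2_index() {
    base="$1"
    missing=0
    for i in 1 2 3 4 5 6 7 8; do
        if [ ! -f "${base}.${i}.ht2" ]; then
            echo "missing index file: ${base}.${i}.ht2" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage:
#   check_hisat2_index cere_pombe && echo "index looks complete"
```

The base name may include a directory (for example /path/to/index/cere_pombe), so the index can be built once and reused from any working directory.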

written 5 months ago by genomax

Could you explain in greater detail how you used the S. pombe spike-in control? Do you know the exact transcript composition and abundances of the S. pombe spike-in? Did you add this spike-in to all your S. cerevisiae samples?

written 5 months ago by h.mon
5 months ago, genomax wrote:

Building a HISAT2 index should be as simple as hisat2-build genome.fa index_name. You can find the detailed options (if you need them) on the manual page.

That said, why are you building a composite index of two species? Does your sample contain both genomes? Are you looking to separate the reads from the two?
