Question: Alignment with multiple reference genomes using HISAT2
8 weeks ago by
spriyansh2930
spriyansh2930 wrote:

I have 3 reference genomes and I wish to align my FASTQ reads against all 3 of them. I have used hisat2-build to build individual indexes of all 3 of them, but couldn't find the command to make an index of multiple genomes.

I ran the following command for the alignment:

hisat2 -p 4 -x individual_index --dta --rna-strandness RF -1 paired_1.fastq -2 paired_2.fastq -S aligned.sam

I want to run the alignment against all 3 indexes in one go with HISAT2. Also, I cannot use STAR because my system has only 8 GB of RAM.

modified 8 weeks ago • written 8 weeks ago by spriyansh2930
8 weeks ago by
genomax85k
United States
genomax85k wrote:

I want to run alignment with all 3 indexes in one go with HISAT2

I don't think you can do that. You could cat the three genomes into a single large reference and index that, but then you may run into the 8 GB RAM limit on your hardware. This may be one of those instances where finding better hardware is the answer for your requirement.

Note: You could use bbsplit.sh from the BBMap suite to align against multiple genomes at the same time but, depending on the genomes you have, 8 GB may not be enough.
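
The cat approach mentioned in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: the file names (virusA.fa and so on), the toy sequences, and the helper function combine_refs are hypothetical and not from the original posts. The HISAT2 commands are left as comments because they need the real genomes and reads.

```shell
#!/usr/bin/env bash
set -euo pipefail

# combine_refs OUT IN1 IN2 ... (hypothetical helper)
# Concatenates FASTA files and fails if two sequences share a name.
combine_refs() {
    local out=$1; shift
    cat "$@" > "$out"
    local dups
    # A sequence name is the header text up to the first space.
    dups=$(grep '^>' "$out" | cut -d' ' -f1 | sort | uniq -d || true)
    if [ -n "$dups" ]; then
        echo "Duplicate FASTA headers found: $dups" >&2
        return 1
    fi
}

# Toy stand-ins for the three viral genomes (assumed names and sequences).
workdir=$(mktemp -d)
printf '>virusA\nACGTACGTAC\n' > "$workdir/virusA.fa"
printf '>virusB\nTTGACCGTAA\n' > "$workdir/virusB.fa"
printf '>virusC\nGGCATTACGA\n' > "$workdir/virusC.fa"

combine_refs "$workdir/combined.fa" \
    "$workdir/virusA.fa" "$workdir/virusB.fa" "$workdir/virusC.fa"

n_seqs=$(grep -c '^>' "$workdir/combined.fa")
echo "combined.fa contains $n_seqs sequences"

# With the real genomes, one index is then built and used for alignment:
#   hisat2-build combined.fa combined_index
#   hisat2 -p 4 -x combined_index --dta --rna-strandness RF \
#       -1 paired_1.fastq -2 paired_2.fastq -S aligned.sam
```

Checking for duplicate headers up front matters because every alignment in the SAM output is labelled only by sequence name, so two genomes sharing a name could not be told apart afterwards.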

written 8 weeks ago by genomax85k

Well, I think it will work: all 3 genomes are viral, so they should take up little space. I will give it a try and update!

written 8 weeks ago by spriyansh2930

Viral genomes should be no problem. Look into bbsplit.sh, since it has some nice options for how you want to handle reads that multi-map, within and across these genomes. Those options are going to be much better than what hisat2 offers.

written 8 weeks ago by genomax85k

I am new to NGS analysis, and that "cat" technique worked well for me. When extracting read counts from the .bam files, how should I proceed with htseq-count? 1) Should I extract read counts using 3 different GFF/GTF files (one for each of the 3 genomes used in the alignment) and then merge the results? 2) Or should I append all 3 GTF/GFF files into one large file and then proceed?

written 8 weeks ago by spriyansh2930

Assuming your FASTA headers are unique, you may be able to use 3 passes of counting with the three GTF files. Try it out first.
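
Whichever option is tried, counting only works if the sequence names in column 1 of each GTF match the FASTA headers that were indexed. A minimal sketch of that check for the appended-GTF option follows. The file names, gene IDs, and toy records are hypothetical, and the htseq-count call is shown only as a comment; its settings (coordinate-sorted BAM, exon features, reverse strandedness for an RF library) are assumptions to verify against your own data.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)

# Toy combined reference and per-genome annotations (assumed content).
printf '>virusA\nACGT\n>virusB\nTTGA\n' > "$workdir/combined.fa"
printf 'virusA\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "geneA1";\n' > "$workdir/virusA.gtf"
printf 'virusB\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "geneB1";\n' > "$workdir/virusB.gtf"

# Sequence names in the FASTA: header text after '>' up to the first space.
grep '^>' "$workdir/combined.fa" | sed 's/^>//' | cut -d' ' -f1 \
    | sort -u > "$workdir/fasta_names.txt"

# Option 2 from the question: append the GTF files into one.
cat "$workdir/virusA.gtf" "$workdir/virusB.gtf" > "$workdir/combined.gtf"

# Sequence names used in column 1 of the GTF, skipping comment lines.
grep -v '^#' "$workdir/combined.gtf" | cut -f1 \
    | sort -u > "$workdir/gtf_names.txt"

# comm -13 keeps lines found only in the second file:
# names the GTF uses that the FASTA does not contain.
missing=$(comm -13 "$workdir/fasta_names.txt" "$workdir/gtf_names.txt")
if [ -n "$missing" ]; then
    echo "GTF sequence names missing from the FASTA: $missing" >&2
    exit 1
fi
echo "All GTF sequence names are present in the FASTA"

# Counting against the combined annotation would then look like:
#   htseq-count -f bam -r pos -s reverse -t exon -i gene_id \
#       aligned.bam combined.gtf > counts.txt
```

For the three-pass option, the same name check can be run per GTF file. With an appended GTF, gene IDs also need to be unique across the genomes, otherwise counts from different genomes would be summed under one ID.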

written 8 weeks ago by genomax85k