Question: HISAT2 command help
4.8 years ago, dina.hesham139 (Egypt) wrote:

Hey,

This is my first time using HISAT2 for alignment, and the number of parameters in the manual has me confused.

I need to write a command that does the following:

Map against hg19 with paired-end samples, include XS attributes in the output, and produce output compatible with StringTie for downstream analysis.

Has anyone worked with this pipeline before who could help me?

Thank you!

rna-seq alignment • 15k views
modified 4.8 years ago • written 4.8 years ago by dina.hesham139120

For --known-splicesite-infile, the manual states:

You can create such a list using python extract_splice_sites.py genes.gtf > splicesites.txt, where extract_splice_sites.py is included in the HISAT2 package, genes.gtf is a gene annotation file, and splicesites.txt is a list of splice sites with which you provide HISAT2 in this mode. Note that it is better to use indexes built using annotated transcripts (such as genome_tran or genome_snp_tran), which works better than using this option.

Would using the genome_tran index substitute for the genome plus annotation GTF file parameters in TopHat?

modified 11 months ago by _r_am30k • written 4.8 years ago by dina.hesham139120
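
The manual excerpt quoted above amounts to two steps: generate the splice-site list from a GTF, then pass it to the aligner. A minimal sketch, assuming the script is on PATH under either the name given in the excerpt or the `hisat2_`-prefixed name used by newer releases; all file names and the index path are placeholders.

```shell
# Placeholder inputs; adjust to your own files.
GTF=genes.gtf
SPLICE_SITES=splicesites.txt
INDEX=/path/to/hg19/indices
R1=sample_1.fq.gz
R2=sample_2.fq.gz

# Pick whichever script name the installed HISAT2 package provides.
EXTRACT=""
for name in hisat2_extract_splice_sites.py extract_splice_sites.py; do
    if command -v "$name" >/dev/null 2>&1; then
        EXTRACT=$name
        break
    fi
done

if [ -n "$EXTRACT" ] && [ -f "$GTF" ] && command -v hisat2 >/dev/null 2>&1; then
    # Step 1: convert the annotation into HISAT2's splice-site list.
    "$EXTRACT" "$GTF" > "$SPLICE_SITES"
    # Step 2: hand that list to the aligner.
    hisat2 -x "$INDEX" --known-splicesite-infile "$SPLICE_SITES" \
        -1 "$R1" -2 "$R2" -S sample.sam
else
    echo "HISAT2 scripts or $GTF not found; nothing was run." >&2
fi
```

With an index that already contains annotated transcripts, such as genome_tran, the extraction step and the extra option would not be needed.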

Yeah, if you want to download the prebuilt indices then just get genome_tran and call it done.

written 4.8 years ago by Devon Ryan97k

Thank you!!!

written 4.8 years ago by dina.hesham139120

One more question: for alignment, which would you generally recommend, STAR or HISAT2?

written 4.8 years ago by dina.hesham139120

I generally prefer STAR, though it requires significantly more RAM.

written 4.8 years ago by Devon Ryan97k
4.8 years ago, Devon Ryan (Freiburg, Germany) wrote:
hisat2 -x /path/to/hg19/indices -1 sample_1.fq.gz -2 sample_2.fq.gz | samtools view -Sbo sample.bam -

The resulting BAM file should work with StringTie or Cufflinks. You probably want the --known-splicesite-infile option though.

Edit: You probably need to sort and index the BAM file. At least Cufflinks would require that.

modified 4.8 years ago • written 4.8 years ago by Devon Ryan97k
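
A minimal sketch of the alignment followed by the sort and index steps mentioned in the edit above. The `--dta` flag is not part of the original command; the HISAT2 manual describes it as tailoring alignments for transcript assemblers such as StringTie, and describes `--dta-cufflinks` as the Cufflinks-oriented variant that writes an XS:A strand field for every spliced alignment. File names and the index path are placeholders, and the guard only runs the tools when they and the inputs are present.

```shell
INDEX=/path/to/hg19/indices   # index prefix, as in the answer above
R1=sample_1.fq.gz
R2=sample_2.fq.gz
OUT=sample.sorted.bam

align_sort_index() {
    # Align, then coordinate-sort directly from the pipe ("-" = stdin).
    hisat2 --dta -x "$INDEX" -1 "$R1" -2 "$R2" \
        | samtools sort -o "$OUT" -
    # Write the BAM index next to the sorted file.
    samtools index "$OUT"
}

if command -v hisat2 >/dev/null 2>&1 \
    && command -v samtools >/dev/null 2>&1 \
    && [ -f "$R1" ] && [ -f "$R2" ]; then
    align_sort_index
else
    echo "hisat2, samtools, or the FASTQ files were not found; nothing was run." >&2
fi
```

Sorting straight from the pipe avoids writing an unsorted intermediate BAM file.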

Hi Devon, I work with the dog genome. If I don't have premade indexes such as genome_tran or genome_snp_tran, should I go ahead and pass my GTF file to the --known-splicesite-infile option, or should I not use any GTF at all?

Thanks!

written 4.6 years ago by CandiceChuDVM2.1k

I think the GTF needs to be preprocessed first (a script for that comes with HISAT2), but aside from that, yes.

written 4.6 years ago by Devon Ryan97k

Thanks for your suggestion! I will process it first with extract_splice_sites.py.

written 4.6 years ago by CandiceChuDVM2.1k
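
For a genome without prebuilt indexes, another option that follows the manual's note about annotation-based indexes is to build such an index yourself. A rough sketch, assuming the `hisat2_extract_splice_sites.py` and `hisat2_extract_exons.py` scripts and the `--ss` and `--exon` options of `hisat2-build` from recent HISAT2 releases (older packages name the scripts without the `hisat2_` prefix). The names `genome.fa`, `genes.gtf`, and the output files are hypothetical. The manual warns that building an index with annotation needs far more memory than a plain genome index.

```shell
GENOME_FA=genome.fa   # hypothetical: genome FASTA
GTF=genes.gtf         # hypothetical: matching gene annotation
PREFIX=genome_tran    # output index prefix; the name is arbitrary

build_annotated_index() {
    # Derive splice sites and exons from the annotation.
    hisat2_extract_splice_sites.py "$GTF" > genome.ss
    hisat2_extract_exons.py "$GTF" > genome.exon
    # Build an index that incorporates both.
    hisat2-build --ss genome.ss --exon genome.exon "$GENOME_FA" "$PREFIX"
}

if command -v hisat2-build >/dev/null 2>&1 \
    && [ -f "$GENOME_FA" ] && [ -f "$GTF" ]; then
    build_annotated_index
else
    echo "hisat2-build or the input files were not found; nothing was run." >&2
fi
```

Alignment against the resulting index would then use `-x genome_tran` without `--known-splicesite-infile`.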

Hi Devon, could you tell me what the '-' symbol at the end of your samtools view command above (after 'sample.bam') means? I am running similar code without the '-', and sometimes it works fine but other times the program ends with a parsing error. Thanks!

written 4.6 years ago by fshimizu0

- means "read from the pipe (|)" in this case. samtools treats - as standard input. If you're running samtools on a file then you would never use -.

written 4.6 years ago by Devon Ryan97k
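
A quick way to see what `-` does without installing samtools is `cat`, which likewise reads standard input when given `-` as its file argument. This is only an illustration with `cat` standing in for samtools; the two "reads" are made-up text.

```shell
# "-" tells cat to read the data arriving through the pipe.
piped=$(printf 'read1\nread2\n' | cat -)
echo "$piped"

# With a real file name, no "-" is needed.
tmp=$(mktemp)
printf 'read1\nread2\n' > "$tmp"
from_file=$(cat "$tmp")
rm -f "$tmp"
echo "$from_file"
```

Both commands print the same two lines; only the source of the data differs.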

Hi Devon, what's the benefit of including the --known-splicesite-infile option in the command line? If I'm only doing differential gene expression analysis, would using or omitting it make much difference?

written 4.3 years ago by epigene490

You'll likely get slightly better alignments by using that and therefore slightly higher counts for DE testing.

written 4.3 years ago by Devon Ryan97k

Thanks for the info! Could you also comment on the -k option? Would you recommend using the default (5)? If there are other options I should use, I'd love to hear your thoughts on them. Thanks.

written 4.3 years ago by epigene490

The defaults should be fine. The only other thing to change is the number of threads used.

written 4.3 years ago by Devon Ryan97k
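
A small sketch of changing only the thread count, assuming HISAT2's `-p` option and the placeholder paths from the answer above. Detecting the core count with `getconf` is a convenience chosen here, not something HISAT2 requires; a fixed number works just as well.

```shell
# Use the detected number of online cores, falling back to 1.
THREADS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

run_hisat2() {
    # -p sets the number of alignment threads; -k and the rest stay at defaults.
    hisat2 -p "$THREADS" -x /path/to/hg19/indices \
        -1 sample_1.fq.gz -2 sample_2.fq.gz -S sample.sam
}

if command -v hisat2 >/dev/null 2>&1 && [ -f sample_1.fq.gz ] && [ -f sample_2.fq.gz ]; then
    run_hisat2
else
    echo "Would run hisat2 with -p $THREADS (hisat2 or inputs not found)." >&2
fi
```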

How do I align paired-end files? Would it be the same command, "-1 sample_1.fq.gz -2 sample_2.fq.gz", where I have to give both -1 and -2?

written 3.7 years ago by krushnach80850