How to map RNA-seq with hisat2, stringtie, an assembly fasta, a gtf file, and a transcript fasta?
5.3 years ago
O.rka ▴ 710

I have the following files: assembly.fa transcripts.fa annotation.gtf

My organism is eukaryotic with introns, so I want to use the hisat2 -> stringtie pipeline.

The example at the link below appears to map to the chromosomes/assembly with HISAT2, not to the transcripts. If introns separate two exons, wouldn't a read spanning them map only partially, so that mapping to the transcripts would be the better approach? https://davetang.org/muse/2017/10/25/getting-started-hisat-stringtie-ballgown/

Does anyone have a way to pipe hisat2 directly into stringtie? I know I'm supposed to use the --dta flag in HISAT2, but I haven't figured out whether I should be mapping to the transcripts or to the assembly.

5.3 years ago

HISAT is a splice-aware aligner. You align to the genome, and it is smart enough to know that many reads will align with large gaps (the introns). If you look at the index-building step, it builds the index with guidance from a GTF whose features are annotated in genomic coordinates.
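
One way to see the "large gaps" for yourself, as a minimal sketch: in SAM output, a read that spans an intron carries an N operation in its CIGAR string (column 6), for example 50M200N50M. The helper name count_spliced and the file name aligned.bam below are made up for illustration; only awk is needed for the function itself, and samtools is assumed for the usage line.

```shell
# Hypothetical helper: count alignment records whose CIGAR string
# (SAM column 6) contains an N operation, i.e. a skipped reference
# region such as an intron. Reads SAM text on stdin; header lines
# (starting with @) are ignored.
count_spliced() {
    awk -F'\t' '!/^@/ && $6 ~ /N/ { n++ } END { print n + 0 }'
}

# Usage, assuming samtools is installed and aligned.bam (a placeholder
# name) holds HISAT2 alignments against the genome:
#   samtools view aligned.bam | count_spliced
```

If the genome alignment is working as described, this count should be well above zero for a eukaryotic RNA-seq sample.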


Thanks, this is really helpful. So I would do:

Set up the exons/splice sites

hisat2_extract_splice_sites.py annotation.gtf > splicesites.tsv
hisat2_extract_exons.py annotation.gtf > exons.tsv

Build index

hisat2-build --ss ./splicesites.tsv --exon ./exons.tsv assembly.fa organism_A

hisat2 -> samtools -> stringtie

hisat2 --dta -x ./organism_A -1 ./reads/R1.fastq -2 ./reads/R2.fastq | samtools view -Su | samtools sort - | stringtie -G annotation.gtf -A sample_counts.tsv

Does the above command look correct?
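
For comparison, here is a hedged sketch of the same three steps written out with intermediate files instead of a single pipe, which makes each stage easier to check. It assumes the index basename passed to -x is the one given to hisat2-build (organism_A above), that stringtie takes the coordinate-sorted BAM as a positional argument, and that -A writes a gene abundance table rather than raw counts. The helpers run and map_and_assemble and all output file names are made up for illustration; by default the script only prints the commands, and it executes them when DRY_RUN=0.

```shell
# Hypothetical helper: print each command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Hypothetical wrapper around the three steps; all file names are placeholders.
map_and_assemble() {
    index=$1; r1=$2; r2=$3; gtf=$4; sample=$5

    # Splice-aware alignment to the genome index; --dta reports
    # alignments tailored for transcript assemblers such as StringTie.
    run hisat2 --dta -x "$index" -1 "$r1" -2 "$r2" -S "${sample}.sam"

    # StringTie expects coordinate-sorted alignments.
    run samtools sort -o "${sample}.sorted.bam" "${sample}.sam"

    # -G: reference annotation, -o: assembled transcripts (GTF),
    # -A: per-gene abundance table.
    run stringtie "${sample}.sorted.bam" -G "$gtf" -o "${sample}.gtf" -A "${sample}.gene_abund.tsv"
}

map_and_assemble organism_A reads/R1.fastq reads/R2.fastq annotation.gtf sample
```

Once the separate steps produce sensible output, the SAM file can be skipped by piping hisat2 into samtools sort -o sample.sorted.bam - and running stringtie on the sorted BAM.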
