Question: How to map RNA-seq with hisat2, stringtie, an assembly fasta, a gtf file, and a transcript fasta?
9 weeks ago by
O.rka wrote:

I have the following files: assembly.fa transcripts.fa annotation.gtf

My organism is eukaryotic with introns, so I want to use the hisat2 -> stringtie pipeline.

The example in the link below looks like it maps to the chromosomes/assembly with HISAT2, not to the transcripts. If introns separate two exons, wouldn't a read spanning them only map partially, so that the best approach would be to map to the transcripts?

Does anyone have a way to pipe hisat2 directly into stringtie? I know I'm supposed to use the --dta flag in HISAT2, but I haven't figured out whether I should be mapping to the transcripts or to the assembly.

9 weeks ago by
United States
swbarnes2 wrote:

HISAT2 is a splice-aware aligner. You align to the genome, and it expects many reads to align with large gaps where they span introns. If you look at the index-building step, the index is made with the guidance of a GTF whose features are annotated in genomic coordinates.
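To make the "large gaps" concrete: in SAM output, a spliced alignment carries an N operation in its CIGAR string, which marks a stretch of the reference that the read skips (an intron). Here is a minimal sketch, assuming a made-up CIGAR value and a hypothetical helper called intron_lengths; neither comes from the thread.

```shell
#!/usr/bin/env bash
# Illustrative CIGAR string (invented for this example): 50 bases aligned,
# 1200 reference bases skipped (N, i.e. an intron), then 51 bases aligned.
cigar="50M1200N51M"

# Hypothetical helper: print the length of every N (skipped region)
# operation found in a CIGAR string, one per line.
intron_lengths() {
  printf '%s\n' "$1" | grep -oE '[0-9]+N' | tr -d 'N'
}

introns=$(intron_lengths "$cigar")
echo "intron lengths: $introns"
```

A read that falls entirely inside one exon would have a CIGAR with no N operation, so the helper prints nothing for it.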


Thanks, this is really helpful. So I would do:

Set up the exons/splice sites

annotation.gtf > splicesites.tsv
annotation.gtf > exons.tsv

Build index

hisat2-build --ss ./splicesites.tsv --exon ./exons.tsv assembly.fa organism_A

hisat2 -> samtools -> stringtie

hisat2 --dta -x ./assembly -1 ./reads/R1.fastq -2 ./reads/R2.fastq | samtools view -Su | samtools sort - | stringtie -G annotation.gtf -A sample_counts.tsv

Does the above command look correct?

written 9 weeks ago by O.rka
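For reference, here is one way the steps in the reply above could be written out end to end. It is a sketch under assumptions, not a verified answer for this data set. The extraction scripts hisat2_extract_splice_sites.py and hisat2_extract_exons.py ship with HISAT2. The value passed to -x is the basename given to hisat2-build (organism_A here, not ./assembly). The sorted BAM is written to a file and handed to stringtie as its input argument, with -o naming the output GTF. The thread count and the sample.* output names are placeholders chosen for this example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Input names from the thread; THREADS and the sample.* outputs are assumptions.
GENOME=assembly.fa
GTF=annotation.gtf
INDEX=organism_A          # basename passed to hisat2-build and reused for -x
R1=./reads/R1.fastq
R2=./reads/R2.fastq
THREADS=4

# Step 1: derive splice sites and exons from the GTF, then build the index.
prepare_index() {
  hisat2_extract_splice_sites.py "$GTF" > splicesites.tsv
  hisat2_extract_exons.py "$GTF" > exons.tsv
  hisat2-build --ss splicesites.tsv --exon exons.tsv "$GENOME" "$INDEX"
}

# Step 2: align to the genome with --dta, coordinate-sort the alignments,
# then run stringtie guided by the annotation.
align_and_assemble() {
  hisat2 --dta -p "$THREADS" -x "$INDEX" -1 "$R1" -2 "$R2" \
    | samtools sort -@ "$THREADS" -o sample.sorted.bam -
  stringtie sample.sorted.bam -p "$THREADS" -G "$GTF" \
    -o sample.gtf -A sample_counts.tsv
}

# Nothing runs unless the script is called with the argument "run",
# so the file can be read or sourced without the tools installed.
if [ "${1:-}" = "run" ]; then
  prepare_index
  align_and_assemble
fi
```

Saved as, say, pipeline.sh, this would be started with `bash pipeline.sh run`. If only the transcripts already in annotation.gtf should be quantified, stringtie's -e option (used together with -G) restricts it to those.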