Question: How to map RNA-seq with hisat2, stringtie, an assembly fasta, a gtf file, and a transcript fasta?
O.rka wrote, 6 months ago:

I have the following files: assembly.fa, transcripts.fa, annotation.gtf

My organism is eukaryotic with introns, so I want to use the hisat2 -> stringtie pipeline.

The example in the below link looks like it maps to the chromosomes/assembly with HISAT2, not to the transcripts. If introns separate 2 exons, wouldn't a read spanning them map only partially, so that mapping to the transcripts would be the better way?

Does anyone have a way to pipe hisat2 directly into stringtie? I know I'm supposed to use the --dta flag in HISAT2, but I haven't figured out whether I should be mapping to the transcripts or the assembly.

swbarnes2 (United States) wrote, 6 months ago:

HISAT is a splice-aware aligner. You align to genomes, and it is smart enough to know that many reads will align with large gaps. If you look at the index generating step, it is making the index with the guidance of a gtf with genomic features annotated by genomic coordinates.
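To make "align with large gaps" concrete: in SAM output, a spliced alignment records the skipped intron as an `N` operation in the CIGAR string, so one read can cover two exons on the genome. The snippet below is only an illustrative sketch; `cigar_span` is a hypothetical helper, not part of HISAT2 or samtools, and the CIGAR string is made up. It sums the reference-consuming operations to show how far a read stretches on the genome and how much of that is intron.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<reference span> <intron bases>" for a CIGAR string.
# M, D, =, X and N consume reference bases; N marks a skipped region (an intron).
cigar_span() {
  local cigar=$1 span=0 intron=0 len op
  local re='^([0-9]+)([MIDNSHP=X])(.*)$'
  while [[ $cigar =~ $re ]]; do
    len=${BASH_REMATCH[1]}
    op=${BASH_REMATCH[2]}
    cigar=${BASH_REMATCH[3]}
    case $op in
      M|D|=|X) span=$((span + len)) ;;
      N)       span=$((span + len)); intron=$((intron + len)) ;;
    esac
  done
  echo "$span $intron"
}

# A 100 bp read split 60/40 across a 1500 bp intron (made-up example).
cigar_span 60M1500N40M   # prints: 1600 1500
```

The read is still fully aligned (all 100 bases match); the gap is in the reference coordinates, not in the read, which is why aligning to the genome does not give partial mappings.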


Thanks, this is really helpful. So I would do:

Set up the exons/splice sites

hisat2_extract_splice_sites.py annotation.gtf > splicesites.tsv

hisat2_extract_exons.py annotation.gtf > exons.tsv

Build index

hisat2-build --ss ./splicesites.tsv --exon ./exons.tsv assembly.fa organism_A

hisat2 -> samtools -> stringtie

hisat2 --dta -x ./organism_A -1 ./reads/R1.fastq -2 ./reads/R2.fastq | samtools view -Su | samtools sort - | stringtie -G annotation.gtf -A sample_counts.tsv

Does the above command look correct?
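For comparison, here is one hedged way the steps could be laid out, writing a sorted BAM to disk rather than piping straight into StringTie. Two points it relies on: `hisat2 -x` takes the index basename that was given to `hisat2-build` (here `organism_A`), and StringTie takes the sorted alignment file as a positional input. The names `sample.sorted.bam` and `sample.gtf` are assumptions made for illustration, and the script only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
set -euo pipefail

INDEX=organism_A         # basename passed to hisat2-build and reused by hisat2 -x
BAM=sample.sorted.bam    # hypothetical name for the sorted alignments
OUT_GTF=sample.gtf       # hypothetical name for the StringTie transcript output

# Print the pipeline, one command per line, without executing it.
print_pipeline() {
  cat <<EOF
hisat2_extract_splice_sites.py annotation.gtf > splicesites.tsv
hisat2_extract_exons.py annotation.gtf > exons.tsv
hisat2-build --ss splicesites.tsv --exon exons.tsv assembly.fa ${INDEX}
hisat2 --dta -x ${INDEX} -1 reads/R1.fastq -2 reads/R2.fastq | samtools sort -o ${BAM} -
stringtie ${BAM} -G annotation.gtf -o ${OUT_GTF} -A sample_counts.tsv
EOF
}

print_pipeline
```

To actually run it, the printed commands could be saved and executed, for example `print_pipeline > run.sh && bash -e run.sh`. `samtools sort` accepts the SAM stream from hisat2 directly, so the separate `samtools view` step is not needed in this layout.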

Reply written 6 months ago by O.rka.