Question: Adding the XS tag to HISAT2 SAM files prior to StringTie
13 months ago, Farbod (Toronto) wrote:

Dear Biostars, Hi

I have 6 SAM files (3 for cond1 and 3 for cond2) produced by HISAT2 from mapping HiSeq 2000 RNA-seq data to a newly released draft genome.

Now I want to use StringTie and then proceed to DEG analysis, but the StringTie manual says:

"Every spliced read alignment (i.e. an alignment across at least one junction) in the input SAM file must contain the tag XS to indicate the genomic strand that produced the RNA from which the read was sequenced. Alignments produced by TopHat and HISAT2 (when run with --dta option) already include this tag, but if you use a different read mapper you should check that this XS tag is included for spliced alignments."

I did not use the "--dta" option, but when I checked my SAM files there are some XS tags in them (e.g. YS:i:0 YT:Z:CP XS:A:- NH:i:1).
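Spotting a few XS tags by eye does not show whether every spliced alignment carries one. A minimal sketch of a more systematic check is below. It assumes that a spliced alignment can be recognised by an N operation in the CIGAR field (column 6 of a SAM record) and that the strand tag has the form XS:A:+ or XS:A:-. The function name count_spliced_without_xs is hypothetical, not part of HISAT2 or StringTie.

```shell
# Sketch: count spliced alignments in a SAM file and how many of them lack
# an XS:A:+/- tag. The function name is made up for this example.
count_spliced_without_xs() {
    awk -F'\t' '
        /^@/ { next }                  # skip SAM header lines
        $6 ~ /N/ {                     # CIGAR contains N: alignment spans a junction
            spliced++
            has_xs = 0
            for (i = 12; i <= NF; i++)     # optional tags start at column 12
                if ($i ~ /^XS:A:[+-]$/) has_xs = 1
            if (!has_xs) missing++
        }
        END { printf "spliced=%d missing_XS=%d\n", spliced, missing }
    ' "$1"
}

# Example call (path taken from the mapping command in this post):
# count_spliced_without_xs /RNA_Seq_Data/C1.sam
```

If missing_XS comes out as 0, every spliced alignment in that file has a strand tag. That only addresses the XS requirement quoted above, not the other effects of --dta.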

Q: So, what should I do? Map all the reads to the reference again from the beginning, using HISAT2 with the --dta option, or . . . ?

NOTE: my mapping script for each pair of paired-end read files:

./hisat2 -p 6 -x ht2_base_salmon_genome -1 '/RNA_Seq_Data/C1_clean_left.fq' -2 '/RNA_Seq_Data/C1_clean_right.fq' -S '/RNA_Seq_Data/C1.sam' &> C1.sam.info

13 months ago, lakhujanivijay (India) wrote:

Hi

dta stands for downstream-transcriptome-assembly. Using this option means that your alignments are reported in a form tailored to transcript assemblers. With this option, HISAT2 requires longer anchor lengths for de novo discovery of splice sites. This leads to fewer alignments with short anchors, which helps transcript assemblers improve significantly in computation and memory usage.

StringTie issues the warning below (at this link):

NOTE: be sure to run HISAT2 with the --dta option for alignment, or your results will suffer.

I would say use the --dta option, i.e. map the reads once again.
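As a rough sketch of what re-mapping one sample could look like, the snippet below prints the HISAT2, samtools and StringTie commands without running them. It assumes hisat2, samtools and stringtie are on the PATH, reuses the index name and file naming from the question, and uses a hypothetical helper called pipeline_cmds. The sorting step is included because StringTie expects coordinate-sorted input. The .sorted.bam and .gtf output names are made up for this example.

```shell
# Sketch: print (dry run) the commands to re-map one sample with --dta,
# sort the alignments, and assemble transcripts.
# INDEX and DATA come from the question; output names are assumptions.
INDEX="ht2_base_salmon_genome"
DATA="/RNA_Seq_Data"
THREADS=6

pipeline_cmds() {
    s="$1"   # sample prefix, e.g. C1
    # --dta is a stand-alone flag, so it goes before -x and its index argument
    echo "hisat2 -p $THREADS --dta -x $INDEX -1 $DATA/${s}_clean_left.fq -2 $DATA/${s}_clean_right.fq -S $DATA/${s}.sam"
    # StringTie needs coordinate-sorted alignments
    echo "samtools sort -@ $THREADS -o $DATA/${s}.sorted.bam $DATA/${s}.sam"
    echo "stringtie $DATA/${s}.sorted.bam -p $THREADS -o $DATA/${s}.gtf"
}

pipeline_cmds C1
```

Printing the commands first makes it easy to inspect them. Once they look right, the output can be piped to a shell to execute them.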


Dear @Vijay Lakhujani, Hi and thank you. What do you think about my new script?

./hisat2 -p 6 --dta -x ht2_base_salmon_genome -1 '/RNA_Seq_Data/C1_clean_left.fq' -2 '/RNA_Seq_Data/C1_clean_right.fq' -S '/RNA_Seq_Data/C1.sam' &> C1.sam.info

Or should I add "--ss" and "--exon" to it, too?

Reply written 13 months ago by Farbod