Question: splice junction information with hisat2
blooming.daisy333 wrote (3 months ago):

I am a newbie to Linux and NGS. Can anyone help me get information about splice junctions using HISAT2? The command I am using only writes the alignments to a single file in SAM format. The command is as follows:

./hisat2 -p 64 --max-intronlen 10000 -x /data/memona/hisat2-2.1.0/hisat_index -1 /data/memona/SRR959590_A_1P.fq -2 /data/memona/SRR959590_A_2P.fq -S /data/memona/results/hisat_align.sam &
Juke-34 wrote (3 months ago):

Hi blooming.daisy333,

It is explained in the manual: you have to use the option --novel-splicesite-outfile. Be careful: there is an error in how left splice sites are reported, as I mentioned here.


Hi Juke,

Thanks for the kind guidance. However, I am still unable to get the splice junction information despite using the --novel-splicesite-outfile option.

Here is the command that I used. Kindly point out the mistake and suggest a solution:

./hisat2 --np 0 --pen-noncansplice 10000000 --min-intronlen 20 --max-intronlen 10000 --novel-splicesite-outfile /data/memona/hisat2-2.1.0/result/ --rna-strandness RF --dta -p64 --summary-file -x /data/memona/hisat2-2.1.0/hisat_index -1 /data/memona/Trimmomatic-0.36/SRR959591_E_1P.fq -2 /data/memona/Trimmomatic-0.36/SRR959591_E_2P.fq -S /data/memona/hisat2-2.1.0/result/hisat_align.sam

Thank you so much.

Reply written 3 months ago by blooming.daisy333

You're welcome. The problem is that you provided a directory path but no file name. Instead of "/data/memona/hisat2-2.1.0/result/", use "/data/memona/hisat2-2.1.0/result/splice_sites.tsv" and it should be fine.

Reply written 3 months ago by Juke-34
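
The fix suggested in the reply above can be sketched as a small Python wrapper that refuses a directory as the splice-site output before HISAT2 is ever launched. This is only an illustration: `build_hisat2_command` is a hypothetical helper name, the thread count default is arbitrary, and only the flags already used in this thread (`-p`, `-x`, `-1`, `-2`, `-S`, `--novel-splicesite-outfile`) appear in it. Separately, the manual lists `--summary-file` as taking a path, and in the command above it is followed directly by `-x`, so that may be worth checking as well.

```python
import os


def build_hisat2_command(index, reads1, reads2, sam_out, splice_out, threads=8):
    """Return the hisat2 argument list, refusing a directory as splice-site output.

    Hypothetical helper for illustration; it only assembles arguments and
    does not run HISAT2 itself.
    """
    # A trailing separator or an existing directory means no file name was given.
    if splice_out.endswith(("/", os.sep)) or os.path.isdir(splice_out):
        raise ValueError(
            "--novel-splicesite-outfile needs a file name, got a directory: "
            + splice_out
        )
    return [
        "hisat2",
        "-p", str(threads),
        "-x", index,
        "-1", reads1,
        "-2", reads2,
        "-S", sam_out,
        "--novel-splicesite-outfile", splice_out,
    ]


if __name__ == "__main__":
    # Paths copied from the thread; only the splice-site file name is added.
    result_dir = "/data/memona/hisat2-2.1.0/result"
    command = build_hisat2_command(
        index="/data/memona/hisat2-2.1.0/hisat_index",
        reads1="/data/memona/Trimmomatic-0.36/SRR959591_E_1P.fq",
        reads2="/data/memona/Trimmomatic-0.36/SRR959591_E_2P.fq",
        sam_out=result_dir + "/hisat_align.sam",
        splice_out=result_dir + "/splice_sites.tsv",
    )
    print(" ".join(command))
```

The returned list can be handed to `subprocess.run(command, check=True)` on a machine where `hisat2` is on the `PATH`; other options from the original command can be appended to the list in the same way.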

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

Reply written 3 months ago by WouterDeCoster

Thanks for the kind guidance. Yes, it has produced the output, but it gives only the four fields shown below:

chr1    329728  329839  -
chr1    330066  330757  -
chr1    581256  581357  +

It does not report the canonical/non-canonical status or the donor and acceptor nucleotides, which are important for classifying the site.

Can you please share the script, tool, or command that you used to extract the nucleotide information of the donor and acceptor sites, like AT/AC, GT/AT, CT/GC, etc.?

Further, is it important to build the index with the --ss and --exon options in order to determine the splice sites, or is it OK if the index is built without these options?

Reply written 3 months ago by blooming.daisy333

I don't know the use of those options, but it is certainly fine to determine the splice sites without them.

For the extraction of splice sites I used fasta_domainExtractor_JD.pl from the NBIS/GAAS repository, but it is not really adapted to what you want. You can take inspiration from this script to implement what you really want.

Reply written 3 months ago by Juke-34
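
As a rough illustration of what such a script could look like, here is a minimal Python sketch that reads the genome FASTA and the four-column splice-site file, then reports the donor/acceptor dinucleotides and a class for each junction. Several things in it are assumptions rather than anything confirmed in this thread: the two coordinate columns are taken to be 0-based positions of the exon bases flanking the intron; the `left_shift` parameter is a hypothetical knob for correcting the left coordinate if it turns out to be off (see the caveat about left splice sites above); any strand other than `-` is read on the forward strand; and the class labels follow the common convention that GT/AG is canonical and GC/AG and AT/AC are semi-canonical. All function names are made up for this example.

```python
import io

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Assumed convention, not HISAT2 output: GT/AG canonical; GC/AG and AT/AC
# semi-canonical; every other pair non-canonical.
MOTIF_CLASS = {
    "GT/AG": "canonical",
    "GC/AG": "semi-canonical",
    "AT/AC": "semi-canonical",
}


def read_fasta(handle):
    """Return {sequence_name: sequence}; the name is the header up to the first space."""
    genome, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def junction_motif(genome, chrom, left, right, strand, left_shift=0):
    """Return (donor, acceptor) dinucleotides in transcript orientation.

    Assumes left/right are 0-based positions of the flanking exon bases,
    so the intron occupies seq[left + 1:right].
    """
    seq = genome[chrom]
    intron_start = left + 1 + left_shift
    first2 = seq[intron_start:intron_start + 2].upper()
    last2 = seq[right - 2:right].upper()
    if strand == "-":
        # On the minus strand the donor sits at the right end of the intron.
        return reverse_complement(last2), reverse_complement(first2)
    return first2, last2


def classify_junctions(genome, lines, left_shift=0):
    """Yield (chrom, left, right, strand, motif, class) for each splice-site line."""
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, left, right, strand = fields[0], int(fields[1]), int(fields[2]), fields[3]
        donor, acceptor = junction_motif(genome, chrom, left, right, strand, left_shift)
        motif = donor + "/" + acceptor
        yield chrom, left, right, strand, motif, MOTIF_CLASS.get(motif, "non-canonical")


# Toy example: exon AAAAA, intron GTCCCCAG, exon TTTTT on a made-up sequence chrT.
toy_genome = read_fasta(io.StringIO(">chrT\nAAAAAGTCCCCAGTTTTT\n"))
for row in classify_junctions(toy_genome, ["chrT\t4\t13\t+"]):
    print("\t".join(map(str, row)))  # prints: chrT 4 13 + GT/AG canonical (tab-separated)
```

For real data, the two inputs would be opened as files, for example `read_fasta(open("genome.fa"))` and `classify_junctions(genome, open("splice_sites.tsv"))`, where both file names are placeholders. The sketch keeps the whole genome in memory, which is simple but may be heavy for large genomes. If most junctions come out as non-canonical, the coordinate assumption is the first thing to check, for instance by trying `left_shift=1` or `left_shift=-1` on a few junctions whose motif is already known.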