Question: hisat2 --sra-acc with paired reads producing single read output
15 months ago, avp25 wrote:

Hello there,

I am trying to use the --sra-acc option of hisat2 with paired-end data. I have installed both hisat2 and the sra-toolkit successfully. The mapping itself works fine, but the reads in the SAM output appear to be mapped as single-end reads. My hisat2 command looks like this:

hisat2 --no-mixed --no-discordant -x ../ref/hg38/genome --sra-acc <accession> -S output.sam

where <accession> is a single accession number that links to both 1.fastq.gz and 2.fastq.gz.

Is there any way to tell hisat2 that the accession refers to paired-end reads?

Thanks!

Anna

Tags: hisat2, sra-tools

Check whether the SRA data really contain paired-end reads. I have seen single-end data marked as paired-end; in that case, it was a faulty upload. The bottom line is: don't trust SRA blindly.

written 15 months ago by h.mon

Paired-end reads are aligned together. Many aligners drop the read designations from the read name and encode that information in the SAM flags instead (for example, 83 and 147).

modified 15 months ago • written 15 months ago by genomax
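
To check the flags mentioned in the reply above, the FLAG column (second field) of each SAM record can be decoded bit by bit. The following is a minimal sketch in Python, assuming the bit meanings from the SAM specification; the function names `decode_flag` and `count_paired_records` are made up for this illustration and are not part of hisat2 or any library.

```python
# Bit values of the SAM FLAG field, as defined in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def count_paired_records(sam_lines):
    """Count (paired, unpaired) alignment records in an iterable of SAM lines."""
    paired = unpaired = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x1:
            paired += 1
        else:
            unpaired += 1
    return paired, unpaired


if __name__ == "__main__":
    for flag in (83, 147):
        print(flag, decode_flag(flag))
```

Both 83 and 147 have the "paired" bit (0x1) set; 83 additionally marks the first read of a pair and 147 the second. If `count_paired_records` applied to output.sam reports only paired records, the accession was aligned as paired-end even though the read names carry no mate suffix.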

I don't know about the hisat2 --sra-acc option, but I had the same issue when downloading data with the NCBI SRA Toolkit: the fastq-dump utility gave me a single FASTQ file for a paired-end entry. The --split-files option of fastq-dump solved this.

written 13 months ago by Nitin Narwade
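
The workaround described in the reply above (download the mates separately, then align them explicitly) could look roughly like the sketch below. It only builds and prints the two command lines; it does not run them. The assumptions are: `ACCESSION` is a placeholder for the real accession, the index path is the one from the question, the `--gzip` option is added here so the file names match the .fastq.gz files mentioned in the question, and fastq-dump names the split files `<accession>_1.fastq.gz` and `<accession>_2.fastq.gz`. The helper names are hypothetical.

```python
import shlex


def split_download_cmd(accession):
    """Build a fastq-dump command that writes one file per mate."""
    return ["fastq-dump", "--split-files", "--gzip", accession]


def paired_hisat2_cmd(accession, index, out_sam):
    """Build a hisat2 command that passes both mate files with -1 and -2."""
    return [
        "hisat2", "--no-mixed", "--no-discordant",
        "-x", index,
        "-1", f"{accession}_1.fastq.gz",
        "-2", f"{accession}_2.fastq.gz",
        "-S", out_sam,
    ]


if __name__ == "__main__":
    # "ACCESSION" is a placeholder, not a real SRA accession.
    print(shlex.join(split_download_cmd("ACCESSION")))
    print(shlex.join(paired_hisat2_cmd("ACCESSION", "../ref/hg38/genome", "output.sam")))
```

Passing the two files with -1 and -2 leaves no doubt about the layout, at the cost of storing the FASTQ files locally instead of streaming them with --sra-acc.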
5 weeks ago, poojasethiya wrote:

For single-end SRA data, hisat2 gives the following summary statistics:

HISAT2 summary stats:
        Total reads: 27870948
                Aligned 0 time: 11196555 (40.17%)
                Aligned 1 time: 15284312 (54.84%)
                Aligned >1 times: 1390081 (4.99%)
        Overall alignment rate: 59.83%

For paired-end SRA data, hisat2 gives the following summary statistics:

HISAT2 summary stats:
        Total pairs: 9113937
                Aligned concordantly or discordantly 0 time: 4233427 (46.45%)
                Aligned concordantly 1 time: 4493249 (49.30%)
                Aligned concordantly >1 times: 351445 (3.86%)
                Aligned discordantly 1 time: 35816 (0.39%)
        Total unpaired reads: 8466854
                Aligned 0 time: 6882022 (81.28%)
                Aligned 1 time: 1428012 (16.87%)
                Aligned >1 times: 156820 (1.85%)
        Overall alignment rate: 62.24%

These results show that hisat2 treats paired-end and single-end SRA data differently: the summary reports "Total pairs" for paired-end data and "Total reads" for single-end data.
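
As a rough illustration, the layout can be read off the summary text automatically, and the overall alignment rate can be recomputed from the counts. The sketch below assumes the summary format shown above. The function names are hypothetical, and the formula for the paired-end rate (each pair contributes two reads, and the reads left unaligned are the "Aligned 0 time" entries under "Total unpaired reads") is an interpretation that happens to reproduce the reported figures, not something taken from the hisat2 source.

```python
import re


def summary_layout(summary_text):
    """Return 'paired', 'single' or 'unknown' for a hisat2 summary block."""
    if re.search(r"^\s*Total pairs:", summary_text, re.MULTILINE):
        return "paired"
    if re.search(r"^\s*Total reads:", summary_text, re.MULTILINE):
        return "single"
    return "unknown"


def single_overall_rate(total_reads, unaligned_reads):
    """Percentage of single-end reads aligned at least once."""
    return 100.0 * (total_reads - unaligned_reads) / total_reads


def paired_overall_rate(total_pairs, unaligned_mates):
    """Percentage of reads aligned, counting two reads per pair."""
    total_reads = 2 * total_pairs
    return 100.0 * (total_reads - unaligned_mates) / total_reads


if __name__ == "__main__":
    # Counts copied from the two summaries above.
    print(round(single_overall_rate(27870948, 11196555), 2))
    print(round(paired_overall_rate(9113937, 6882022), 2))
```

With the counts from the two summaries above, the recomputed rates round to 59.83 and 62.24, matching the reported overall alignment rates.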
