Question: hisat2 --sra-acc with paired reads producing single read output
avp25 wrote (2.0 years ago):

Hello there,

I am trying to use the --sra-acc option of hisat2 with paired-end data. I have installed both hisat2 and the sra-toolkit successfully. The mapping runs fine, but the SAM output shows the reads mapped as if they were single-end reads. My hisat2 command looks like this:

hisat2 --no-mixed --no-discordant -x ../ref/hg38/genome --sra-acc <accession> -S output.sam

where <accession> is a single SRA accession number that links to both 1.fastq.gz and 2.fastq.gz.

Is there any way to tell hisat2 that the accession refers to paired-end reads?

Thanks!

Anna


Check whether the SRA data really contain paired-end reads. I have seen single-end data marked as paired-end; in that case, it was a faulty upload. The bottom line: don't trust SRA blindly.

Comment written 2.0 years ago by h.mon

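Paired-end reads are aligned together. Many aligners drop the read designations (such as /1 and /2) from the read name and encode that information in the SAM flags instead. For example, flag 83 marks a first-in-pair read and flag 147 a second-in-pair read, both on the reverse strand and in a proper pair.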

Comment written 2.0 years ago (modified 2.0 years ago) by genomax
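
To make the flag check concrete, here is a minimal sketch in POSIX shell and awk. The helper names describe_flag and count_paired_flags are made up for illustration; the bit values (1, 2, 16, 32, 64, 128) are the ones defined in the SAM specification. It only inspects the FLAG column, so it says nothing about how hisat2 fetched the reads.

```shell
# Hypothetical helper: print the pairing-related bits set in a SAM FLAG value.
describe_flag() {
    flag=$1
    out=""
    [ $(( flag & 1 ))   -ne 0 ] && out="$out paired"
    [ $(( flag & 2 ))   -ne 0 ] && out="$out proper_pair"
    [ $(( flag & 16 ))  -ne 0 ] && out="$out reverse"
    [ $(( flag & 32 ))  -ne 0 ] && out="$out mate_reverse"
    [ $(( flag & 64 ))  -ne 0 ] && out="$out first_in_pair"
    [ $(( flag & 128 )) -ne 0 ] && out="$out second_in_pair"
    echo "${out# }"
}

# Hypothetical helper: count alignment records whose "paired" bit (0x1) is set.
# Prints "<paired records> <total records>"; header lines (starting with @) are skipped.
count_paired_flags() {
    awk -F '\t' '!/^@/ { total++; if ($2 % 2 == 1) paired++ }
                 END   { printf "%d %d\n", paired, total }' "$1"
}
```

Running count_paired_flags output.sam on the file from the question should show whether any record carries the paired bit. If the first number is 0, the reads really were aligned as single-end; if it equals the second number, the pairing is present and only the read names lost their /1 and /2 suffixes.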

I don't know about the hisat2 --sra-acc option, but I had the same issue when downloading data with the NCBI SRA Toolkit: the fastq-dump utility gave me a single FASTQ file for a paired entry. The --split-files option of fastq-dump solved this.

Comment written 23 months ago by Nitin Narwade
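
The two suggestions above (check the layout, then dump with --split-files) could be sketched roughly as follows. This assumes sra-tools and hisat2 are on the PATH and that the accession has no extra technical reads; the function names are hypothetical, and the accession and index prefix are passed in by the caller rather than hard-coded. The layout check relies on the fact that each FASTQ read occupies 4 lines, so one spot dumped with --split-spot gives 4 lines for single-end and 8 for paired-end data.

```shell
# Hypothetical helper: classify layout from the number of FASTQ lines
# produced by ONE spot dumped with --split-spot (4 lines per read).
layout_from_lines() {
    case "$1" in
        4) echo single ;;
        8) echo paired ;;
        *) echo unknown ;;
    esac
}

# Peek at the first spot of an accession ($1). Needs sra-tools and network access.
peek_layout() {
    n=$(fastq-dump -X 1 -Z --split-spot "$1" | wc -l | tr -d ' ')
    layout_from_lines "$n"
}

# Dump mates into <accession>_1.fastq and <accession>_2.fastq, then align them
# explicitly as a pair. $1 = accession, $2 = HISAT2 index prefix.
dump_and_align() {
    fastq-dump --split-files "$1" &&
    hisat2 --no-mixed --no-discordant -x "$2" \
           -1 "${1}_1.fastq" -2 "${1}_2.fastq" -S "${1}.sam"
}
```

Passing the mates with -1 and -2 removes any doubt about how the accession is interpreted, at the cost of storing the FASTQ files locally.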
poojasethiya wrote (10 months ago):

For single-end SRA data, hisat2 gives summary statistics like these:

HISAT2 summary stats:
Total reads: 27870948
                 Aligned 0 time: 11196555 (40.17%)
                 Aligned 1 time: 15284312 (54.84%)
                 Aligned >1 times: 1390081 (4.99%)
         Overall alignment rate: 59.83%

For paired-end SRA data, hisat2 gives summary statistics like these:

HISAT2 summary stats:
        Total pairs: 9113937
                Aligned concordantly or discordantly 0 time: 4233427 (46.45%)
                Aligned concordantly 1 time: 4493249 (49.30%)
                Aligned concordantly >1 times: 351445 (3.86%)
                Aligned discordantly 1 time: 35816 (0.39%)
        Total unpaired reads: 8466854
                Aligned 0 time: 6882022 (81.28%)
                Aligned 1 time: 1428012 (16.87%)
                Aligned >1 times: 156820 (1.85%)
        Overall alignment rate: 62.24%

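These results show that hisat2 treats paired-end and single-end SRA data differently: the paired-end summary reports "Total pairs" with concordant and discordant alignments, while the single-end summary reports only "Total reads".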
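
As a rough sketch, the same distinction can be checked from a saved summary file, and the overall alignment rates above can be reproduced by hand. The function names are hypothetical, and the paired-end formula is inferred from the numbers in this answer (each aligned pair counts as two reads), not quoted from the HISAT2 documentation.

```shell
# Hypothetical helper: report "paired" if a saved HISAT2 summary mentions pairs.
summary_layout() {
    if grep -q 'Total pairs:' "$1"; then echo paired; else echo single; fi
}

# Single-end overall rate (%): reads aligned once or more / total reads.
# args: total aligned_once aligned_multi
rate_single() {
    awk -v t="$1" -v a="$2" -v m="$3" \
        'BEGIN { printf "%.2f\n", 100 * (a + m) / t }'
}

# Paired-end overall rate (%): aligned pairs count twice, plus the mates that
# aligned on their own, over 2 * total pairs.
# args: pairs conc_once conc_multi disc_once unpaired_once unpaired_multi
rate_paired() {
    awk -v p="$1" -v c1="$2" -v cm="$3" -v d="$4" -v u1="$5" -v um="$6" \
        'BEGIN { printf "%.2f\n", 100 * (2 * (c1 + cm + d) + u1 + um) / (2 * p) }'
}

rate_single 27870948 15284312 1390081                      # prints 59.83
rate_paired 9113937 4493249 351445 35816 1428012 156820    # prints 62.24
```

Both values match the "Overall alignment rate" lines in the two summaries, which is a quick sanity check that a run was counted as pairs rather than as independent reads.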
