Warnings in the Hisat2 alignment output files
3.6 years ago
newbie ▴ 120

Before doing the alignment, I ran FastQC on the fastq.gz files and saw adapter content.

So I removed the adapters with cutadapt as follows:

cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT -o tr_sample_R1.fastq.gz -p tr_sample_R2.fastq.gz sample_R1.fastq.gz sample_R2.fastq.gz

I then used the trimmed FASTQ files for alignment with Hisat2 and noticed some warnings in the Hisat2 output file. The alignment finished and the mapping rate is 92%, but what do these warnings mean?

Warning: skipping mate #1 of read 'ST-E00211:161:HMHCYCCXX:1:1101:30005:12226 1:N:0:TTAGGC' because length (1) <= # seed mismatches (0)
Warning: skipping mate #2 of read 'ST-E00211:161:HMHCYCCXX:1:1101:30005:12226 2:N:0:TTAGGC' because length (1) <= # seed mismatches (0)
Warning: skipping mate #1 of read 'ST-E00211:161:HMHCYCCXX:1:1101:30005:12226 1:N:0:TTAGGC' because it was < 2 characters long
Warning: skipping mate #2 of read 'ST-E00211:161:HMHCYCCXX:1:1101:30005:12226 2:N:0:TTAGGC' because it was < 2 characters long
Warning: skipping mate #1 of read 'ST-E00211:161:HMHCYCCXX:1:1101:22343:14951 1:N:0:TTAGGC' because length (0) <= # seed mismatches (0)
Warning: skipping mate #2 of read 'ST-E00211:161:HMHCYCCXX:1:1101:22343:14951 2:N:0:TTAGGC' because length (0) <= # seed mismatches (0)
Warning: skipping mate #1 of read 'ST-E00211:161:HMHCYCCXX:1:1101:22343:14951 1:N:0:TTAGGC' because it was < 2 characters long
Warning: skipping mate #2 of read 'ST-E00211:161:HMHCYCCXX:1:1101:22343:14951 2:N:0:TTAGGC' because it was < 2 characters long
Warning: skipping mate #1 of read 'ST-E00211:161:HMHCYCCXX:1:1101:28250:19698 1:N:0:TTAGGC' because length (0) <= # seed mismatches (0)
Warning: skipping mate #2 of read 'ST-E00211:161:HMHCYCCXX:1:1101:28250:19698 2:N:0:TTAGGC' because length (0) <= # seed mismatches (0)
Warning: skipping mate #1 of read 'ST-E00211:161:HMHCYCCXX:1:1101:28250:19698 1:N:0:TTAGGC' because it was < 2 characters long
Warning: skipping mate #2 of read 'ST-E00211:161:HMHCYCCXX:1:1101:28250:19698 2:N:0:TTAGGC' because it was < 2 characters long
Warning: skipping mate #1 of read 'ST-E00211:161:HMHCYCCXX:1:1101:32197:22915 1:N:0:TTAGGC' because length (0) <= # seed mismatches (0)
Warning: skipping mate #2 of read 'ST-E00211:161:HMHCYCCXX:1:1101:32197:22915 2:N:0:TTAGGC' because length (0) <= # seed mismatches (0)

Are these warnings anything to worry about?

RNA-Seq hisat2 cutadapt alignment • 1.7k views

Do you have reads with no sequence in them? Run grep -A 3 ST-E00211:161:HMHCYCCXX:1:1101:28250:19698 on both trimmed R1/R2 files (decompress them first or use zgrep, since they are gzipped) and see if that is the case.
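If it helps, here is a rough sketch of that check for gzipped files. The function names `short_reads` and `show_read` are just made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines):

```shell
# List reads shorter than a given length in a gzipped FASTQ file.
# Assumes 4-line FASTQ records (sequence on the 2nd line of each record).
short_reads() {
    # $1: FASTQ.gz path, $2: minimum acceptable length (default 2)
    gzip -dc "$1" | awk -v min="${2:-2}" '
        NR % 4 == 1 { name = $0 }
        NR % 4 == 2 && length($0) < min { print name "\t" length($0) }
    '
}

# Print the full 4-line record for one read ID.
# -F treats the ID as a fixed string rather than a regex.
show_read() {
    # $1: read ID, $2: FASTQ.gz path
    gzip -dc "$2" | grep -F -A 3 -- "$1"
}

# Example usage (file names taken from the cutadapt command above):
# short_reads tr_sample_R1.fastq.gz 2
# show_read 'ST-E00211:161:HMHCYCCXX:1:1101:28250:19698' tr_sample_R1.fastq.gz
```

If `short_reads` prints the same read names that appear in the warnings, with lengths of 0 or 1, that would confirm the reads were trimmed down to nothing.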


When I run that grep on the R1/R2 files, I get no output.

3.6 years ago

I think what's happening is that trimming has left some reads far too short: the warnings report lengths of 0 and 1. If those reads were almost entirely adapter, there's no saving them, and since these are warnings rather than errors, you could just let them go.

Or you could tell cutadapt to discard any read shorter than some threshold. If the two files were filtered independently, this would wreck proper read pairing between them, and you'd have to fix that. Cutadapt might handle this for you when both files are given in one paired-end run; read its docs about dealing with paired-end data.
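As a sketch of the length-threshold idea: cutadapt has a `--minimum-length` (`-m`) option, and as far as I know, in paired-end mode it drops the whole pair by default when either mate is too short, so the two output files stay in sync. The wrapper name `trim_pairs` and the default of 20 are my own arbitrary choices, not a recommendation from the cutadapt docs:

```shell
# Trim adapters and discard pairs in which either read ends up shorter
# than the minimum length. Adapter sequences are the ones from the question.
trim_pairs() {
    # $1: R1 input, $2: R2 input, $3: minimum length (20 is an arbitrary default)
    min_len="${3:-20}"
    cutadapt \
        -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA \
        -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT \
        --minimum-length "$min_len" \
        -o "tr_$(basename "$1")" \
        -p "tr_$(basename "$2")" \
        "$1" "$2"
}

# Example: trim_pairs sample_R1.fastq.gz sample_R2.fastq.gz 20
```

After re-trimming this way, the zero-length reads should no longer reach the aligner, so the warnings should disappear.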


Yes, these are only warnings and I don't see any errors. So there won't be any problem with the number of reads, right?
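One way to put a number on it is to compare how many read pairs are named in the warnings against the total read count. This is only a sketch: `count_reads` and `count_skipped` are made-up helper names, and it assumes the warnings were saved to a log file (here hypothetically `hisat2.log`) and that the FASTQ uses 4 lines per record:

```shell
# Count FASTQ records in a gzipped file (4 lines per record assumed).
count_reads() {
    echo $(( $(gzip -dc "$1" | wc -l) / 4 ))
}

# Count distinct read IDs named in "Warning: skipping" lines of a log.
# The ID is the text between the opening quote and the first space.
count_skipped() {
    grep '^Warning: skipping' "$1" \
        | sed "s/^[^']*'\([^ ']*\).*/\1/" \
        | sort -u | wc -l | tr -d ' '
}

# Example usage (hisat2.log is a hypothetical log file name):
# count_reads tr_sample_R1.fastq.gz
# count_reads tr_sample_R2.fastq.gz
# count_skipped hisat2.log
```

If R1 and R2 report the same count and the skipped pairs are a tiny fraction of the total, the effect on downstream counts should be negligible.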


Did you get the final result? Was there any problem with it?
