Hi,
I am completely new to this, any help will be greatly appreciated.
After running hisat2, I get
"Error, fewer reads in file specified with -1 than in file specified with -2
libc++abi.dylib: terminating with uncaught exception of type int
(ERR): hisat2-align died with signal 6 (ABRT)"
What does this mean?
I am trying to match two sets of single-end reads against the reference genome (taken from Ensembl).
I have first built an index:
hisat2-build someSpecies.toplevel.fa someSpeciesIndex
After extracting splice-sites:
hisat2_extract_splice_sites.py someSpecies.gtf > someSpecies.txt
I run hisat2:
hisat2 -x someSpeciesIndex --known-splicesite-infile someSpecies.txt -p 3 -1 ourRead1.fq -2 ourRead2.fq > mapped_reads.sam
Then I get the above message. The resulting mapped_reads.sam is 10.36 GB.
What am I doing wrong?
The data is single-end, but even when I specify --rna-strandness R I get the same error. Any help would be greatly appreciated.
Why are you using the option for paired-end reads if you have single-end data?
According to https://daehwankimlab.github.io/hisat2/manual/ we should use --rna-strandness F (or R) if we have single-end reads?
RNA-strandedness does not per se have anything to do with single-end reads. You need to know if the libraries you are working with were prepared using a stranded protocol (which preserves information about which DNA strand the RNA was transcribed from).
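For single-end data, the read files go to -U (a comma-separated list of unpaired read files) rather than -1/-2, and --rna-strandness is added only when the library is known to be stranded. A minimal sketch, reusing the index, splice-site file and read file names from the question; the helper name hisat2_single_end_cmd is hypothetical, and it only prints the command line so it can be checked before running:

```shell
# Build a single-end HISAT2 command line (hypothetical helper, prints only).
# $1: comma-separated FASTQ list passed to -U
# $2: optional strandness, F or R; leave empty for an unstranded library
hisat2_single_end_cmd() {
    cmd="hisat2 -x someSpeciesIndex --known-splicesite-infile someSpecies.txt -p 3 -U $1"
    if [ -n "${2:-}" ]; then
        cmd="$cmd --rna-strandness $2"
    fi
    echo "$cmd"
}

# Both files aligned together as unpaired reads, unstranded library:
hisat2_single_end_cmd ourRead1.fq,ourRead2.fq

# One file on its own, assuming (not confirmed here) a stranded library of type R:
hisat2_single_end_cmd ourRead1.fq R
```

If the two files are separate samples, it is probably better to run one alignment per file, each with its own output SAM, rather than combining them in a single -U list. Append `> mapped_reads.sam` to the printed command when running it, as in the original command.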
Looks like you don't have the same number of reads in the Read 1 and Read 2 files. This can happen if you did your scanning/trimming independently. You should always trim paired-end reads together to avoid getting reads out of sync.
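A quick way to confirm the mismatch is to count the reads in each file. A rough sketch, assuming uncompressed FASTQ with the usual four lines per record (no wrapped sequence lines); the function name count_reads is made up for this example, and the file names are the ones from the question:

```shell
# Count reads in an uncompressed FASTQ file.
# Assumes exactly four lines per record.
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Report the read count for each file that exists in the current directory.
for f in ourRead1.fq ourRead2.fq; do
    if [ -f "$f" ]; then
        echo "$f: $(count_reads "$f") reads"
    fi
done
```

If the two numbers differ, HISAT2 cannot treat the files as mates when they are given with -1 and -2, which is what the error message is reporting.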
You can use repair.sh from the BBMap suite to bring your paired-end reads back in sync and remove any singletons. A guide is available here.
Sorry, I don't think this is the answer: I am trying to match two sets of reads against the reference genome. Why should it matter if I don't have the same number of reads in Read 1 and Read 2?