Dear All,
I’m building a command line for HISAT2 following the manual. What is not clear to me is how to use the genome index files. I downloaded the reference genome index from the HISAT website (genome.1.ht2 through genome.8.ht2).
My question is: how do I use these? The manual says “any of the index files up to but not including the final .1.ht2”. Can I just pick one, insert it into the command line and remove the others from the folder?
My command would be like
hisat2 -q -x genome -1 mymate1.fq -2 mymate2.fq -S -o resultpath
I know it's not correct, but could you kindly help me fix it?
Thank you
Hi lieven.sterk, there is something wrong in what I'm doing. You say "you need to provide the file name you used for the index", but I didn't build any index. I just downloaded the file from HISAT and obtained this folder with the files genome.1.ht2 and so on. I then use them in this command line because I understood that they are already indexed. Is that right?
Thank you
OK, what you have downloaded is the indexed genome, so you don't need to index it yourself anymore.
Looking at the error you posted, however, it does not look like you have a problem with the index but rather with the input files of your reads. Are both files (Treated_1_m1.fastq and Treated_1_m2.fastq) present in the folder/location where you try to execute hisat2?
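Both points above (the value for -x is the shared prefix of the downloaded index files, and the read files must be reachable from the current directory) can be checked before running the aligner. The following is only a minimal sketch: the helper names index_ok and reads_ok are made up for illustration, and the file names are the ones mentioned in this thread.

```shell
# Sketch: sanity-check a HISAT2 index basename and the read files before aligning.
# index_ok and reads_ok are hypothetical helper names, not part of hisat2.

# Succeeds if <basename>.1.ht2 exists, i.e. the value given to -x
# is the shared prefix of the index files, not one of the files itself.
index_ok() {
    [ -e "$1.1.ht2" ]
}

# Succeeds only if every listed read file exists and is non-empty.
reads_ok() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example with the names used in this thread:
index_ok genome || echo "no index found with basename 'genome'" >&2
reads_ok Treated_1_m1.fastq Treated_1_m2.fastq || echo "check the read paths" >&2
```

If both checks pass silently, the index prefix and the read paths are probably not the cause of the error.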
Yes, they are in the same folder. This is the entire line; I usually use autocompletion to make sure that everything is OK.
I hope it is not a problem due to Miniconda.
Thank you
Have you tried to execute this command line? As long as the relative paths are correct, it should either run fine or generate some useful error messages that we can debug.
Hello Genomax, yes, when I run this command line I get two errors; it doesn't like my mate files.
Thank you
Assuming that the path part is resolved, something must be off with your input files.
Can you post the output of head for both those files? Did you do any manipulations to those input files before using them as input here?
Yes, I did a manipulation: these two fastq files derive from a BAM file. Maybe that is the reason? With salmon they were OK, though. Here are the heads:
Thank you
Hmm, looks fine at first sight.
However, this can't be the complete output from this cat | head cmdline, no? (Unless you only have 1 read in it.) If it is incomplete, can we ask you to always post the complete output of commands, and always the exact output from the command you describe? Thx
Sure, I was trying to make things simpler. Here is the command + output. Thank you to all of you
thx! and it still looks fine ;)
For future reference: if you simply mention you shortened the output it's fine as well (or use | head -4, of course).
So both files seem to be OK. Can you, just for testing, switch them around in your cmdline (-1 Treated_1_m2.fastq and -2 Treated_1_m1.fastq)?
Can you also run the cmdline with a single input file only?
Here is the cmdline with two files swapped and the result didn't change :/
If I run a single file with this command line
I get the list of hisat2 options and functions, so this command is probably wrong?
Thanks
I think that the -S option needs a value, namely a filename to write the SAM output to. If you omit it, the parsing of the options is apparently shifted and the value for -S becomes -1, which could explain the error/behaviour.
As an alternative check for this: there will likely be a file created in your directory called -1
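Looking for such a file needs a little care, because a name that starts with a dash is itself read as an option by most tools unless it is prefixed with ./ (or preceded by --). A small sketch of the check follows; the function name find_dash_files is made up for illustration, and whether hisat2 really leaves a file called -1 behind is only the guess made above.

```shell
# Sketch: list files whose names start with a dash (e.g. "-1"),
# which can appear when an option swallows the next flag as its value.
# find_dash_files is a hypothetical helper name.
find_dash_files() {
    dir="${1:-.}"
    for f in "$dir"/-*; do
        # The glob stays literal when nothing matches, so test for existence.
        [ -e "$f" ] && printf '%s\n' "${f##*/}"
    done
    return 0
}

# Check the current directory:
find_dash_files .
```

If a file named -1 shows up, it can be inspected with head ./-1 and removed with rm ./-1.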
Lieven!!! I probably solved the issue!!!! Geez, I can't believe it! Have a look please
Now I have a summary.bam which I can rename and use for further analysis (I guess)
Thanks
Yes, that looks the way it is supposed to!
As I pointed out, -S needed a correct value (sample.sam in this case).
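For anyone landing here later, the fix can be summarised as a dry run that prints the command and refuses an -S value that is empty or looks like another flag. This is only a sketch: hisat2_cmd is a made-up helper name, the file names are the ones from this thread, and the -o resultpath part of the first attempt is left out on the assumption that the output location is given entirely by -S.

```shell
# Sketch: print (not run) a paired-end hisat2 command and
# reject an -S value that is empty or starts with a dash.
# hisat2_cmd is a hypothetical helper name.
hisat2_cmd() {
    index="$1"; r1="$2"; r2="$3"; out="$4"
    case "$out" in
        ""|-*)
            echo "error: -S needs an output file name, got '$out'" >&2
            return 1
            ;;
    esac
    printf 'hisat2 -q -x %s -1 %s -2 %s -S %s\n' "$index" "$r1" "$r2" "$out"
}

# Prints: hisat2 -q -x genome -1 Treated_1_m1.fastq -2 Treated_1_m2.fastq -S sample.sam
hisat2_cmd genome Treated_1_m1.fastq Treated_1_m2.fastq sample.sam
```

Once the printed line looks right, it can be pasted into the shell and run as is.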