Question: Mapping paired-end reads to a draft genome using HISAT2
Farbod (Toronto) wrote, 12 weeks ago:

Dear Biostars, Hi

I have RNA-seq data from a salmon species (3 cond1 and 3 cond2 biological replicates, all paired-end, so 12 fastq files), and I have already run a Trinity de novo assembly and a DEG analysis on them.

Recently the draft genome of that salmon species has been released.

I want to run a genome-guided analysis followed by a DEG analysis on the same data to compare the results.

With a lot of help from Biostars and @Kevin Blighe, I have chosen the HISAT2 and StringTie approach for this purpose.

First, I built the genome index using:

./hisat2-build -p 6 '/home/Salmon-genome/GCF_salmon_genome.fna' ht2_base_salmon_genome

and now I have 8 *.ht2 files and want to map my reads to that reference.

Q1: Please check the mapping command below, which I found here, and tell me whether it is correct for my case. I especially need help with mapping paired-end data: what should I write instead of sample_1.fastq, given that I have paired-end reads?

./hisat2 -p 6  -x path/to/reference/fileName -1 path/to/fastqFile/sample_1.fastq -2 path/to/fastqFile/sample_2.fastq -S /path/to/outDir/fileName.sam &> /path/to/outDir/fileName.sam.info

Q2: Can I map all 12 files to the reference genome in one script? How? (Is it by using && and repeating the command with different fastq file names?)

Thanks

genomax (United States) wrote, 12 weeks ago:

Q1: Replace sample_1.fastq with your R1 data file and sample_2.fastq with your R2 data file.

Q2: You could follow a parallel approach (see "Script to run blast locally with multiple files in a directory as queries") or use the method described in "Hisat2 multiple paired end reads".
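As a simpler alternative to the parallel approach, a plain loop over sample names can run the same HISAT2 command once per read pair. The sketch below is a minimal example and not the method from either linked thread. It assumes the file naming given later in this thread (J1-left.fq, J1-right.fq, ...), and the directory names reads and mapped are hypothetical. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Sketch: map six paired-end samples with HISAT2, one after another.
# Assumptions: hisat2 is on PATH, reads are in $READS_DIR (hypothetical),
# and files are named <sample>-left.fq / <sample>-right.fq.
INDEX="${INDEX:-ht2_base_salmon_genome}"
READS_DIR="${READS_DIR:-reads}"
OUT_DIR="${OUT_DIR:-mapped}"
THREADS="${THREADS:-6}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = run them

map_sample() {
    sample="$1"
    r1="$READS_DIR/${sample}-left.fq"    # -1: first mate (R1)
    r2="$READS_DIR/${sample}-right.fq"   # -2: second mate (R2)
    sam="$OUT_DIR/${sample}.sam"
    if [ "$DRY_RUN" = "1" ]; then
        echo "hisat2 -p $THREADS -x $INDEX -1 $r1 -2 $r2 -S $sam"
    else
        mkdir -p "$OUT_DIR"
        # HISAT2 prints its alignment summary on stderr; keep it per sample.
        hisat2 -p "$THREADS" -x "$INDEX" -1 "$r1" -2 "$r2" -S "$sam" 2> "$sam.info"
    fi
}

for s in J1 J2 J3 H1 H2 H3; do
    map_sample "$s" || { echo "mapping failed for $s" >&2; exit 1; }
done
```

Compared with chaining six commands with &&, the loop keeps the command in one place, so a change to the options applies to every sample.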

Dear @genomax, Hi and thank you. I guess all other parameters in the script were fine.

Would you please explain what you mean by "R1 data file"?

My fastq file names are:

J1-left.fq, J1-right.fq, J2-left.fq, J2-right.fq, J3-left.fq, J3-right.fq, H1-left.fq, H1-right.fq, H2-left.fq, H2-right.fq, H3-left.fq and H3-right.fq.

written 12 weeks ago by Farbod

The left files are likely equivalent to R1, and the right files should be R2. You can check the read headers of the left and right files. You may see something like:

R1 - @EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG

R2 - @EAS139:136:FC706VJ:2:2104:15343:197393 2:Y:18:ATCACG
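
One way to do this check from the command line is to print the first header line of each file and compare the read name and the mate number. The sketch below is illustrative and not from the thread: it builds two tiny stand-in files from the example headers above, and it assumes the header style shown there (read name, a space, then the mate number before the first colon). Other FASTQ header styles would need a different split.

```shell
# Sketch: compare the first read header of a left/right FASTQ pair.
# The demo files are stand-ins; point the functions at your own files,
# e.g. J1-left.fq and J1-right.fq.
DEMO="$(mktemp -d)"
printf '@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG\nACGT\n+\nIIII\n' > "$DEMO/J1-left.fq"
printf '@EAS139:136:FC706VJ:2:2104:15343:197393 2:Y:18:ATCACG\nTGCA\n+\nIIII\n' > "$DEMO/J1-right.fq"

# Read name: everything before the first space on the first line.
read_name() { head -n 1 "$1" | cut -d ' ' -f 1; }
# Mate number: first colon-separated field after the space (1 = R1, 2 = R2).
mate_number() { head -n 1 "$1" | cut -d ' ' -f 2 | cut -d ':' -f 1; }

for f in "$DEMO/J1-left.fq" "$DEMO/J1-right.fq"; do
    echo "$(basename "$f"): name=$(read_name "$f") mate=$(mate_number "$f")"
done
```

If the two files report the same read name with mate numbers 1 and 2, they can be passed to -1 and -2 respectively.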
written 12 weeks ago by genomax

Hi there, I ran this command:

./hisat2 -p 6 -x ht2_base_salmon_genome -1 '/media/Seagate Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_left.fq' -2 '/media/Seagate Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_right.fq' -S H1.sam &> H1.sam.info

and then I received this warning in the H1.sam.info file:

Warning: Same mate file "/media/Seagate" appears as argument to both -1 and -2

Extra parameter(s) specified: "Backup", "Plus", "Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_left.fq", "Backup", "Plus", "Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_right.fq"

Note that if <mates> files are specified using -1/-2, a <singles> file cannot also be specified. Please run bowtie separately for mates and singles.

Error: Encountered internal HISAT2 exception (#1)

Command: /home/Desktop/salmon-genome-2018/hisat2-2.1.0/hisat2-align-s --wrapper basic-0 -p 6 -x ht2_base_salmon_genome -S H1.sam -1 /media/Seagate -2 /media/Seagate Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_left.fq Backup Plus Drive/Bioinformatics/RNA-Seq/RNA_Seq_Data/clean_reads/H1/H1_clean_right.fq

(ERR): hisat2-align exited with value 1

written 12 weeks ago by Farbod

You would be wise to remove spaces in file/directory paths. I suggest you rename Seagate Backup Plus Drive as Seagate_Backup_Plus_Drive
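
The command echoed in the error log shows -1 /media/Seagate, which suggests the quoted paths were split at the spaces before they reached hisat2-align-s, so quoting alone does not help here. If renaming the drive is not possible, one possible workaround (not mentioned in this thread) is a symlink with a space-free name that points at the reads directory. The sketch below demonstrates the idea on a temporary directory; WORK, SRC and LINK are hypothetical names, and the temporary directory stands in for the external drive.

```shell
# Sketch: reach a directory whose path contains spaces through a symlink
# whose own path has no spaces.
WORK="$(mktemp -d)"
SRC="$WORK/Seagate Backup Plus Drive/clean_reads/H1"   # stand-in for the real path
LINK="$WORK/H1_reads"                                  # space-free alias

mkdir -p "$SRC"
printf '@read1\nACGT\n+\nIIII\n' > "$SRC/H1_clean_left.fq"

# Create the alias; the files stay where they are on the drive.
ln -s "$SRC" "$LINK"

# The reads are now reachable through a path without spaces.
ls "$LINK"
```

With a real drive, the link would be created once (for example in the home directory) and the -1 and -2 arguments would then use the link path instead of the mount path.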

written 12 weeks ago by genomax

I know that this is not a bioinformatics question, but do you have any idea how I can change the name of my external hard drive in Ubuntu 14.04?

I tried and got this error:

Cannot change label on mounted device of type filesystem:ntfs. (udisks-error-quark, 11)

written 12 weeks ago by Farbod

See if this helps: https://askubuntu.com/questions/731579/problem-in-renaming-the-partition-and-opening-a-file-through-terminal

If your drive is formatted as NTFS (the Windows file system), are you able to write to it under Ubuntu? As I recall, that requires installing some additional software (which you may have done).

written 12 weeks ago by genomax