Closed:Unmapped reads to fastq (hisat2 / samtools / bedtools) not matching pair-end reads
0
0
Entering edit mode
5.0 years ago

I want to obtain a fastq file containing reads (RNASeq) that failed to map to a reference genome.

hisat2 -p 3 -x index -1 TD1.R1.fastq.gz -2 TD1.R2.fastq.gz -S out.sam --un-conc out.un.sam --summary-file out.info

I obtained that way 4 files (.sam + .info + out.un.1.sam + out.un.2.sam). Those ".un" files seems like the paired end reads that failed to map. The problem is that the bam2fastq scripts out there want only one bam and create two fastq files.

$ bedtools bamtofastq -i aln.qsort.bam \
                      -fq aln.end1.fq \
                      -fq2 aln.end2.fq

What can I use to take both .un files and create the two fastq files that I need? thanks!

Update: At the time I do the following

#align against first genome
hisat2 -p 3 -x index/svevo -1 reads/TD1.R1.fastq.gz -2 reads/TD1.R2.fastq.gz -S data/landic_svevo_1.sam --un-conc data/landic_svevo_1.unmapped.sam --summary-file data/landic_svevo_1.sam.info
# unmapped files by hisat2 are broken, error when converting to bam

#get unmapped reads with samtools
samtools view -h -f 4 data/landic_svevo_1.sam > data/landic_svevo_1.un.sam

#convert to bam
samtools view -b -S data/landic_svevo_1.un.sam > data/landic_svevo_1.un.bam

#get fastq from unmapped bam
samtools fastq -1 data/landic_svevo_1.un.1.fastq -2 data/landic_svevo_1.un.2.fastq data/landic_svevo_1.un.bam

#align unmapped against second genome
hisat2 -p 3 -x index/zavitan -1 data/landic_svevo_1.un.1.fastq -2 data/landic_svevo_1.un.2.fastq -S data/landic_un_zavitan_1.sam --un data/landic_un_zavitan_1.un.sam --summary-file data/landic_un_zavitan_1.sam.info

But get the next error with hisat2 at the end:

Error, fewer reads in file specified with -1 than in file specified
with -2 terminate called after throwing an instance of 'int' Aborted
(core dumped)
transcriptomics unmapped • 464 views
ADD COMMENT
This thread is not open. No new answers may be added
Traffic: 1832 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6