Get the IDs for paired-end reads from a BAM file, including /1 and /2
4.4 years ago
User000 ▴ 690

Hello,

I want to extract the read IDs from a BAM file, and I use the following command:

samtools view {input.unmerged} | cut -f 1 | awk '!x[$0]++' > {output.txt}

This is output.txt:

HISEQ1:105:C0A57ACXX:2:1105:12172:84568
HISEQ1:105:C0A57ACXX:2:1108:17762:41110

Now I want to get the FASTQ records for these IDs, so I use:

seqkit grep -f output.txt myfile.fastq

I get an empty result. When I checked my FASTQ file, I realised that the IDs there have /1 and /2 appended:

@HISEQ1:105:C0A57ACXX:2:1105:12172:84568/1
@HISEQ1:105:C0A57ACXX:2:1108:17762:41110/2

My question: is something missing when I extract the IDs from the BAM file? Why doesn't output.txt have /1 and /2 specified? And how can I solve this?
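
A possible explanation: SAM/BAM records the mate in the FLAG field (bit 64 for first in pair, bit 128 for second in pair) rather than in the read name, so the names come out without a suffix. Below is a minimal sketch of rebuilding the suffix from the FLAG column. It assumes the FASTQ names are exactly the BAM names plus /1 or /2; `add_mate_suffix` is a made-up helper name, not part of samtools or seqkit.

```shell
# Hypothetical helper: reads SAM records (as printed by `samtools view`,
# without header lines) on stdin and prints each read name with /1 or /2
# appended, decided by FLAG bits 64 (first in pair) and 128 (second in pair).
add_mate_suffix() {
  awk -F '\t' '
    {
      flag = $2
      if (int(flag / 64) % 2 == 1)        name = $1 "/1"   # bit 64 set
      else if (int(flag / 128) % 2 == 1)  name = $1 "/2"   # bit 128 set
      else                                name = $1        # neither mate bit set
      if (!seen[name]++) print name                        # keep first occurrence only
    }'
}

# Intended use, with the placeholders from the post (assumption, untested here):
# samtools view {input.unmerged} | add_mate_suffix > {output.txt}
```

The resulting list could then be passed to seqkit grep -f as before, since the names would now carry the same suffix as the FASTQ headers.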

EDIT: Maybe I should adjust the -f flag when I extract my unmapped reads?

samtools view -u -f 12 -F 256 -@ {threads} {input.merged} > {output.unmapped}
Should it be -f 204 instead (204 = 12 + 64 + 128)?
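
A note on the flag arithmetic: samtools view -f keeps a read only if every bit in the given value is set in its FLAG. Since 204 includes both 64 (first in pair) and 128 (second in pair), a normal paired read would not be expected to pass it; 76 (12 + 64) and 140 (12 + 128) would select each mate separately. A small sketch of that check, where `has_all_bits` is a hypothetical helper used only for illustration:

```shell
# Hypothetical helper: succeeds if FLAG $1 has every bit of mask $2 set,
# which mirrors the rule applied by `samtools view -f`.
has_all_bits() {
  [ $(( $1 & $2 )) -eq "$2" ]
}

# 77  = 1 + 4 + 8 + 64  (paired, unmapped, mate unmapped, first in pair)
# 141 = 1 + 4 + 8 + 128 (paired, unmapped, mate unmapped, second in pair)
has_all_bits 77 12   && echo "FLAG 77 passes -f 12"
has_all_bits 77 204  || echo "FLAG 77 fails -f 204"
has_all_bits 77 76   && echo "FLAG 77 passes -f 76"
has_all_bits 141 140 && echo "FLAG 141 passes -f 140"
```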
samtools seqkit • 835 views