How to split a coordinate-sorted BAM file by read group
5.8 years ago

Hi, I have a coordinate-sorted BAM file whose header contains the following read group information:

@RG ID:0 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:1:none
@RG ID:1 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:2:none
@RG ID:2 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:3:none
@RG ID:3 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:4:none
@RG ID:4 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:5:none
@RG ID:5 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:6:none
@RG ID:6 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:7:none
@RG ID:7 PL:ILLUMINA SM:COLO829_Normal_Tgen PU:H0CGCADXX:8:none

I want to extract only the reads with ID:0, and I tried these commands:

samtools view -b bam -r '@RG\tID:0\tPL:ILLUMINA\tSM:COLO829_Normal_Tgen\tPU:H0CGCADXX:1:none' ~/mixted.bam > rg_0.bam

and

samtools split -f '@RG\tID:0\tPL:ILLUMINA\tSM:COLO829_Normal_Tgen\tPU:H0CGCADXX:1:none' ~/mixted.bam > rg_0.bam

In both cases the output BAM file contained only the header and no reads.

Can someone help me with this? Thank you!

5.8 years ago

Hello jing.mengrabbit,

You are using the command incorrectly. The -r option of samtools view expects only the ID value of the @RG line, not the entire header line:

$ samtools view -b -r 0 ~/mixted.bam > rg_0.bam
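To see which ID values can be passed to -r, the @RG lines can be pulled out of the header first. The following is only a minimal sketch: rg_ids is a hypothetical helper name (not part of samtools), and it assumes the header is tab-separated as in a regular SAM header.

```shell
# Hypothetical helper: print the ID of every @RG line
# found in a SAM header that is read from stdin.
rg_ids() {
    awk -F'\t' '$1 == "@RG" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^ID:/) print substr($i, 4)
    }'
}

# Typical use (requires samtools and the BAM file from the question):
#   samtools view -H ~/mixted.bam | rg_ids           # list the read group IDs
#   samtools view -b -r 0 ~/mixted.bam > rg_0.bam     # keep only read group 0
#   samtools view -c rg_0.bam                         # count the reads that were kept
```

If the count from the last command is zero, the ID passed to -r most likely does not match any of the IDs listed in the header.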

samtools split has no option to extract just one read group; it creates a new file for every read group it finds. Its -f option sets the output filename format and does not select a read group.

samtools split [options] merged.sam|merged.bam|merged.cram

Splits a file by read group.

Options:

-u FILE1
Put reads with no RG tag or an unrecognised RG tag into FILE1

-u FILE1:FILE2
As above, but assigns an RG tag as given in the header of FILE2

-f STRING
Output filename format string (see below) ["%*_%#.%."]

-v
Verbose output
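As a rough illustration of the -f format string: to my knowledge, %* stands for the input basename, %# for the index of the @RG line, %! for the @RG ID and %. for the extension of the output format. The sketch below wraps the call in a hypothetical helper, split_by_rg, with a DRY_RUN switch that is purely an assumption of this example, so the command can be previewed before any files are written.

```shell
# Hypothetical helper: split a BAM file into one file per read group.
# With DRY_RUN=1 it only prints the command instead of running it.
split_by_rg() {
    bam=$1
    # %* = input basename, %! = @RG ID, %. = extension of the output format
    fmt='%*_%!.%.'
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf "samtools split -f '%s' %s\n" "$fmt" "$bam"
    else
        samtools split -f "$fmt" "$bam"
    fi
}

# Preview the command for the file from the question:
DRY_RUN=1 split_by_rg ~/mixted.bam
```

For the header shown in the question this should yield names such as mixted_0.bam through mixted_7.bam. Because the IDs there are simply 0 to 7, the default format with %# would presumably give the same names.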

fin swimmer


Thanks for your reply. It works now with your help!


Glad I could help.

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.


fin swimmer
