Question: Adding read group to bam files from multiplexed samples
12 weeks ago, serpalma.v (Germany) wrote:

Hello

I have 60 samples (samp1...samp60). Each one was barcoded and then pooled (10 samples per pool, 6 pools).

Each pool was sequenced in 9 lanes.

This leads to 1080 FASTQ files (60 samples * 9 lanes * 2 for paired-end) and 540 BAM files.

I want to do variant calling with GATK.

I went through these two very informative posts:

https://gatkforums.broadinstitute.org/gatk/discussion/6472/read-groups

Read Group In Sam/Bam Files: What Do They Exactly Describe?

Accordingly, I am trying to define the read groups for each bam file, as follows.

  • ID: flowcell ID and lane ID (e.g. HNTW5BBXX_1)
  • SM: the name of the sample (e.g. samp31)
  • PL: ILLUMINA
  • LB: lib_samp31
  • PI: insert size (e.g. 200)
  • PU: flowcell ID, lane ID and sample ID (e.g. HNTW5BBXX_1_samp31)
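As an illustration only, the scheme listed above could be turned into a SAM `@RG` header line with a small helper. This is a minimal sketch: the function name `read_group_line` and its parameters are hypothetical, and the field layout simply mirrors the list in the question rather than any official recommendation.

```python
def read_group_line(flowcell, lane, sample, insert_size=None):
    """Build a tab-separated SAM @RG header line for one sample on one lane.

    Hypothetical helper: the field values follow the scheme proposed in the
    question (ID = flowcell_lane, PU = flowcell_lane_sample, LB = lib_<sample>).
    """
    fields = [
        ("ID", f"{flowcell}_{lane}"),
        ("SM", sample),
        ("PL", "ILLUMINA"),
        ("LB", f"lib_{sample}"),
        ("PU", f"{flowcell}_{lane}_{sample}"),
    ]
    # PI (predicted insert size) is optional, so only add it when given.
    if insert_size is not None:
        fields.append(("PI", str(insert_size)))
    return "@RG\t" + "\t".join(f"{tag}:{value}" for tag, value in fields)


# Example with the values used in the question.
print(read_group_line("HNTW5BBXX", 1, "samp31", insert_size=200))
```

Building the line in one place keeps the 540 per-lane files consistent, since only the flowcell, lane and sample name vary between them.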

I would like to clarify the following:

  • Did I get something wrong interpreting the fields?
  • Could I exclude PU, since it is not required by GATK according to the link above? Do you usually include it anyway?

Thanks in advance!

Tags: bam, picard, gatk

Unless you have QC reasons to say that a lane did poorly, you should concatenate all 9 lanes together for each sample. Keeping them separate is doing you no favors. Merge the bams now before you do more.

written 12 weeks ago by swbarnes2
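The merge suggested in this reply could be scripted roughly as follows. This is a sketch under stated assumptions: the per-lane BAMs are assumed to be coordinate-sorted and named `<sample>_<lane>.bam` (a hypothetical naming scheme, not taken from the thread), and the helper `merge_commands` only builds `samtools merge` argument lists without running them.

```python
from collections import defaultdict


def merge_commands(bam_paths):
    """Group per-lane BAMs by sample and build one `samtools merge` call each.

    Assumes hypothetical file names of the form <sample>_<lane>.bam, so the
    sample name is everything before the first underscore.
    """
    by_sample = defaultdict(list)
    for path in bam_paths:
        file_name = path.rsplit("/", 1)[-1]   # drop any directory part
        sample = file_name.split("_", 1)[0]   # e.g. "samp31"
        by_sample[sample].append(path)
    # samtools merge takes the output file first, then the input files.
    return {
        sample: ["samtools", "merge", f"{sample}.merged.bam", *sorted(paths)]
        for sample, paths in by_sample.items()
    }


# Example with three hypothetical per-lane files from two samples.
for cmd in merge_commands(
    ["samp31_L2.bam", "samp31_L1.bam", "samp32_L1.bam"]
).values():
    print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` once the file names have been checked against the real data.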

I read here that keeping bams separated during pre-processing is reasonable. And also, the way I understood it, for each sample, every bam file corresponds to a different read group, as they are derived from reads produced by different lanes.

modified 12 weeks ago • written 12 weeks ago by serpalma.v

Five-year-old recommendations are no longer relevant; just concatenate the lanes together.

written 12 weeks ago by Devon Ryan

So then the read groups should be as follows:

  • ID: samp31
  • SM: samp31
  • PL: ILLUMINA
  • LB: samp31

Not sure about keeping PI and PU now...

Correct?

modified 12 weeks ago • written 12 weeks ago by serpalma.v
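For illustration, the sample-level scheme proposed in this comment could be applied to a merged BAM with Picard's AddOrReplaceReadGroups. The sketch below only builds the command as an argument list. The helper name `add_read_group_command`, the `picard.jar` path and the file names are assumptions. As far as I know, the Picard tool treats the platform unit (RGPU) as a required argument, so the sample name is passed as a placeholder when nothing better is available.

```python
def add_read_group_command(sample, in_bam, out_bam, platform_unit=None,
                           picard_jar="picard.jar"):
    """Build a Picard AddOrReplaceReadGroups call using one read group per sample.

    Hypothetical helper: ID, SM and LB are all set to the sample name, as in
    the comment above. RGPU falls back to the sample name as a placeholder.
    """
    read_group = {
        "RGID": sample,
        "RGSM": sample,
        "RGPL": "ILLUMINA",
        "RGLB": sample,
        "RGPU": platform_unit or sample,  # placeholder if no unit is known
    }
    return [
        "java", "-jar", picard_jar, "AddOrReplaceReadGroups",
        f"I={in_bam}",
        f"O={out_bam}",
        *[f"{key}={value}" for key, value in read_group.items()],
    ]


# Example for one sample; the file names are made up for illustration.
print(" ".join(add_read_group_command("samp31", "samp31.merged.bam",
                                      "samp31.rg.bam")))
```

Note that this replaces all existing read groups in the file with a single one, so any per-lane information is lost at this step.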