Read group: ID, PU definition and multi-lane (same sample)
7 weeks ago
emmanouil.a ▴ 80

Hi,

I'd like to confirm that what I'm doing is correct.

1) I found several definitions of ID and PU, and this is the one I plan to use.

Example read header: @A00155:140:HHTKFDSXX:1:1101:3423:1000 1:N:0:CAGTGACT+CGAGGCGT

PU1=A00155 ### instrument
PU2=140 ### run
FL=HHTKFDSXX ### flowcell
LN=1 ### lane
LB ### library ID

ID=${PU1}.${PU2}
PU=${FL}.${LN}.${PU2}
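
As a minimal sketch of the scheme above, the fields could be pulled out of the read header like this. It assumes the header follows the colon-separated layout of the example (instrument:run:flowcell:lane:...); the function name `read_group_fields` is hypothetical, and the ID/PU composition simply mirrors the definitions in this post rather than any official recommendation.

```python
def read_group_fields(header: str) -> dict:
    """Split a read header into the fields named in the post (hypothetical helper)."""
    # Drop the leading '@' and the comment that follows the first space.
    name = header[1:] if header.startswith("@") else header
    name = name.split(" ", 1)[0]

    # Assumed layout: instrument:run:flowcell:lane:tile:x:y
    instrument, run, flowcell, lane = name.split(":")[:4]

    return {
        "PU1": instrument,
        "PU2": run,
        "FL": flowcell,
        "LN": lane,
        # Composition taken from the post: ID=${PU1}.${PU2}, PU=${FL}.${LN}.${PU2}
        "ID": f"{instrument}.{run}",
        "PU": f"{flowcell}.{lane}.{run}",
    }


fields = read_group_fields(
    "@A00155:140:HHTKFDSXX:1:1101:3423:1000 1:N:0:CAGTGACT+CGAGGCGT"
)
print(fields["ID"], fields["PU"])
```

For the example header this gives ID=A00155.140 and PU=HHTKFDSXX.1.140.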

2) For multi-lane data, I understand that LB matters for the MarkDuplicates step and ID/PU matter for BQSR.

  • For MarkDuplicates, I should give as input all BAM files from the same LB (of the same sample), correct?

  • When I have two libraries for the same sample, I run MarkDuplicates once per library and then give both output files (from MarkDuplicates) as input to BQSR, correct?
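
To make the workflow I'm describing concrete, here is a rough sketch that groups per-lane BAMs by library and builds the corresponding command lines without running them. The file names, sample name, reference and known-sites paths are placeholders, and `plan_commands` is a hypothetical helper; it only illustrates the plan in the two bullets above, which is what I'd like confirmed.

```python
from collections import defaultdict


def plan_commands(lane_bams, sample, reference, known_sites):
    """Build GATK command lines for ONE sample (hypothetical helper).

    lane_bams: list of (library_id, bam_path) tuples, one per lane-level BAM.
    """
    # Group lane-level BAMs by library (LB).
    by_library = defaultdict(list)
    for library, bam in lane_bams:
        by_library[library].append(bam)

    commands = []
    dedup_bams = []

    # One MarkDuplicates run per library, fed with all lanes of that library.
    for library in sorted(by_library):
        out_bam = f"{sample}.{library}.dedup.bam"
        cmd = ["gatk", "MarkDuplicates"]
        for bam in by_library[library]:
            cmd += ["-I", bam]
        cmd += ["-O", out_bam, "-M", f"{sample}.{library}.metrics.txt"]
        commands.append(cmd)
        dedup_bams.append(out_bam)

    # One BaseRecalibrator run for the sample, fed with every deduplicated BAM.
    bqsr = ["gatk", "BaseRecalibrator", "-R", reference, "--known-sites", known_sites]
    for bam in dedup_bams:
        bqsr += ["-I", bam]
    bqsr += ["-O", f"{sample}.recal.table"]
    commands.append(bqsr)
    return commands


# Placeholder inputs: two lanes of library lib1, one lane of library lib2.
commands = plan_commands(
    [("lib1", "S1.lib1.L1.bam"), ("lib1", "S1.lib1.L2.bam"), ("lib2", "S1.lib2.L1.bam")],
    sample="S1",
    reference="ref.fasta",
    known_sites="known.vcf.gz",
)
for cmd in commands:
    print(" ".join(cmd))
```

With these placeholder inputs the plan contains two MarkDuplicates commands (one per library) followed by a single BaseRecalibrator command that takes both deduplicated BAMs.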

Many thanks for your time!
