Question: Multi-part TCGA BAMs?
Asked 3 months ago by lucas.lochovsky:

I'm working with TCGA WGS BAMs through the Google Cloud Platform, and I've seen that there are files where the barcode is identical, but there is an additional number appended to the filename before the .bam extension. For example,

G32450.TCGA-FD-A3N5-01A-11D-A21A-08.1.bam

G32450.TCGA-FD-A3N5-01A-11D-A21A-08.3.bam

G32450.TCGA-FD-A3N5-10A-01D-A21A-08.2.bam

G32450.TCGA-FD-A3N5-10A-01D-A21A-08.4.bam

Do these correspond to multiple parts of one BAM that should be merged before analysis? Or is this something else?

Tags: cancer, bam, data
Answer written 3 months ago by Kevin Blighe (Republic of Ireland):

These 2 are tumour, based on the sample-type code in the barcode (01, immediately after the participant ID A3N5):

  • G32450.TCGA-FD-A3N5-01A-11D-A21A-08.1.bam
  • G32450.TCGA-FD-A3N5-01A-11D-A21A-08.3.bam

These 2 are from normal tissue (sample-type code 10):

  • G32450.TCGA-FD-A3N5-10A-01D-A21A-08.2.bam
  • G32450.TCGA-FD-A3N5-10A-01D-A21A-08.4.bam

All of these, however, are from the Broad Institute, as judged by the centre code 08 at the end of the barcode.

Unless otherwise stated, the assumption, based on the 'rules' of naming samples, is that files sharing a barcode are from the same aliquot (so here: 1 tumour aliquot; 1 normal aliquot).
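
To make the barcode reading above concrete, here is a minimal Python sketch that splits these filenames into barcode fields and groups files by barcode. The function names (`parse_tcga_filename`, `group_by_barcode`) are made up for illustration, and the regular expression assumes the standard aliquot-level barcode layout (TSS, participant, sample + vial, portion + analyte, plate, centre); filenames that deviate from it will simply be rejected. It only reads the names, so it says nothing about what the trailing number means.

```python
import re
from collections import defaultdict

# Aliquot-level barcode: TCGA-TSS-Participant-SampleVial-PortionAnalyte-Plate-Centre
BARCODE_RE = re.compile(
    r"TCGA-(?P<tss>\w{2})-(?P<participant>\w{4})"
    r"-(?P<sample>\d{2})(?P<vial>[A-Z])"
    r"-(?P<portion>\d{2})(?P<analyte>[A-Z])"
    r"-(?P<plate>\w{4})-(?P<center>\d{2})"
)


def tissue_from_sample_code(code):
    """Map the 2-digit sample-type code to a broad class (01-09, 10-19, 20-29)."""
    n = int(code)
    if 1 <= n <= 9:
        return "tumour"
    if 10 <= n <= 19:
        return "normal"
    if 20 <= n <= 29:
        return "control"
    return "unknown"


def parse_tcga_filename(name):
    """Return the barcode fields found in a BAM filename (hypothetical helper)."""
    m = BARCODE_RE.search(name)
    if m is None:
        raise ValueError(f"no TCGA barcode found in {name!r}")
    fields = m.groupdict()
    fields["barcode"] = m.group(0)
    fields["tissue"] = tissue_from_sample_code(fields["sample"])
    # Whatever number sits between the barcode and the .bam extension, if any.
    suffix = re.fullmatch(r"\.(\d+)\.bam", name[m.end():])
    fields["suffix"] = suffix.group(1) if suffix else None
    return fields


def group_by_barcode(names):
    """Group filenames that share exactly the same barcode."""
    groups = defaultdict(list)
    for name in names:
        groups[parse_tcga_filename(name)["barcode"]].append(name)
    return dict(groups)


if __name__ == "__main__":
    files = [
        "G32450.TCGA-FD-A3N5-01A-11D-A21A-08.1.bam",
        "G32450.TCGA-FD-A3N5-01A-11D-A21A-08.3.bam",
        "G32450.TCGA-FD-A3N5-10A-01D-A21A-08.2.bam",
        "G32450.TCGA-FD-A3N5-10A-01D-A21A-08.4.bam",
    ]
    for barcode, members in group_by_barcode(files).items():
        info = parse_tcga_filename(members[0])
        print(barcode, info["tissue"], "centre", info["center"], members)
```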

Check the BAM [SAM] headers to see whether the files have also been processed in the same way. The commands used for alignment should be recorded in the header (the @PG lines). In some situations, I've spotted TCGA BAMs from the same project aligned to different genomes.
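
As a rough sketch of that header check in Python: the code below takes header text that you have already dumped (for example with `samtools view -H file.bam > file.header.txt`) and summarises the reference sequences (@SQ), read groups and samples (@RG), and recorded command lines (@PG), so that two files can be compared. The helper names (`parse_sam_header`, `summarise_header`, `differing_fields`) are hypothetical, and it assumes the standard tab-separated TAG:VALUE header layout; whether a @PG line actually carries a CL field depends on how the file was produced.

```python
def parse_sam_header(header_text):
    """Return {record_type: [{tag: value, ...}, ...]} for SAM header lines."""
    records = {}
    for line in header_text.splitlines():
        # Skip anything that is not a header line, and free-text @CO comments.
        if not line.startswith("@") or line.startswith("@CO"):
            continue
        parts = line.split("\t")
        record_type = parts[0][1:]  # e.g. "SQ", "RG", "PG"
        fields = {}
        for part in parts[1:]:
            tag, sep, value = part.partition(":")
            if sep:
                fields[tag] = value
        records.setdefault(record_type, []).append(fields)
    return records


def summarise_header(header_text):
    """Pull out the fields most useful for comparing two BAMs."""
    rec = parse_sam_header(header_text)
    return {
        # Reference contig names and lengths reveal the genome build used.
        "reference": [(sq.get("SN"), sq.get("LN")) for sq in rec.get("SQ", [])],
        "samples": sorted({rg["SM"] for rg in rec.get("RG", []) if "SM" in rg}),
        "read_groups": sorted(rg["ID"] for rg in rec.get("RG", []) if "ID" in rg),
        "commands": [pg["CL"] for pg in rec.get("PG", []) if "CL" in pg],
    }


def differing_fields(header_a, header_b):
    """List the summary fields that differ between two header texts."""
    a, b = summarise_header(header_a), summarise_header(header_b)
    return [key for key in a if a[key] != b[key]]


if __name__ == "__main__":
    import sys

    # Usage: python compare_headers.py first.header.txt second.header.txt
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        print(differing_fields(fa.read(), fb.read()))
```

If `reference` or `commands` shows up in the differences, the two files were probably not aligned in the same way; if only `read_groups` differs while `samples` matches, the files may hold different read groups of the same sample, but that is something to confirm from the headers themselves rather than assume.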

Kevin
