Question: duplicate UUIDs, yet unique legacy UUIDs (and count data) in TCGA data
david.peeney wrote (16 months ago):

Hi

I have recount2 data for the TCGA breast RNA-seq samples. The recount2 file IDs are TCGA legacy UUIDs. When I convert these legacy UUIDs to harmonized UUIDs, 5 aliquot UUIDs (and their aliquot barcodes) each appear more than once, even after excluding FFPE samples.

For example (legacy UUID > harmonized aliquot UUID > aliquot barcode):

a907f2d1-92ad-4a1b-b439-20e5a7347d5b > 2a4747b5-1eeb-45b1-9e92-0e0e3d7a9c1b > TCGA-A7-A26F-01A-21R-A169-07

eb068925-2dcc-4e18-838f-903ac8d2b661 > 2a4747b5-1eeb-45b1-9e92-0e0e3d7a9c1b > TCGA-A7-A26F-01A-21R-A169-07

Each of these 5 duplicated harmonized UUIDs corresponds to legacy entries with different FASTQ files and different count data in the legacy archive. Does anybody have recommendations on how to handle these, and any idea why the same aliquot may have been analyzed twice?
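
For readers who want to reproduce the check, here is a minimal sketch of how such duplicates could be found once the legacy-to-harmonized mapping is in hand. The `rows` list and the function name `find_duplicate_aliquots` are illustrative assumptions; the two rows are the example above, and a real table would hold one row per recount2 sample.

```python
from collections import defaultdict

# Mapping rows: (legacy UUID, harmonized aliquot UUID, aliquot barcode).
# Only the two example rows from the post are included here.
rows = [
    ("a907f2d1-92ad-4a1b-b439-20e5a7347d5b",
     "2a4747b5-1eeb-45b1-9e92-0e0e3d7a9c1b",
     "TCGA-A7-A26F-01A-21R-A169-07"),
    ("eb068925-2dcc-4e18-838f-903ac8d2b661",
     "2a4747b5-1eeb-45b1-9e92-0e0e3d7a9c1b",
     "TCGA-A7-A26F-01A-21R-A169-07"),
]


def find_duplicate_aliquots(rows):
    """Return {harmonized UUID: [legacy UUIDs]} for aliquots seen more than once."""
    by_aliquot = defaultdict(list)
    for legacy_uuid, aliquot_uuid, _barcode in rows:
        by_aliquot[aliquot_uuid].append(legacy_uuid)
    # Keep only harmonized UUIDs that more than one legacy UUID maps to.
    return {k: v for k, v in by_aliquot.items() if len(v) > 1}


duplicates = find_duplicate_aliquots(rows)
for aliquot_uuid, legacy_uuids in duplicates.items():
    print(aliquot_uuid, "<-", ", ".join(legacy_uuids))
```

Grouping by the harmonized UUID rather than dropping duplicates keeps every legacy UUID visible, so the separate count columns can still be compared before deciding which to keep.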

rna-seq gdc tcga • 366 views
Kevin Blighe (University College London) wrote (16 months ago):

It is difficult to know without speaking to the people who were actually involved in sequencing this particular patient's samples. As far as I can see, the sequencing was performed at UNC. The most likely explanation is that they simply had money left to spend toward the end of the project and decided to sequence whatever else they could. Many (or all?) funding bodies prefer that you spend all of the money they have invested in you.

My preference would be to include them all in your analysis and check what happens when you, for example, perform PCA and generate a bi-plot. If the duplicates line up on top of each other in the plot space, then you can justify keeping one or all of them. If they do not group together, then there is an issue.
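
As a quick numeric complement to the PCA plot (not a replacement for it), one could also check that the two runs of the same aliquot correlate with each other far better than with an unrelated sample. This is only a sketch: the counts below are invented for illustration and are not TCGA values, and the sample names and helper functions are hypothetical.

```python
import math


def log2p1(counts):
    """log2(count + 1) transform to dampen the effect of highly expressed genes."""
    return [math.log2(c + 1) for c in counts]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy per-gene counts, made up for this example (one value per gene).
counts = {
    "aliquot_X_run1": [100, 250, 5, 980, 40, 0],
    "aliquot_X_run2": [110, 240, 7, 1010, 35, 1],
    "other_sample": [20, 900, 300, 15, 400, 60],
}
logged = {name: log2p1(values) for name, values in counts.items()}

r_replicates = pearson(logged["aliquot_X_run1"], logged["aliquot_X_run2"])
r_unrelated = pearson(logged["aliquot_X_run1"], logged["other_sample"])
print(f"same aliquot: r = {r_replicates:.3f}")
print(f"unrelated:    r = {r_unrelated:.3f}")
```

If the duplicated runs do not correlate clearly better with each other than with other samples, that points to the same kind of issue as duplicates that fail to group together on the bi-plot.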

There are many cases like this in the TCGA. Most simply require an 'executive' decision from you as the analyst as you work through everything (and obviously you should make a note of it).
