Question: TCGA Filename Breakdown
2.1 years ago, stanley.ju wrote:

What do the different components of a TCGA data filename mean?

For example, here's one file I was looking at: HORNS_p_TCGA_b110_113_SNP_N_GenomeWideSNP_6_C10_772388.grch38.seg.v2.txt

Some parts are self-explanatory: this comes from a genome-wide SNP array, and I assume the 6 refers to the Affymetrix 6.0 platform. But what does "HORNS" mean? And "b110"? And the rest?


Where did you get this file from? Also, I think b110 is incomplete on its own: it should be read together with the 113 that follows, as b110_113.

Reply written 2.1 years ago by RamRS

I see--so b110_113 is some sort of sample marker?

This particular file came from the TCGA Data Portal --> Uterine Corpus Endometrial Carcinoma --> Copy Number Variation. I downloaded all of the CNV data for endometrial carcinoma straight from the web (the files are pretty small), and this was just the first file in the archive.

Reply written 2.1 years ago by stanley.ju

This paper might help, though I don't know how useful it is for deciphering TCGA filenames.

https://link.springer.com/protocol/10.1007/978-1-4939-3578-9_6

Reply written 2.1 years ago by RamRS

HORNS could be code for an institute of sample origin or something like that; I wouldn't worry too much about it.

Reply written 2.1 years ago by RamRS
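
For anyone who wants to pull these pieces apart programmatically, here is a minimal Python sketch that splits the example filename from the question into its underscore- and dot-delimited parts. The field names (prefix, batch, array, position, numeric_id, suffixes) are hypothetical labels chosen for illustration, not official TCGA terminology, and the pattern is only known to fit the one filename quoted in this thread. It keeps b110_113 together as a single token, as suggested above, and does not try to say what HORNS stands for.

```python
import re

# Pattern fitted to the single example filename from this thread.
# All group names are hypothetical labels, not official TCGA field names.
FILENAME_PATTERN = re.compile(
    r"^(?P<prefix>[A-Za-z0-9]+)_p_TCGA_"   # e.g. HORNS (meaning not established here)
    r"(?P<batch>b\d+(?:_\d+)*)_"           # e.g. b110_113, kept as one token
    r"SNP_N_"
    r"(?P<array>GenomeWideSNP_\d+)_"       # e.g. GenomeWideSNP_6 (the SNP array)
    r"(?P<position>[A-Z]\d{2})_"           # e.g. C10 (meaning assumed, not confirmed)
    r"(?P<numeric_id>\d+)"                 # e.g. 772388 (meaning unknown)
    r"\.(?P<suffixes>.+)$"                 # e.g. grch38.seg.v2.txt
)


def parse_filename(name):
    """Return a dict of filename parts, or None if the pattern does not fit."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    # Split the dotted tail (e.g. grch38, seg, v2, txt) into a list.
    parts["suffixes"] = parts["suffixes"].split(".")
    return parts


example = "HORNS_p_TCGA_b110_113_SNP_N_GenomeWideSNP_6_C10_772388.grch38.seg.v2.txt"
for key, value in parse_filename(example).items():
    print(f"{key}: {value}")
```

Filenames from other TCGA archives may not follow this layout, in which case parse_filename returns None rather than guessing.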