Hello everyone,
I downloaded bladder cancer data from TCGA. I extracted the sample IDs with this code:
head(Blca_res$id)
output: 'a8c61671-89cb-43bc-8c88-5c107954d11c,''b03b7b9b-00ef-4e0d-bac2-0b1059d57a87,''bf98764d-1604-4a14-8e06-1c785a085db9,''c0bc697a-ac64-4605-9abc-f0fe85eb481a,''bd52f6c8-6f8b-4056-8a3e-8cdc96644952,''ab504dbf-e1f0-46d2-83f9-0f4066055c71'
I wrote this to get the same from the clinical data:
head(tcgaBlca_data@colData$sample_id)
output: 'f9bd70b2-6cde-48e5-9f0d-55d86ccfeba8,''3cae49a3-6deb-40f9-84cc-68b9b53543ff,''015e6b08-ab3c-4d1d-99e4-77b5e10bd7fc,''f09e1eeb-bcd5-4dba-92f0-7d4b34b81ce7,''0ac8e522-3c64-42f2-a66f-bd40530a328a,''3c71158d-98ff-4ef5-923f-ba31a25036ec'.
There are more than 60,000 rows with these sample IDs. What I want to find out is whether each sample ID in Blca_res$id is also present in tcgaBlca_data@colData$sample_id. For example, is 'a8c61671-89cb-43bc-8c88-5c107954d11c' from Blca_res$id also in tcgaBlca_data@colData$sample_id?
Any suggestions on how I can implement this in a few lines of R code?
Regards,
Is the format of your head output correct? Do the sample IDs actually have commas in the string?

No, there are no commas, only a dot like this: . But I have sorted it out using a more readable column in the data.
Thanks
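
For anyone landing here with the same question, the membership check asked about above can be sketched in base R with %in%, setdiff(), intersect() and match(). This is only a minimal sketch: res_ids and clin_ids are toy vectors standing in for Blca_res$id and tcgaBlca_data@colData$sample_id, and the overlap between them is made up for illustration.

```r
# Toy stand-ins (assumption): replace with Blca_res$id and
# tcgaBlca_data@colData$sample_id when running on the real data.
res_ids <- c("a8c61671-89cb-43bc-8c88-5c107954d11c",
             "b03b7b9b-00ef-4e0d-bac2-0b1059d57a87",
             "bf98764d-1604-4a14-8e06-1c785a085db9")
clin_ids <- c("f9bd70b2-6cde-48e5-9f0d-55d86ccfeba8",
              "a8c61671-89cb-43bc-8c88-5c107954d11c",
              "bf98764d-1604-4a14-8e06-1c785a085db9")

# One TRUE/FALSE per element of res_ids: is it present in clin_ids?
in_clin <- res_ids %in% clin_ids

# Are all of them present? How many are and are not?
all_present <- all(in_clin)
counts <- table(in_clin)

# IDs found only in res_ids, and IDs shared by both vectors
missing_ids <- setdiff(res_ids, clin_ids)
shared_ids <- intersect(res_ids, clin_ids)

# Position of each res_ids element in clin_ids (NA where absent),
# useful for reordering the clinical table to match the results
idx <- match(res_ids, clin_ids)

print(in_clin)
print(missing_ids)
print(idx)
```

If the real IDs carry stray whitespace or differ in letter case, normalising both vectors first (for example with trimws() and tolower()) may be needed before comparing.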