5.2 years ago
Z.0121
Hi, I am new to bioinformatics. Could you tell me how to download TCGA-BLCA RNA-seq RSEM and RPKM data using R, the TCGA Data Portal, or any other effective way?
Thank you!
I have already used TCGAbiolinks to download the data. But when I run the GDCprepare command, I get an error like this:
Could you please tell me what I did wrong?
Thank you!
Errors with TCGAbiolinks should be posted on the GitHub page as an 'Issue'. There are a few bugs in the program that are going unfixed.
For the RSEM and RPKM data with paired normal and tumor tissue, is the command below correct? Do I need any other filters?
query.blca.trans.pro <- GDCquery(project = "TCGA-BLCA", data.category = "Transcriptome Profiling", data.type = "Gene Expression Quantification", experimental.strategy = "RNA-Seq")
In addition, I hope you can check whether this is the right method for downloading the RSEM and RPKM data with paired normal and tumor tissue. Thank you!
Did you get this from a tutorial?
I wrote this code by referring to this website, but it does not specify which arguments are necessary for the RSEM and RPKM data with paired normal and tumor tissue.
https://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/download_prepare.html#examples
I cannot be 100% certain for third-party packages like TCGAbiolinks. You should really contact the developers to have 100% certainty about what you are downloading. If you take a look at the issues page, however, you can see that there are many unresolved issues: https://github.com/BioinformaticsFMRP/TCGAbiolinks/issues
This is why I never use these third-party programs. If I need TCGA data, I obtain it directly from the Genomic Data Commons Data Portal: https://portal.gdc.cancer.gov/
Thank you so much for your help!
You are welcome. I know that working with the TCGA data is not easy, so please ask further questions if you want. If I were you, I would obtain the data directly from the Data Portal, as I mentioned earlier (see above). You can then post a new question if you require more help.
I have already tried downloading the data from the TCGA Data Portal, but I don't know how to import the files into R. Do you have any suggestions? Thank you in advance!
Which files have you obtained?
I got the Transcriptome Profiling RNA-Seq data, with files named like this: 2d6fc33e-c553-427e-9e1c-8008f694b0ce.htseq.counts; 5b39b3d7-2aa0-4dc8-b814-53e42fcc86fb.FPKM-UQ.txt; b142177f-f89e-4e5a-834b-a75e7ab0b618.FPKM.txt; 16c74720-66a9-4c6b-9450-3d12a28ca214.htseq.counts.
Thank you in advance!
The files with htseq.counts in their name contain raw counts, one file per sample. You should be able to bind these together into a single data-frame in R, with one row per gene and one column per sample. You can then input that to edgeR or DESeq2 for normalisation.
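The binding step could look roughly like the sketch below. It assumes each htseq.counts file is tab-separated with no header and two columns (gene ID, raw count), and that HTSeq summary rows such as __no_feature start with a double underscore; please check your own files before relying on this. The helper names read_htseq and bind_htseq are made up for illustration, and the two small files written to a temporary folder are toy data so that the example runs on its own.

```r
# Read one HTSeq count file. Assumed layout: tab-separated, no header,
# column 1 = gene ID, column 2 = raw count.
read_htseq <- function(path) {
  x <- read.delim(path, header = FALSE, col.names = c("gene", "count"),
                  stringsAsFactors = FALSE)
  # Drop HTSeq summary rows such as __no_feature and __ambiguous
  x <- x[!grepl("^__", x$gene), ]
  setNames(as.numeric(x$count), x$gene)
}

# Bind several count files into one genes x samples matrix.
bind_htseq <- function(paths) {
  counts <- lapply(paths, read_htseq)
  genes <- names(counts[[1]])
  # Every file must list the same genes, otherwise columns would not line up
  same_genes <- vapply(counts, function(v) setequal(names(v), genes), logical(1))
  stopifnot(all(same_genes))
  # Reorder each sample to the gene order of the first file, then bind columns
  mat <- do.call(cbind, lapply(counts, function(v) v[genes]))
  # Use the file name (minus the suffix) as the sample/column name
  colnames(mat) <- sub("\\.htseq\\.counts(\\.gz)?$", "", basename(paths))
  mat
}

# Toy data: two tiny files that mimic the assumed format
demo_dir <- tempfile("htseq_demo")
dir.create(demo_dir)
writeLines(c("ENSG00000000003.13\t10", "ENSG00000000005.5\t0", "__no_feature\t7"),
           file.path(demo_dir, "sampleA.htseq.counts"))
writeLines(c("ENSG00000000003.13\t25", "ENSG00000000005.5\t3", "__no_feature\t9"),
           file.path(demo_dir, "sampleB.htseq.counts"))

# For real data, point list.files() at the folder holding your downloads
files <- list.files(demo_dir, pattern = "\\.htseq\\.counts(\\.gz)?$",
                    full.names = TRUE)
count_matrix <- bind_htseq(files)
print(count_matrix)
```

The resulting matrix of raw counts is the kind of input that DESeqDataSetFromMatrix() in DESeq2 or DGEList() in edgeR expects. The FPKM and FPKM-UQ files are already normalised, so they should not be used as input to those functions. With real downloads the column names will be the file UUIDs, so you will still need to map them to tumor and normal samples using the sample information from the Data Portal.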