Obtaining BAM files from ICGC or TCGA
8.1 years ago
H.Hasani ▴ 990

Dear Biostars,

This might sound like an old, obvious, or well-known question, but for me it is still open.

I'm trying to get the BAM files for paired-sample cancer data (normal/tumor, DNA/RNA) from the ICGC project in order to test my methods. Unfortunately, I always end up empty-handed because of access-rights issues, while others seem to have no trouble. The propaganda says the data is now available to the public, something I have not been able to verify so far!

Could someone PLEASE explain the problem and what I should do to get the "open access" data?

Thank you very much in advance!

RNA-Seq SNP genome • 6.1k views

I only see protected samples in the ICGC portal data repositories (no links for BAM files). Can you post an example of an open data link?

These are the final results of their analysis releases

I checked a few projects from the current release. As far as I can see, there are only TSV files, no BAM files.

Yes! That's exactly my problem, and that's what I meant by "propaganda".

Have you tried to get them from CGHub?

I'm checking it now!

8.1 years ago
ivivek_ngs ★ 5.2k

You do not have access to BAM files from TCGA; take a look at this link here. However, some data is available in CGHub; see this link, which explains how to use it to download BAM files. You need to obtain access before you can download the BAMs from there.

I will report something that I noticed in my answer. The link I posted does not exist anymore.

8.1 years ago
Danielk ▴ 640

@h.hasani,

The reason you need approval to access the BAMs is that they are considered personal data much like names, social security numbers and addresses. Therefore, you need to apply for access with information on what you will do with the data, and local data handling policies.

This applies to the vast majority of human genome sequencing and GWAS analysis. Genetic data == personal data. However, completely free access is given to, for example, gene expression measurements (again, not the BAMs, but the expression values) and somatic mutations, since they cannot be used to identify individual patients. You can, for example, check cbioportal.org or tumorportal.org for some of the TCGA data.
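To make the distinction concrete, here is a minimal Python sketch of the two access tiers described above. The tier labels, the data-type names, and the `needs_approval` helper are all hypothetical and chosen only for illustration; they are not part of any TCGA or ICGC API.

```python
# Illustrative only: the names below are assumptions, not TCGA/ICGC identifiers.
CONTROLLED = "controlled"  # access requires an approved application
OPEN = "open"              # freely downloadable

# Mapping taken from the explanation above: raw reads are controlled,
# processed values that cannot identify a patient are open.
ACCESS_TIER = {
    "bam": CONTROLLED,
    "expression_values": OPEN,
    "somatic_mutations": OPEN,
}


def needs_approval(data_type: str) -> bool:
    """Return True if the given data type sits in the controlled tier."""
    try:
        return ACCESS_TIER[data_type] == CONTROLLED
    except KeyError:
        # Unknown types are rejected rather than guessed at.
        raise ValueError(f"unknown data type: {data_type!r}")


if __name__ == "__main__":
    for kind in ACCESS_TIER:
        print(kind, "needs approval:", needs_approval(kind))
```

Under this sketch, anything derived directly from a patient's reads falls in the controlled tier, which is why the portals expose TSV summaries but no BAM links.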

It is not "propaganda", but simply abiding by local laws. My personal opinion is that the work that TCGA and ICGC have done in making it possible to access both the processed and raw data (with approval) is quite amazing for the research community. All cred to them for their hard work.

I am not affiliated with TCGA or ICGC but have previously applied for access as described.

Daniel

Thanks! The concept of a privacy policy is indeed important.

Genetic data does not necessarily reveal personal data; the most you might get is age, sex and race, which are already given. I would argue that names and the really personal information you've mentioned should not be passed on, even under the mentioned access policy!

Still, if you have a BAM file with id XX123 or id YY254, what difference does it make to you? If methods are your concern, having the final results of somebody else's work will not help you, and in this case ICGC would have no advantage.

Just to reply to your comment: if one has access to the controlled data, then that person should be able to get hold of the RNA-Seq BAM files for specific studies, right?
