Question: RNA-Seq raw fastq files from TCGA
11 months ago by
United States
oriolebaltimore30 wrote:

Dear group,

I am looking for raw FASTQ files for TCGA RNA-Seq data. As far as I can tell, the BAM files were made using only reads that map to known genes. I would like FASTQ files that were not filtered in any way, i.e. not restricted to reads mapping to known genes.

I have access to Level 1 data through an approved protocol.

Thanks, Adrian.

rna-seq tcga fastq • 1.6k views
written 11 months ago by oriolebaltimore30

Sorry, I forgot to add: is it possible to get raw FASTQ files from TCGA? Thanks.

written 11 months ago by oriolebaltimore30

Are you sure about that? You can see the command line used to map the reads in the BAM header. Nowhere do I see anything that suggests that only reads mapping to known genes were kept. Only known genes were used when quantifying, but that's a different thing.
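For anyone who wants to check this on their own files, here is a minimal sketch that pulls the recorded command lines out of a BAM header. It assumes samtools is installed and on your PATH, and that the aligner wrote a `CL` field into its `@PG` line (most do, but it is optional in the SAM spec). The function names are just for illustration.

```python
import subprocess


def program_command_lines(header_text):
    """Return (program ID, command line) pairs from the @PG lines of a SAM header."""
    commands = []
    for line in header_text.splitlines():
        if not line.startswith("@PG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs. Split on the first
        # colon only, because command lines often contain colons themselves.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        commands.append((fields.get("ID", "?"), fields.get("CL", "")))
    return commands


def read_bam_header(bam_path):
    """Fetch the header text with `samtools view -H` (requires samtools on PATH)."""
    result = subprocess.run(
        ["samtools", "view", "-H", bam_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `program_command_lines(read_bam_header("sample.bam"))` (with `sample.bam` standing in for one of your files) should list each program that touched the file together with the options it was run with, which is where any read filtering would show up.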

FASTQ files only exist in the TCGA legacy archive, which I don't think contains everything.

written 11 months ago by i.sudbery1.7k

That's correct. You just need to convert the BAM files available through GDC back to raw reads.

Here's an example pipeline: https://github.com/mforde84/TCGA-BRCA-RNAseq-realignment-pipeline

Also, from experience, converting BAM to FASTQ is a bottleneck. Picard has an option for this, but it's really slow. The scripts linked above use a custom tool called fasty to do this; however, I couldn't locate my source code for it. Instead, you could use something like the following, which should be as fast: https://github.com/arq5x/bedtools2/blob/master/src/bamToFastq/bamToFastq.cpp
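To make clear what any BAM-to-FASTQ converter has to do per read, here is a rough Python sketch working on SAM text lines (e.g. the output of `samtools view`). It is not how fasty or bamToFastq are implemented, just an illustration of the logic, and it will be far slower than a compiled tool on real TCGA files. It assumes every primary record carries its full sequence and qualities (no hard clipping, no `*` placeholders), and the function name is made up.

```python
# Translation table for complementing bases, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(sam_line):
    """Convert one SAM alignment line to a FASTQ record, or None if it is skipped."""
    fields = sam_line.rstrip("\n").split("\t")
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]

    # Skip secondary (0x100) and supplementary (0x800) records so that each
    # read is written only once.
    if flag & 0x900:
        return None

    # Reads aligned to the reverse strand are stored reverse-complemented;
    # undo that to recover the read as it came off the sequencer.
    if flag & 0x10:
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]

    # Label mates of a pair: 0x40 is first in pair, 0x80 is second.
    if flag & 0x40:
        name += "/1"
    elif flag & 0x80:
        name += "/2"

    return f"@{name}\n{seq}\n+\n{qual}\n"
```

Note that unmapped reads are kept by this logic, which is the point if you want everything that was sequenced. For paired-end data, most converters also want the BAM sorted by read name first so that mates end up in matching order.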

modified 11 months ago • written 11 months ago by mforde841.0k

The legacy archive does contain all FASTQ files for the RNA-Seq data. They are in TARGZ format.

Link to GDC Legacy Archive
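Once one of those archives is downloaded, a quick way to see which FASTQ files it holds without unpacking it is Python's standard `tarfile` module. This is only a sketch: the file suffixes below are a guess at how the reads are named inside the archive, so adjust them to what you actually find.

```python
import tarfile

# Assumed naming of read files inside the archive; adjust as needed.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def list_fastq_members(archive_path):
    """Return the names of FASTQ-like files in a .tar.gz archive without extracting it."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.lower().endswith(FASTQ_SUFFIXES)
        ]
```

For paired-end libraries you would expect two entries per sample here.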

modified 11 months ago • written 11 months ago by nwon20