Question: NIH Roadmap Epigenomics data BAM files
Saad Khan310 wrote:

Hi, I was wondering if anyone has access to the BAM files of the ChIP-seq data from the NIH Roadmap project (http://egg2.wustl.edu/roadmap/web_portal/processed_data.html#ChipSeq_DNaseSeq).

Only BED or tagAlign files are available there. If anyone has converted those tagAlign files to BAM files and can give me access to them, kindly let me know.

modified 2.5 years ago by Denise - Open Targets4.7k • written 2.5 years ago by Saad Khan310
Denise - Open Targets4.7k wrote:

It seems the BAMs are available from the NIH Roadmap Epigenomics Project Data Listings page. The original paper also points to the sequencing data available from the European Nucleotide Archive (ENA) under study accession PRJEB4795.

written 2.5 years ago by Denise - Open Targets4.7k

BAM files are only available for some tissues in the ChIP-seq data. For the others, only SRA and BED files are available. I don't want to start from scratch with the SRA files and run the whole pipeline again, so I was wondering whether someone has already done it or has successfully converted the tagAlign or BED files to BAM.

The tagAlign files don't carry much information about mapping quality etc., but the BED files seem to have some numbers along with the read ID. I don't know what those numbers are, but if anybody does, please let me know. The BED files for the unconsolidated epigenomes are available and look something like this:

    chr1 10084 10283 62BU8AAXX110111:4:104:1337:6620 0 -
    chr1 12881 13080 62BU8AAXX110111:4:72:15560:1099 0 -
    chr1 16276 16475 62BU8AAXX110111:4:23:16833:6138 0 -
    chr1 48005 48204 62BU8AAXX110111:4:108:5179:18053 0 -

I am trying to do some comparisons with other data using csaw. Unfortunately, csaw only takes BAM files as input.

With bedToBam I need to specify a mapping quality as well, so I am confused about what to do. Should I just use the consolidated tagAlign files, which look like the example below, and give each read a mapping quality > 10 with bedToBam?

    chr1 10153 10189 N 1000 -
    chr1 10154 10190 N 1000 +
    chr1 10156 10192 N 1000 -
    chr1 10156 10192 N 1000 -

Can someone let me know, please?

modified 2.5 years ago • written 2.5 years ago by Saad Khan310
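
On the question of what the numbers mean: in the standard BED6 layout, and in the ENCODE tagAlign layout that follows it, column 5 is a score field (0 to 1000) and column 6 is the strand, so the `0` and `1000` values above are most likely scores rather than mapping qualities. A minimal shell sketch for checking whether that column varies at all is below. The file name `sample.tagAlign` and the helper `score_table` are hypothetical, and the rows are copied from the post.

```shell
# Write a small tab-delimited sample (rows taken from the post above).
printf 'chr1\t10153\t10189\tN\t1000\t-\n'  > sample.tagAlign
printf 'chr1\t10154\t10190\tN\t1000\t+\n' >> sample.tagAlign
printf 'chr1\t10156\t10192\tN\t1000\t-\n' >> sample.tagAlign
printf 'chr1\t10156\t10192\tN\t1000\t-\n' >> sample.tagAlign

# Tabulate column 5 (the score field): prints "<score><TAB><count>" per value.
# A single distinct value means the column carries no per-read information.
score_table() {
    awk '{ n[$5]++ } END { for (s in n) print s "\t" n[s] }' "$1" | sort -k1,1n
}

score_table sample.tagAlign
```

If only one value shows up for a whole file, nothing is lost by assigning a single constant mapping quality during conversion.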

I converted the BED files to BAM files. You don't have to supply a mapping quality when converting BED files to BAM files if you are using bedtobam from the bedtools package. Here is how you might convert an hg19 dataset:

bedtools bedtobam -i $sortedBedFile -g hg19 > $bamFile

Also, if you are interested in the quality of the dataset, use phantompeakqualtools, which has been used extensively in the ENCODE project.

modified 2.5 years ago • written 2.5 years ago by Ar800
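
A slightly fuller sketch of the conversion, under some assumptions that are not in the replies above: the input is a tagAlign file whose name column is always `N`, so reads are renamed to keep them distinguishable; `-g` is given a two-column chromosome-sizes file (the name `hg19.chrom.sizes` is hypothetical); and a MAPQ of 30 is chosen arbitrarily through the `-mapq` option of `bedtools bedtobam`, whose default is 255. The helper `name_and_sort` and all file names are made up for illustration.

```shell
# Hypothetical inputs: a two-read tagAlign sample and a genome file
# listing chromosome name and length (chr1 length is the hg19 value).
in_tagalign="sample_reads.tagAlign"
genome="hg19.chrom.sizes"
printf 'chr1\t10154\t10190\tN\t1000\t+\n'  > "$in_tagalign"
printf 'chr1\t10153\t10189\tN\t1000\t-\n' >> "$in_tagalign"
printf 'chr1\t249250621\n' > "$genome"

# Replace the "N" in column 4 with a unique read name (read_1, read_2, ...),
# then sort by chromosome and start position.
name_and_sort() {
    awk 'BEGIN { OFS = "\t" } { $4 = "read_" NR; print }' "$1" | sort -k1,1 -k2,2n
}

name_and_sort "$in_tagalign" > sample_reads.sorted.bed

# Convert only if bedtools is installed. -mapq assigns the same mapping
# quality to every BAM record; 30 is an arbitrary illustrative value.
if command -v bedtools >/dev/null 2>&1; then
    bedtools bedtobam -i sample_reads.sorted.bed -g "$genome" -mapq 30 > sample_reads.bam
fi
```

Because every record receives the same value, any constant above the minimum mapping quality used downstream should behave the same. csaw also appears to expect coordinate-sorted, indexed BAM files, so a `samtools sort` and `samtools index` step would probably be needed afterwards.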

Hi, I was hoping you could help. I'd like to know whether you solved this by going ahead and converting the tagAlign files to BAM files, eventually found the original BAMs, or did something else entirely?

written 9 months ago by epaminonda10