Question

Has anybody converted exome BAM files to .tdf files for visualization in IGV?

1


9.8 years ago

ivivek_ngs ★ 5.2k

Dear All,

I am trying to view my exome BAM files in IGV, but the files are huge (20 GB each), which is a big constraint on viewing them in IGV. I learned that for visualizing large datasets it is useful to convert them to count format, preferably .tdf. The documentation says this is usually done for RNA-Seq and ChIP-Seq; now I want to use it for exome-seq as well. My data is paired-end with 100 bp reads, and the coverage is 70X for the samples. How should I set the -e parameter when generating the .tdf file for my samples?

igvtools count -z 5 -w 25 -e 250 input.bam out.bam.tdf hg19

Can anyone give me suggestions?

sequencing alignment next-gen SNP • 7.5k views

updated 2.5 years ago by Ram 43k • written 9.8 years ago by ivivek_ngs ★ 5.2k

0


Have you tried just sorting and indexing? There's usually no need to make a tdf file.
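For anyone unsure what "sorting and indexing" looks like in practice, here is a minimal sketch using samtools. The function name and file names are placeholders of mine, not something from the original post, and the function only prints the commands (a dry run) so you can check them before running anything on a 20 GB file.

```shell
#!/bin/sh
# Print the samtools commands needed to coordinate-sort and index a BAM
# so that IGV can load it region by region.
# Assumption: sort_and_index_cmds and the *.sorted.bam naming are illustrative only.
sort_and_index_cmds() {
    in_bam=$1
    sorted_bam=${in_bam%.bam}.sorted.bam
    # Coordinate sort (the default sort order for `samtools sort`)
    printf 'samtools sort -o %s %s\n' "$sorted_bam" "$in_bam"
    # Build the .bai index next to the sorted BAM
    printf 'samtools index %s\n' "$sorted_bam"
}

sort_and_index_cmds input.bam
```

If the printed commands look right, pipe the output to `sh` to execute them. IGV then reads only the region on screen from the indexed BAM.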

9.8 years ago by Devon Ryan 104k

score 1 · Answer 1 · 2017-06-22

Basically, -e extends each read by the given number of bases before counting, which adds coverage beyond what was actually sequenced; that is annoying and can be misleading. If the sequencing is paired-end, you'll see both reads, so there's no real reason to extend the coverage past what's actually seen in the reads.

I would say just set it to 0 for all applications.

Go ahead and make three TDF files, with -e of 0, 100, and 200, and you'll see what I'm talking about.
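A quick way to set up that comparison is a small loop. This is only a sketch: it reuses the -z, -w, and hg19 settings from the command in the question, the function name and output file names are made up for illustration, and it prints the commands instead of running them.

```shell
#!/bin/sh
# Print one `igvtools count` command per extension factor so the resulting
# TDF tracks can be loaded side by side in IGV.
# Assumption: tdf_compare_cmds and the out.e<N>.tdf names are illustrative only.
tdf_compare_cmds() {
    in_bam=$1
    genome=$2
    for ext in 0 100 200; do
        printf 'igvtools count -z 5 -w 25 -e %s %s out.e%s.tdf %s\n' \
            "$ext" "$in_bam" "$ext" "$genome"
    done
}

tdf_compare_cmds input.bam hg19
```

Pipe the output to `sh` to actually generate the three files, then load them together in IGV over a few exons and compare the coverage at the target edges.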

What's more important/tricky is marking duplicates before you make the TDF. You should mark them (with Picard MarkDuplicates, not samtools) for WES, WGS, and ChIP-seq, and should NOT mark them for RNA-seq and amplicon sequencing.
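For the duplicate-marking step, a minimal sketch of the Picard call might look like the following. The location of picard.jar, the function name, and the output file names are assumptions for illustration; the input is expected to be a sorted BAM. As above, the function only prints the command so it can be inspected first.

```shell
#!/bin/sh
# Print a Picard MarkDuplicates command for a sorted BAM.
# Assumptions: markdup_cmd is a hypothetical helper, picard.jar is in the
# current directory unless a path is given as the second argument, and the
# *.markdup.bam / *.markdup_metrics.txt names are placeholders.
markdup_cmd() {
    in_bam=$1
    picard_jar=${2:-picard.jar}
    out_bam=${in_bam%.bam}.markdup.bam
    metrics=${in_bam%.bam}.markdup_metrics.txt
    printf 'java -jar %s MarkDuplicates I=%s O=%s M=%s\n' \
        "$picard_jar" "$in_bam" "$out_bam" "$metrics"
}

markdup_cmd input.sorted.bam
```

The marked BAM is then the one to feed into igvtools count. As far as I know, igvtools count skips reads flagged as duplicates unless --includeDuplicates is given, but that is worth checking against the documentation for your version.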

Ram · Answer 2 · 2014-06-27

0


9.8 years ago

Martombo ★ 3.1k

The manual states that the -e option should be set to the average fragment length of the library minus the average read length.
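As a worked example of that rule, here is a small sketch of the arithmetic. The numbers are the ones mentioned in this thread (100 bp reads and an assumed average fragment length of about 350 bp); the function name is made up, and clamping negative results to 0 is my own choice, not something from the manual.

```shell
#!/bin/sh
# Compute an extension factor as (average fragment length - average read length).
# Assumption: ext_factor is a hypothetical helper; a negative difference is
# clamped to 0 here as a design choice of this sketch.
ext_factor() {
    frag_len=$1
    read_len=$2
    ext=$((frag_len - read_len))
    if [ "$ext" -lt 0 ]; then
        ext=0
    fi
    echo "$ext"
}

ext_factor 350 100   # prints 250
```

With those inputs the result matches the -e 250 used in the command in the question.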

updated 2.5 years ago by Ram 43k • written 9.8 years ago by Martombo ★ 3.1k

0


Yes, I already did that, but it is usually used for RNA-Seq and ChIP-Seq studies and I did not find anything on exome-seq studies. I am currently using the command listed in my question, since my read length is 100 bp and I am assuming an average library fragment length of around 350 bp (for Illumina HiSeq the library is usually between 250 and 500 bp). Let's see what results come up. Thanks.

9.8 years ago by ivivek_ngs ★ 5.2k

0


You can probably tune this parameter to get a different resolution for the counts. If you want a higher resolution you can decrease the value, though this increases the file size.

updated 2.5 years ago by Ram 43k • written 9.8 years ago by Martombo ★ 3.1k