Has anybody converted exome BAM files to .tdf files for visualization in IGV?
8.6 years ago
ivivek_ngs ★ 5.2k

Dear All,

I am trying to view my exome BAM files in IGV, but the files are very large (about 20 GB each), which makes them hard to load. I have read that for large datasets it is useful to convert them to a count format, preferably .tdf. The documentation says this is usually done for RNA-Seq and ChIP-Seq, but I want to use it for exome-seq as well. My data is paired-end with 100 bp reads, and the coverage is 70X for the samples. How should I set the -e parameter when generating the .tdf file for my samples? The command I have is:

igvtools count -z 5 -w 25 -e 250 input.bam out.bam.tdf hg19


Can anyone give me suggestions?

sequencing alignment next-gen SNP • 6.8k views

Have you tried just sorting and indexing? There's usually no need to make a tdf file.
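A minimal sketch of the sort-and-index route, assuming samtools 1.x is installed and on the PATH. The file name input.bam comes from the question; the helper names and the .sorted.bam suffix are hypothetical.

```shell
# Derive the output name: input.bam -> input.sorted.bam (naming is an assumption).
sorted_name() {
  echo "${1%.bam}.sorted.bam"
}

# Coordinate-sort the BAM, then build the index that IGV needs next to it.
prepare_bam() {
  out_bam=$(sorted_name "$1")
  samtools sort -o "$out_bam" "$1" && samtools index "$out_bam"
}

# Only run when samtools and the input file are actually present.
if command -v samtools >/dev/null 2>&1 && [ -f input.bam ]; then
  prepare_bam input.bam
fi
```

IGV can then load input.sorted.bam directly, as long as the index file sits in the same directory.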

5.6 years ago
predeus ★ 1.8k

Basically, -e extends each read by the given number of bases before counting, so it adds coverage beyond what was actually sequenced, which is annoying and can be misleading. If the sequencing is paired-end, you'll see both reads, so there's no real reason to extend the coverage past what's actually seen in the reads.

I would say just set it to 0 for all applications.

Go ahead and make three TDF files, with -e of 0, 100, and 200, and you'll see what I'm talking about.
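The comparison can be sketched as a small dry-run loop. It reuses the -z, -w, input name, and genome from the command in the question; the output names (out.e0.tdf and so on) and the helper tdf_cmd are hypothetical. It only prints the commands, so nothing runs until you decide to execute them.

```shell
bam="input.bam"   # input from the question
genome="hg19"     # genome from the question

# Print the igvtools command for one extension factor ($1, in bp).
tdf_cmd() {
  echo "igvtools count -z 5 -w 25 -e $1 $bam out.e$1.tdf $genome"
}

# One command per extension factor suggested above.
for ext in 0 100 200; do
  tdf_cmd "$ext"
done
```

If the printed commands look right, piping the loop output to `sh` would run them one after another. Loading the three resulting tracks side by side in IGV should make the effect of -e visible.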

What's more important/tricky is marking duplicates before you make the TDF. You should mark them (with Picard MarkDuplicates, not samtools) for WES, WGS, and ChIP-seq, and should NOT mark them for RNA-seq and amplicon sequencing.
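A rough sketch of the duplicate-marking step, assuming a Picard jar at a hypothetical path (picard.jar) and the classic I=/O=/M= argument style. The output file names and the helper markdup_cmd are made up for illustration. The helper only prints the command so you can check it before running.

```shell
picard_jar="picard.jar"   # hypothetical path; point this at your Picard install

# Print a MarkDuplicates command for one BAM ($1).
# Picard expects sorted input, so sort the BAM first.
markdup_cmd() {
  base="${1%.bam}"
  echo "java -jar $picard_jar MarkDuplicates I=$1 O=$base.markdup.bam M=$base.dup_metrics.txt"
}

markdup_cmd input.bam
```

The marked BAM (here input.markdup.bam) would then be the input to igvtools count instead of the original file.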

8.6 years ago
Martombo ★ 3.0k

The manual states that the -e option should be set to the average fragment length of the library minus the average read length.


Yes, I already did that, but it is usually used for RNA-Seq and ChIP-Seq studies, and I did not find anything on exome-seq studies. I am currently using the command listed above, since my read length is 100 bp and I am assuming an average library fragment length of around 350 bp (for Illumina HiSeq the library is usually between 250 and 500 bp), which gives -e 250. Let's see what the results look like. Thanks.
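The arithmetic behind that choice (average fragment length minus read length) can be written out as a short sketch. The 350 bp fragment length is the poster's assumption, not a measured value, and the variable names are hypothetical.

```shell
fragment_len=350   # assumed average library fragment length in bp
read_len=100       # read length in bp, from the question

# Extension factor = average fragment length - average read length.
ext=$(( fragment_len - read_len ))

# Print the resulting command rather than running it.
echo "igvtools count -z 5 -w 25 -e $ext input.bam out.bam.tdf hg19"
```

With a measured insert size, replacing fragment_len would give a value of -e that matches the actual library.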


You can probably tune this parameter to get a different resolution for the counts. If you want higher resolution, you can decrease the value, at the cost of a larger file.