Has anybody converted exome BAM files to .tdf files for visualization in IGV?
8.6 years ago
ivivek_ngs ★ 5.2k

Dear All,

I am trying to view my exome BAM files in IGV, but the files are very large (20 GB each), which makes them hard to load. I have read that for visualizing large datasets it helps to convert them to a count format, preferably .tdf. The documentation says this is usually done for RNA-Seq and ChIP-Seq, and I now want to use it for exome-seq as well. My data is paired-end with 100 bp reads, and the coverage is 70X for the samples. How should I set the -e parameter when generating the .tdf file for my samples?

igvtools count -z 5 -w 25 -e 250 input.bam out.bam.tdf hg19

Can anyone give me suggestions?

sequencing alignment next-gen SNP • 6.8k views

Have you tried just sorting and indexing? There's usually no need to make a tdf file.
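For reference, sorting and indexing could look roughly like the sketch below. The file name input.bam is taken from the question; the output name is only an illustrative choice, and the block does nothing if samtools or the BAM file is missing.

```shell
bam=input.bam                       # file name taken from the question
sorted="${bam%.bam}.sorted.bam"     # illustrative output name: input.sorted.bam

if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
    samtools sort -o "$sorted" "$bam"   # coordinate-sort the alignments
    samtools index "$sorted"            # write the .bai index next to the sorted BAM
else
    echo "samtools or $bam not found; nothing done"
fi
```

IGV needs the index file to sit next to the sorted BAM so it can load only the region in view.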

5.6 years ago
predeus ★ 1.8k

Basically -e adds extra coverage to your reads, which is annoying and can be misleading. If the sequencing is paired-end, you'll see both reads, so there's no real reason to extend the coverage past what's actually seen in the reads.

I would say just set it to 0 for all applications.

Go ahead and make three TDF files, with -e of 0, 100, and 200, you'll see what I'm talking about.
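A minimal sketch of that comparison, reusing the -z and -w values, file name, and genome from the command in the question. The output names and the `run` helper are hypothetical; by default the helper only prints each command, and setting DRY_RUN=0 would actually execute igvtools.

```shell
bam=input.bam     # sorted, indexed BAM (name from the question)
genome=hg19       # genome ID from the question

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# One TDF per extension value, so the tracks can be loaded side by side in IGV.
for ext in 0 100 200; do
    run igvtools count -z 5 -w 25 -e "$ext" "$bam" "out.e${ext}.tdf" "$genome"
done
```

Loading the three resulting tracks over the same exon should show how the coverage profile widens as -e grows.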

What's more important/tricky is marking duplicates before you make the TDF. You should mark them (with Picard MarkDuplicates, not samtools) for WES, WGS, and ChIP-seq, and should NOT mark them for RNA-seq and amplicon sequencing.
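A rough sketch of the duplicate-marking step for WES data, assuming a coordinate-sorted BAM and a local picard.jar. The jar path and all file names here are placeholders, not something given in this thread, and the block does nothing if the jar or the BAM is missing.

```shell
PICARD_JAR=${PICARD_JAR:-picard.jar}             # assumed location of the Picard jar
in_bam=input.sorted.bam                          # assumed coordinate-sorted input
out_bam="${in_bam%.bam}.markdup.bam"             # input.sorted.markdup.bam
metrics="${in_bam%.bam}.markdup_metrics.txt"     # duplication metrics report

if [ -f "$PICARD_JAR" ] && [ -f "$in_bam" ]; then
    # Flags duplicates in the output BAM; reads are marked, not removed.
    java -jar "$PICARD_JAR" MarkDuplicates I="$in_bam" O="$out_bam" M="$metrics"
    samtools index "$out_bam"
else
    echo "picard.jar or $in_bam not found; nothing done"
fi
```

As far as I know, igvtools count skips reads flagged as duplicates unless --includeDuplicates is passed, so the TDF would then be built from the marked BAM.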

8.6 years ago
Martombo ★ 3.0k

In the manual, it states that the -e option should be set to the average fragment length of the library minus the average read length.


Yes, I already did that, but it is usually applied to RNA-Seq and ChIP-Seq studies and I did not find anything on exome-seq studies. I am currently using the command listed above, since I know my read length is 100 bp and I am assuming an average library fragment length of around 350 bp, because Illumina HiSeq libraries are usually between 250 and 500 bp. Let's see what the results look like. Thanks.
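The arithmetic behind the -e 250 in that command can be sketched as follows. The 350 bp fragment length is the assumption stated in the comment, not a measured value; the block only prints the resulting command.

```shell
read_len=100        # read length from the question (100 bp reads)
fragment_len=350    # ASSUMED average fragment length, not measured from the data

# Manual's rule of thumb: extension = average fragment length - average read length
ext=$((fragment_len - read_len))

echo "igvtools count -z 5 -w 25 -e ${ext} input.bam out.bam.tdf hg19"
```

If the real insert size distribution is available (for example from alignment metrics), that measured mean would be a better input than the 350 bp guess.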


You can probably tune this parameter to get a different resolution for the counts. If you want a higher resolution you can decrease the value, at the cost of a larger file.
