Wiggle Files, BAM/SAM Help
3.7 years ago

Hey, I am looking for some help with a generic problem I have been having.

I have some biological questions that require me to explore the counts per base in RNA-seq data. A lab partner originally generated a wiggle file using an output option of the STAR aligner, but I have been trying to find a way to extract the same information from the BAM file that STAR produced.

I have looked through the internet for BAM to wiggle suggestions but have yet to find one that actually makes the same wiggle file STAR outputs.

I started exploring other solutions, like BAM to bedGraph with step and bin sizes of 1. The closest I have gotten to a solution is deepTools, which has a function that lets you read BAM files. However, I run into a similar problem: the counts are not the "same" as in the wiggle generated by STAR. The profiles have roughly the same shape, but the values are slightly different. I have tried filtering on different SAM flags and nothing has really worked.

I'm not necessarily looking for a specific solution. I'm just curious whether anyone has tried doing this sort of thing before and could shed some light on their method, or whether anyone can suggest why the wiggle counts from the STAR output and the counts pulled directly from the BAM file are slightly different (other than some SAM flag filter that may be hidden in STAR, or loss of information in the compression of the BAM file).
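To make the possible sources of difference concrete, here is a minimal sketch of how per-base counts can be derived from alignment records. It is not STAR's or deepTools' actual implementation. The function name `per_base_coverage`, the `(start, cigar, nh)` tuple layout, and the toy reads are all made up for illustration. The sketch assumes that only aligned bases (CIGAR `M`, `=`, `X`) count towards coverage, that spliced gaps (`N`) and deletions (`D`) do not, and that multimappers can be recognised by an `NH` value above 1. Tools differ on exactly these choices, which alone can shift the values while keeping the overall shape.

```python
import re
from collections import Counter

# Matches one CIGAR operation, e.g. "76M" or "1200N".
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def per_base_coverage(alignments, unique_only=False):
    """Count reads covering each reference position.

    alignments: iterable of (start, cigar, nh) tuples, where start is the
    1-based leftmost position and nh is the number of reported hits
    (hypothetical layout, chosen for this sketch).
    """
    cov = Counter()
    for start, cigar, nh in alignments:
        # Assumption: multimappers are identified by NH > 1.
        if unique_only and nh != 1:
            continue
        pos = start
        for length, op in CIGAR_RE.findall(cigar):
            length = int(length)
            if op in "M=X":
                # Aligned bases: each position gets one count.
                for p in range(pos, pos + length):
                    cov[p] += 1
                pos += length
            elif op in "DN":
                # Deletion or spliced gap: advance without counting.
                pos += length
            # I, S, H and P do not consume reference positions.
    return cov


# Toy data: one plain read, one spliced read, one multimapper.
reads = [
    (1, "5M", 1),      # covers positions 1-5
    (3, "2M3N2M", 1),  # covers 3-4 and 8-9, skips the gap at 5-7
    (4, "4M", 2),      # multimapper covering 4-7
]

all_cov = per_base_coverage(reads)
uniq_cov = per_base_coverage(reads, unique_only=True)
```

With these toy reads, position 4 has a count of 3 when all reads are used and 2 when only unique reads are kept, so comparing a track built from all alignments against one built from unique alignments would show the "same shape, different values" pattern described above.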

I haven't done a lot of work with BAM files before so any suggestions would be appreciated.

BAM SAM Wiggle • 1.2k views
3.7 years ago
jkkbuddika ▴ 190

Hi, I have used the deepTools bamCoverage function to generate .bw or .bedgraph output files from .bam files produced by STAR. I usually run a command like this:

bamCoverage --normalizeUsing CPM --binSize 1 -b /path/to/input.bam --outFileFormat bigwig -o /path/to/output.bw

Assuming you are getting a similar read distribution, one likely reason for the different values is the normalization mode that you and your colleague used. You could try the other normalization modes in deepTools, such as RPKM or BPM (CPM is probably the best), and see whether you get any closer.
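A quick way to check whether normalization is the whole story is to compare the two tracks position by position. The sketch below is only an illustration with made-up numbers; `cpm` and `scale_factors` are hypothetical helper names, not deepTools or STAR functions. It assumes CPM means raw count times one million divided by the total number of mapped reads. If the ratio between the two tracks is the same at every covered position, they differ only by a scaling factor. If the ratio varies, something else (for example read filtering) is involved. It may also be worth checking in the STAR manual how its wiggle output is normalized (I believe this is the `--outWigNorm` option).

```python
def cpm(raw_counts, total_mapped_reads):
    """Scale raw per-base counts to counts per million mapped reads."""
    scale = 1_000_000 / total_mapped_reads
    return [c * scale for c in raw_counts]


def scale_factors(track_a, track_b):
    """Ratio a/b at every position where both tracks are non-zero."""
    return [a / b for a, b in zip(track_a, track_b) if a and b]


# Made-up per-base counts and library size, for illustration only.
raw = [0, 2, 4, 4, 1]
normalized = cpm(raw, total_mapped_reads=2_000_000)

# A constant ratio means the tracks differ only by normalization.
ratios = scale_factors(normalized, raw)
is_pure_scaling = max(ratios) - min(ratios) < 1e-9
```

In practice the two lists would come from the STAR wiggle and the bamCoverage output at the same positions; here `is_pure_scaling` is true because the second track was derived from the first by a single factor.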
