Question: Bam-readcount output empty from certain regions (RNA seq bams)
0
gravatar for nils.rudqvist
5 months ago by
nils.rudqvist20 wrote:

Hi all,

I am using bam-readcount to find reads in a RNA seq bam at the location of variants found using RNA seq. However, I am not able to get the right output from all regions of my bam-file. For example:

I am working with mm10, and I use the following command:

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam [chr:start-stop]

1) For my C>T mutation at chromosome 4:152297391 it all seem to work fine.

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:152297391-152297391
Minimum mapping quality is set to 1
WARNING: In read ST-K00128:117:HG5FKBBXX:6:1124:3336:48843: Couldn't find single-end mapping quality. Check to see if the SM tag is in BAM.
The previous warning has been emitted 1 times and will be disabled.
WARNING: In read ST-K00128:117:HG5FKBBXX:6:1124:3336:48843: Couldn't find number of mismatches. Check to see if the NM tag is in BAM.
The previous warning has been emitted 1 times and will be disabled.
4   152297391   G   7   =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  G:7:255.00:39.71:0.00:4:3:0.35:0.00:11.71:4:0.39:100.43:0.54    T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00

2) This is a location that I have 0 reads for in my RNA seq file (chr4:147058427, A>G):

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:147058427-147058427
Minimum mapping quality is set to 1
4   147058427   C   0   =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00

3) Chromosome 4: 123189043 has a C>T mutation, however, bam-readcount will return nothing at all on this location:

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:123189043-123189043
Minimum mapping quality is set to 1

How come i sometimes get 0 reads and sometimes empty output? I want to make sure that when I have a NA output, I really do not have any reads for this location? Is this a problem with my reference fasta file?

Thanks, Nils

rna-seq • 305 views
ADD COMMENTlink modified 5 months ago by Chris Miller18k • written 5 months ago by nils.rudqvist20
2
gravatar for nils.rudqvist
5 months ago by
nils.rudqvist20 wrote:

So, I think I have figured it out by using IGV.

1) When i have reads that with good mapping quality and base quality, then I get these with bam-readcount.

2) When I have reads but these are not of good quality bam-readcount returns zeros.

3) When I do not have any reads at all for a certain location, bam-readcount returns nothing.

Is this a correct interpretation?

Thanks, Nils

ADD COMMENTlink modified 5 months ago • written 5 months ago by nils.rudqvist20
0
gravatar for Chris Miller
5 months ago by
Chris Miller18k
Washington University in St. Louis, MO
Chris Miller18k wrote:

Yes, if there is no coverage of your variant, then bam-readcount will return nothing for that site. This includes sites for which there are no reads that meet your filtering criteria.

ADD COMMENTlink written 5 months ago by Chris Miller18k

Hi, thanks for your response.

Perhaps I missunderstand you when you say bam-readcount will return nothing for sites that do not pass my filtering criteria. How about this site? No reads here match my filtering criteria, but I still get 0s:

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:147058427-147058427
Minimum mapping quality is set to 1
4   147058427   C   0   =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00
ADD REPLYlink modified 5 months ago • written 5 months ago by nils.rudqvist20

The details are fuzzy in my mind at the moment, but I suspect there is a difference between sites with coverage, where all reads fail quality checks (0) vs sites without any coverage (not output).

ADD REPLYlink written 5 months ago by Chris Miller18k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 908 users visited in the last hour