keep read association with mpileup
4 months ago
gernophil ▴ 80

Hey,

is there a way to keep all bases that come from the same read in one vertical column when using mpileup? That is, to turn this

chr2    96  C   5   .,.,,
chr2    97  A   4   .,.,
chr2    98  C   6   .,.,,,
chr2    99  C   8   .,.,,,..
chr2    00  A   9   .,,,..,,.
chr2    01  C   5   ..,,.
chr2    02  C   5   ,.,,,
chr2    03  T   4   .,,,

into this (made-up example):


chr2    96  C   5     .,.,,  
chr2    97  A   4     .,. ,  
chr2    98  C   6   .,.,, ,  
chr2    99  C   8   .,.,,,.. 
chr2    00  A   9   .,,,..,,.
chr2    01  C   5   ..,    ,.
chr2    02  C   5   ,.,    ,,
chr2    03  T   4   .,,    ,

The goal is to make it look a bit like IGV. I know this would be impractical for large regions, but I am only looking at a very small region of 20 bp, or 30 bp at most.

EDIT: To make it clearer, let's assume this is our reference sequence:

CGATGCTAGC

And these are our NGS reads:

CGATG
GATGCT
AXGCX
GCTAG
TAGC

These should be mapped like this:

CGATGCTAGC

CGATG
 GATGCT
  AXGCX
    GCTAG
      TAGC

and default mpileup would look like this:

1   C   .
2   G   ..
3   A   ...
4   T   ..X
5   G   ....
6   C   ...
7   T   .X..
8   A   ..
9   G   ..
10  C   .

Here you cannot see that both mismatches (X) come from the same read, which is why I want it to look like this:

1   C   .
2   G   ..
3   A   ...
4   T   ..X
5   G   ....
6   C    ...
7   T    .X..
8   A      ..
9   G      ..
10  C       .

Here you can see that both mismatches come from the same read, because they are at the same position in the mpileup string.

mpileup samtools • 468 views

Can you explain the changes you've made in your made up example? I cannot understand your requirement.


I’ll try :). I introduced some spaces so that the bases belonging to the same read are always at the same position in the mpileup string; if a read does not cover a position, its slot is a space. I want to get an overview of the general sequencing quality around one position, so that if there’s a rare mutation I can check whether the whole read is bad or whether this is likely a real mutation with very low AF. I’m on my mobile right now and I will try to make a figure to illustrate this better.
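To illustrate the kind of check described above, here is a minimal Python sketch. It assumes you already have a read-aligned pileup, i.e. one string per reference position in which character i always belongs to read i, a space means "not covered", and `.` or `,` mean "matches the reference". The function name `mismatches_per_read` is made up for this example and is not part of samtools.

```python
def mismatches_per_read(pileup_columns):
    """Count covered and mismatching positions for every read column.

    pileup_columns: list of strings, one per reference position.
    Character i of each string belongs to read i (hypothetical layout);
    ' ' = read does not cover the position, '.' or ',' = match.
    Returns a list of (covered, mismatched) tuples, one per read.
    """
    n_reads = max((len(col) for col in pileup_columns), default=0)
    covered = [0] * n_reads
    mismatched = [0] * n_reads
    for col in pileup_columns:
        for i, ch in enumerate(col):
            if ch == " ":
                continue  # this read does not cover this position
            covered[i] += 1
            if ch not in ".,":
                mismatched[i] += 1  # anything else is treated as a mismatch
    return list(zip(covered, mismatched))


# Read-aligned pileup strings from the toy example in the question
columns = [".", "..", "...", "..X", "....", " ...", " .X..", "   ..", "   ..", "    ."]
for read_index, (cov, mm) in enumerate(mismatches_per_read(columns), start=1):
    print(f"read {read_index}: {mm} mismatches over {cov} covered positions")
```

A read with several mismatches over a short span (read 3 in the toy example) is a candidate for a low-quality read rather than a genuine low-AF variant, although base and mapping qualities would still need to be looked at.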


OK, I added some text and I hope that makes it clearer :).


Thank you, your requirement is clear now. I don't know much in this domain, but I have a feeling that you may have to write a custom script, which is not a big deal. You'll merely need to create a matrix from the read-based pileup, replace matches with dots and mismatches with X/,/whatever, then transpose the matrix. I'll try and create an R example.
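The matrix-and-transpose idea can be sketched in Python for the toy example from the question. This is only an illustration under strong assumptions: reads are given as hypothetical `(start, sequence)` pairs with 0-based start positions, and alignments are gapless (no indels, clipping, or strand information). Real BAM records would first have to be converted into that form. The function name `read_aligned_pileup` is invented for this sketch.

```python
def read_aligned_pileup(reference, reads):
    """Build a pileup in which every read keeps its own column.

    reference: reference sequence as a string.
    reads: list of (start, sequence) tuples, 0-based start, gapless
           alignment assumed (an assumption of this sketch).
    Returns one tab-separated line per reference position.
    """
    # Step 1: matrix with one row per read and one column per position.
    matrix = []
    for start, seq in reads:
        row = [" "] * len(reference)
        for offset, base in enumerate(seq):
            pos = start + offset
            # Matches become dots, mismatches keep the read base.
            row[pos] = "." if base == reference[pos] else base
        matrix.append(row)

    # Step 2: transpose so each row is a position and each column a read.
    lines = []
    for pos, column in enumerate(zip(*matrix)):
        bases = "".join(column).rstrip()  # drop trailing padding only
        lines.append(f"{pos + 1}\t{reference[pos]}\t{bases}")
    return lines


reference = "CGATGCTAGC"
reads = [(0, "CGATG"), (1, "GATGCT"), (2, "AXGCX"), (4, "GCTAG"), (6, "TAGC")]
for line in read_aligned_pileup(reference, reads):
    print(line)
```

Each read gets its own column here, so the width grows with the number of reads. Reusing a column once a read has ended, as IGV does and as the first made-up example suggests, would need an extra packing step that this sketch leaves out.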

EDIT: The script is more challenging than I thought it would be, especially since I am starting from the alignment shown in your example rather than from your actual starting point. Please wait for others to weigh in; they may have better options for you.
