Question: How minimap2 mapping works
SUDOsundu (0) wrote 21 days ago:

Hello,

I mapped trimmed nanopore reads to a single reference FASTA file and got good coverage depth. I then mapped the same reads to a file containing multiple FASTA sequences, including the one I had mapped to previously. When I visualized the BAM files in UGENE and Tablet, I found more depth in the first BAM file than in the second. Why is the coverage different between the two BAM files?

ugene minimap2 • 110 views
GenoMax (95k), United States, wrote 21 days ago:

If you widen the search space, it is not surprising that the overall depth of alignment went down for the smaller of the two references. Some reads must be aligning better to the additional sequences. This is why one should not use a reduced reference if the data come from a whole genome.
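
The effect described above can be illustrated with a toy model. This is only a sketch, not how minimap2 scores alignments: it assigns each read to whichever reference shares the most k-mers with it, as a stand-in for choosing a primary alignment. The sequences, the names `DNA_A` and `DNA_B`, and the helper functions are all made up for illustration.

```python
from collections import Counter

def kmers(seq, k=5):
    """Return the set of k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def assign_reads(reads, references, k=5):
    """Assign each read to the reference sharing the most k-mers with it.

    Toy stand-in for picking one primary alignment per read; a real
    aligner's scoring is considerably more involved.
    """
    ref_kmers = {name: kmers(seq, k) for name, seq in references.items()}
    counts = Counter()
    for read in reads:
        read_kmers = kmers(read, k)
        scores = {name: len(read_kmers & rk) for name, rk in ref_kmers.items()}
        best = max(scores, key=scores.get)
        if scores[best] > 0:  # reads sharing no k-mer stay unassigned
            counts[best] += 1
    return counts

# Two hypothetical components that share a common region.
COMMON = "ACGTTGCAAGGCT"
DNA_A = COMMON + "TTTTGACCAGTA"
DNA_B = COMMON + "GGGACTCATCGA"

# Two reads drawn from each component, all overlapping the common region.
reads = [DNA_A[3:20], DNA_A[6:23], DNA_B[3:20], DNA_B[6:23]]

only_a = assign_reads(reads, {"DNA_A": DNA_A})
both = assign_reads(reads, {"DNA_A": DNA_A, "DNA_B": DNA_B})
print(only_a)  # all 4 reads land on DNA_A
print(both)    # 2 reads on DNA_A, 2 on DNA_B
```

With only `DNA_A` as reference, the reads from `DNA_B` still find a partial match through the common region and inflate its depth. Once `DNA_B` is also available, they move to their better match.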


Thanks. I am aligning to a plant virus genome. It is multi-component, e.g. three FASTA sequences (DNA A, B, and C), each 4 to 5 kb in size. The components also share common regions. Should I align to each one individually?

written 21 days ago by SUDOsundu (0)

Hi, if you post a question, you should above all give enough detail for others to understand your experiment. It is unclear what you even sequenced, so it is hard to give advice. In general, it is good practice to align to all sequences that your reads could possibly come from.

modified 21 days ago • written 21 days ago by ATpoint (44k)

Do you want reads to match the location they are most similar to, or do you want to ensure that they only match one organism (reference)? Choose depending on your goals.

written 20 days ago by Istvan Albert ♦♦ (86k)

Sorry @ATpoint, I was in a hurry. My aim is to characterize the virus genome from an infected plant. I sequenced the infected plant genome. It is a known virus, but no reference sequence is available for this particular subgroup. It is a tripartite virus with three different DNA components (DNA A, DNA B, and DNA C), each about 3 kb long. I am trying to generate a consensus by aligning to the available sequences in that subgroup, so I pulled the available full-length sequences of the virus (20 sequences) from NCBI and mapped against them.

From the above comments I understand that I have widened the search space, so the depth for each reference went down. The three components share common regions. To get the consensus sequence, should I align to a single reference file containing the three FASTA sequences, or should I map to each one separately? As I understand it, mapping individually may give more depth, but since the three DNAs share common regions, I need to map the long reads against all three to get the best mapping. Please correct me if I am wrong.
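
One way to compare the two strategies is to tally primary alignments per reference from the SAM output of each run (minimap2 writes SAM when run with `-a`). The sketch below uses only the standard SAM columns and flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary). It also counts primary alignments with MAPQ 0, which usually indicates that the read could be placed about equally well elsewhere, as may happen in the common regions. The function name and the example records are made up for illustration.

```python
from collections import Counter

UNMAPPED, SECONDARY, SUPPLEMENTARY = 0x4, 0x100, 0x800

def primary_counts(sam_lines):
    """Count primary alignments per reference and how many have MAPQ 0."""
    per_ref = Counter()
    mapq0 = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        # Keep only mapped, primary records so each read is counted once.
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        per_ref[rname] += 1
        if mapq == 0:
            mapq0[rname] += 1
    return per_ref, mapq0

# Hypothetical records: r2 has an equally good secondary hit on DNA_B.
example_sam = [
    "@SQ\tSN:DNA_A\tLN:3000",
    "@SQ\tSN:DNA_B\tLN:3000",
    "r1\t0\tDNA_A\t1\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t0\tDNA_A\t50\t0\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t256\tDNA_B\t50\t0\t10M\t*\t0\t0\t*\t*",
    "r3\t16\tDNA_B\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
]

per_ref, mapq0 = primary_counts(example_sam)
print(per_ref)  # primary alignments per reference
print(mapq0)    # of those, how many are ambiguous (MAPQ 0)
```

Running this on the SAM file from the combined reference and on the files from the individual mappings would show how many reads moved between components, and how many placements in the combined run are ambiguous.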

written 20 days ago by SUDOsundu (0)