Question: How minimap2 mapping works
21 days ago
SUDOsundu wrote:


I mapped the trimmed nanopore reads to a single reference FASTA file and got higher coverage depth. I then mapped the same reads to a file that contains multiple FASTA sequences, including the sequence I mapped to previously. When I visualized the BAM files in UGENE and Tablet, I found more depth in the first BAM file than in the second. Why does the coverage differ between the two BAM files?

written 21 days ago by SUDOsundu
21 days ago
GenoMax wrote:

If you are widening the search space, then it is not surprising that the overall depth of alignment went down for the smaller of the two references. Some reads must be aligning better to the additional sequences. This is the reason why one should not use a reduced reference if the data is from a whole genome.
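A toy sketch of this effect, under loud assumptions: the similarity scores, the reference names `DNA_A` and `DNA_B`, and the `min_score` cutoff are all made up for illustration, and this is not how minimap2 scores or chains alignments internally. It only shows that when each read's primary alignment goes to its best-scoring reference, adding references can pull reads away from the original one.

```python
def primary_counts(read_scores, references, min_score=0.5):
    """Assign each read to its best-scoring reference among `references`.

    read_scores: list of dicts mapping reference name -> hypothetical score.
    Returns a dict of reference name -> number of reads assigned to it.
    """
    counts = {ref: 0 for ref in references}
    for scores in read_scores:
        # Only references present in this run can receive the read.
        candidates = {ref: scores.get(ref, 0.0) for ref in references}
        best_ref = max(candidates, key=candidates.get)
        # Reads below the cutoff stay unmapped.
        if candidates[best_ref] >= min_score:
            counts[best_ref] += 1
    return counts


# Hypothetical scores for four reads against two references.
reads = [
    {"DNA_A": 0.95, "DNA_B": 0.60},
    {"DNA_A": 0.70, "DNA_B": 0.92},
    {"DNA_A": 0.65, "DNA_B": 0.90},
    {"DNA_A": 0.90, "DNA_B": 0.10},
]

single = primary_counts(reads, ["DNA_A"])
multi = primary_counts(reads, ["DNA_A", "DNA_B"])
print(single)  # {'DNA_A': 4}
print(multi)   # {'DNA_A': 2, 'DNA_B': 2}
```

With only `DNA_A` available, all four reads land on it. Once `DNA_B` is added, two of them find a better match there, so the depth on `DNA_A` drops.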

written 21 days ago by GenoMax

Thanks. I am aligning to a plant virus genome. It is multi-component, e.g. 3 FASTA sequences: DNA A, B, and C. Each component is 4 to 5 kb in size. The components also have common regions among them. Should I align to each one individually?

written 21 days ago by SUDOsundu

Hi, if you post a question, you should most importantly give enough details for others to understand your experiment. It is unclear what you even sequenced, so it is hard to give advice. In general, it is good practice to align to all sequences that your reads could possibly come from.

written 21 days ago by ATpoint

Do you want reads to match the location that they are most similar to, or do you want to ensure that they only match one organism (reference)? Choose depending on the goals you have.
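One way to see which situation you are in is to tally where the primary alignments actually land. Below is a minimal sketch, assuming the alignments are available as SAM text lines (for example from `samtools view -h`). The function name `tally_primary` and the example records are made up for illustration. It relies only on the standard SAM columns (FLAG, RNAME, MAPQ) and the standard flag bits for unmapped (0x4), secondary (0x100), and supplementary (0x800) records.

```python
from collections import Counter


def tally_primary(sam_lines):
    """Count primary alignments per reference, and how many have MAPQ 0."""
    total = Counter()
    mapq0 = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (they start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        # Ignore unmapped, secondary, and supplementary records.
        if flag & (0x4 | 0x100 | 0x800):
            continue
        total[rname] += 1
        if mapq == 0:
            mapq0[rname] += 1
    return total, mapq0


# Made-up SAM records, for illustration only.
example = [
    "@SQ\tSN:DNA_A\tLN:3000",
    "read1\t0\tDNA_A\t1\t60\t100M\t*\t0\t0\t*\t*",
    "read2\t16\tDNA_B\t5\t0\t100M\t*\t0\t0\t*\t*",
    "read2\t256\tDNA_A\t5\t0\t100M\t*\t0\t0\t*\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]

total, mapq0 = tally_primary(example)
print(dict(total))  # {'DNA_A': 1, 'DNA_B': 1}
print(dict(mapq0))  # {'DNA_B': 1}
```

A large share of MAPQ 0 reads on a reference usually suggests that the aligner could place those reads about equally well somewhere else, which is what you would expect in regions shared between references.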

written 20 days ago by Istvan Albert

Sorry @ATpoint, I was in a hurry. My aim is to characterize the virus genome from an infected plant. I sequenced the infected plant genome. It is a known virus, but a reference sequence is not available for this particular subgroup. It is a tripartite virus with 3 different DNA components of 3 kb length each, e.g. DNA A, DNA B, and DNA C. I am trying to generate a consensus by aligning to the available sequences in that subgroup, so I pulled the available full-length sequences of the virus (20 sequences) from NCBI and mapped against them. From the above comments I understand that I have widened the search space, so the depth for each reference went down. The three components have common regions. If I need to get the consensus sequence, should I align to a single reference file containing the 3 FASTA sequences, or should I map to each one separately? From what I understood, if I map to them individually I may get more depth. Since the 3 DNAs have common regions, I need to map the long reads against all 3 to get the best mapping. Please correct me if I am wrong.

written 20 days ago by SUDOsundu
