Question

Run metrics of Oxford Nanopore reads and visualisation of alignment

0

4 months ago

giulia.trauzzi ▴ 10

Hi everyone,

I have started working with Nanopore reads from the PromethION. The data I am working on is the output of the Lambda Control Run. I have a total of 70 GB of data in FASTQ format.

I aligned my data to the lambda genome with minimap2, then created a BAM file and sorted and indexed it with samtools. I would like to visualise this alignment but I am a bit lost. I generated the .bai index and I am planning to use IGV locally, but I am finding it very difficult to download my 70 GB of data every time to visualise it. Is there any other way to do so?

I have also used samtools tview to look at the alignment but I am struggling to understand the output (example below).

    1                            11                              21
    GGGC*G*G**C****G*****A***C***C***T****C**G*C***G**G*G*T****T*T***T***C****G***C*
    .... . .  .    .     .   .   .   .    .  . .   .  . . .    . .   .   .    .   .
    ....*.*.**.****.*****.***.***.***.****.**.*.***.**.*.*.****.*.***.***.****.***.*
    ....*.*.**.****.*****.***.***.***.****G**.*.***.**.*.*.****.*.***.***.****.***.*

I have also tried to output the depth and coverage of my alignment, but samtools depth does not work. I am focussing on coverage and I used bedtools (my command is below).

    genomeCoverageBed -ibam ../samtools/align_lam_sorted.bam > coverage.txt

Again, I am struggling to understand the output (below).

    chrL    123190  1       48502   2.06177e-05
    chrL    719814  1       48502   2.06177e-05
    chrL    804868  1       48502   2.06177e-05

I am starting to think that I may be using the wrong bedtools tool. Can anyone help me with this? Pointers to papers or reviews would also be welcome.

Thank you very much,

Happy New Year,

Giulia

Nanopore alignment sequencing • 505 views

score 2 · Answer 1 · 2023-12-26

2

4 months ago

GenoMax 141k

"The data I am working on is the output of the Lambda Control Run."

and

"but I am finding it very difficult to download my 70 GB of data every time to visualise it."

If you only have lambda DNA then there is likely no point in trying to visualize the full dataset in IGV. With a 48.5 kb genome (and a 70 GB dataset) you have a monstrous amount of coverage that IGV will struggle to display. If you must see the alignment, you could downsample the alignment, or align only a small fraction of the original reads, to create a small alignment file that is easy to manage.
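To make the downsampling concrete, here is a minimal sketch, assuming samtools is on your PATH and the input BAM is already coordinate-sorted. The function name `subsample_bam` and the file names in the example call are placeholders, not something from this thread. With `samtools view -s`, the integer part of the value is the random seed and the part after the decimal point is the fraction of reads to keep, so `42.0001` keeps roughly 0.01% of the reads with seed 42. Subsampling this way only filters records and keeps them in their original order, so the result can be indexed directly.

```shell
# Keep a random fraction of reads from a sorted BAM, then index the result.
# Usage: subsample_bam IN.bam OUT.bam SEED.FRACTION
subsample_bam() {
    in_bam=$1
    out_bam=$2
    seed_frac=$3
    # -b writes BAM, -s SEED.FRACTION subsamples, -o names the output file
    samtools view -b -s "$seed_frac" -o "$out_bam" "$in_bam" || return 1
    # creates the .bai index alongside the subsampled BAM
    samtools index "$out_bam"
}

# Example call (hypothetical output name, fraction chosen for illustration):
# subsample_bam align_lam_sorted.bam align_lam_sub.bam 42.0001
```

Download the small BAM together with its `.bai` file and keep them in the same folder when loading into IGV. The fraction is something to tune: with coverage this deep, even a very small fraction may leave plenty of reads.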

mosdepth is an easy tool for getting genome-wide coverage stats from BAM files.
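As a side note on the genomeCoverageBed output in the question: with a BAM input and no other options, bedtools reports a histogram. The five columns are chromosome, depth, number of bases at that depth, chromosome length, and the fraction of the chromosome at that depth. So `chrL 123190 1 48502 2.06177e-05` says that one base out of 48,502 is covered 123,190 times. Below is a rough sketch for collapsing that histogram into a mean depth per chromosome. It assumes that column layout, and `mean_depth` is a hypothetical helper name used only for illustration.

```shell
# Mean depth per chromosome from a genomeCoverageBed histogram file.
# Assumed columns: chrom, depth, bases at that depth, chrom length, fraction.
# Usage: mean_depth coverage.txt
mean_depth() {
    awk '
        # skip the whole-genome summary rows that bedtools appends
        $1 != "genome" {
            sum[$1]  += $2 * $3   # depth multiplied by bases at that depth
            size[$1]  = $4        # chromosome length
        }
        END {
            for (c in sum) printf "%s\t%.2f\n", c, sum[c] / size[c]
        }
    ' "$1"
}
```

Calling `mean_depth coverage.txt` on the file from the question would print one line for `chrL` with its mean depth.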

0

Hello GenoMax,

thanks for your very quick reply.

I suppose you are right. I was mostly trying to do this for learning/training purposes as my lab is planning to do WGS on human samples in the future and I had never worked with mapped data before.

That's very good advice. I subsampled my BAM file by percentage with samtools, but I am struggling to create an index for it, so I cannot visualise it in IGV. Is there a way to create an index for my randomly subsampled BAM file?

Thank you. All the best,

Giulia

ADD REPLY • link 4 months ago by giulia.trauzzi ▴ 10