Question

Detail with minimap2 output

4.8 years ago

Rogerio Ribeiro ▴ 110

Hi Biostars.

I have a small question I hope someone can help me with. I have 12 ONT samples from my species and I'm going to use them to improve the annotation of the genome.

I concatenated all the reads into one big file (all_reads_nano.fastq) and I'm currently aligning them with minimap2. My commands are as follows:

minimap2 -k 14 -I 1000G -d cro_v2_asm.mmi cro_v2_asm.fasta
minimap2 -t 8 -ax splice cro_v2_asm.mmi all_reads_nano.fastq > all_reads.sam

I have used minimap2 before, but only to align single files or to compare transcript databases. My output looks like this:

[WARNING] Indexing parameters (-k, -w or -H) overridden by parameters used in the prebuilt index.
[M::main::0.990*1.00] loaded/built the index for 2090 target sequence(s)
[M::mm_mapopt_update::1.323*1.00] mid_occ = 477
[M::mm_idx_stat] kmer size: 14; skip: 10; is_hpc: 0; #seq: 2090
[M::mm_idx_stat::1.517*1.00] distinct minimizers: 21032401 (46.44% are singletons); average occurrences: 4.535; average spacing: 5.673

[M::worker_pipeline::667.809*7.91] mapped 926625 sequences
[M::worker_pipeline::1358.631*7.93] mapped 1054148 sequences
[M::worker_pipeline::1946.186*7.93] mapped 979868 sequences
[M::worker_pipeline::2521.346*7.94] mapped 990987 sequences
[M::worker_pipeline::3107.039*7.94] mapped 953722 sequences
[M::worker_pipeline::3740.257*7.94] mapped 976724 sequences
[M::worker_pipeline::4417.527*7.94] mapped 1133642 sequences
[M::worker_pipeline::5062.460*7.94] mapped 1034305 sequences
[M::worker_pipeline::5811.450*7.94] mapped 1164408 sequences
[M::worker_pipeline::6558.900*7.94] mapped 1139990 sequences
[M::worker_pipeline::6861.026*7.94] mapped 477750 sequences
[M::main] Version: 2.15-r905
[M::main] CMD: minimap2 -t 8 -ax splice genome_illumina_annot/cro_v2_asm.mmi 01_filtering/after_trim/all_reads_nano.fastq
[M::main] Real time: 6861.119 sec; CPU: 54470.448 sec; Peak RSS: 8.451 GB

As you can see, minimap2 seems to be mapping the input file in batches rather than all at once, probably to limit memory use.
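One way to check that no reads were lost between batches is to add up the per-batch counts and compare the total with the number of reads in the FASTQ. The following is a minimal sketch, not part of the original run: it assumes minimap2's stderr was saved to a file (for example by adding `2> minimap2.log` to the mapping command; that file name is hypothetical) and that the FASTQ is uncompressed with four lines per record. The function names are made up for illustration.

```shell
# Sum the per-batch "mapped N sequences" counts from a saved minimap2 log.
# Assumption: stderr of the mapping command was redirected to the file in $1.
sum_mapped_batches() {
    awk '/mapped [0-9]+ sequences/ {
             # the count is the field right after the word "mapped"
             for (i = 1; i < NF; i++) if ($i == "mapped") total += $(i + 1)
         }
         END { print total + 0 }' "$1"
}

# Count reads in an uncompressed FASTQ file.
# Assumption: strict 4-line records (no wrapped sequence lines).
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Usage (file names are assumptions):
#   sum_mapped_batches minimap2.log
#   count_fastq_reads all_reads_nano.fastq
```

For the log shown above, the eleven batch counts add up to 10,832,169 sequences, which should match the read count of the input file. The size of each batch is controlled by minimap2's `-K` option (number of bases loaded per batch).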

I was wondering whether this batching introduces any nuances or changes in the output file, or whether I can freely process it with samtools and assemble the transcriptome with StringTie2.

Cheers

minimap2 RNA-seq Nanopore • 3.8k views

There should be no changes in the output file. minimap2 loads the reads in batches and writes all alignments to the same output, so you can process it normally.
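As a rough sketch of what "processing normally" could look like for the downstream steps mentioned in the question, the function below sorts and indexes the SAM file with samtools and then runs StringTie in long-read mode (`-L`). The function name, the output file names and the thread count are assumptions for illustration, not something taken from the original run.

```shell
# Sort, index and assemble a minimap2 SAM file.
# $1 = input SAM, $2 = output prefix (both chosen by the caller).
# Assumption: samtools and stringtie (version 2 or later) are on PATH.
sam_to_transcripts() {
    sam="$1"
    prefix="$2"
    # coordinate-sort and convert to BAM in one step
    samtools sort -@ 8 -o "${prefix}.sorted.bam" "$sam" &&
    # index the sorted BAM for tools that need random access
    samtools index "${prefix}.sorted.bam" &&
    # -L enables long-read assembly, -p sets threads, -o names the GTF
    stringtie -L -p 8 -o "${prefix}.gtf" "${prefix}.sorted.bam"
}

# Usage (file names are assumptions):
#   sam_to_transcripts all_reads.sam all_reads
```

Because the batches are all written to one SAM file, nothing batch-specific needs to be handled here.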

4.8 years ago by GenoMax 152k