Question: Long DNA reads aligned
1
gravatar for tarek.mohamed
17 months ago by
tarek.mohamed260
tarek.mohamed260 wrote:

Hi

What are the best long reads aligners? I have dna sequencing data generated by minion. Which aligners should I use, I need to detect SNPs and indels downstream

Thanks

nanopore • 682 views
ADD COMMENTlink modified 17 months ago by WouterDeCoster42k • written 17 months ago by tarek.mohamed260
2
gravatar for WouterDeCoster
17 months ago by
Belgium
WouterDeCoster42k wrote:

Minimap2 is the currently recommended aligner for long read DNA and cDNA/RNA sequencing. If you are specifically interested in large structural variation you should also try ngmlr.

ADD COMMENTlink written 17 months ago by WouterDeCoster42k

I used Minimap2 to align my minion reads to a reference genome. I then tried to sort it using picard but I got an error. Why there is no sequence for this read? $ java -jar /Users/tarekmagdyshehatamohamed/miniconda3/envs/bioinfo/share/picard-2.17.0-0/picard.jar SortSam I=BC01_minimap.sam O=BC01_aln_sorted.bam SORT_ORDER=coordinate

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. CIGAR covers 442 bases but the sequence is 0 read bases ; File BC01_minimap.sam; Line 518
Line: 726b2822-4c7f-4812-8e6f-e21f4629d92b  272 chr17   1661134 0   6S4M1D75M1D15M1I29M2D12M1I8M1D5M2I13M1I11M2I71M1D11M1D81M1D29M2D28M1I13M1I8M1I2M3D1M1D10M   *   0   0   *   *   NM:i:33 ms:i:678    AS:i:678    nn:i:0  tp:A:S  cm:i:31 s1:i:265    dv:f:0.0533

Here is the complete error message

20:45:39.629 INFO  NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/tarekmagdyshehatamohamed/miniconda3/envs/bioinfo/share/picard-2.17.0-0/picard.jar!/com/intel/gkl/native/libgkl_compression.dylib
[Tue Jul 17 20:45:39 CDT 2018] SortSam INPUT=BC01_minimap.sam OUTPUT=BC01_aln_sorted.bam SORT_ORDER=coordinate    VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Tue Jul 17 20:45:39 CDT 2018] Executing as tarekmagdyshehatamohamed@FSMD25VN00GJ1GN on Mac OS X 10.13.5 x86_64; OpenJDK 64-Bit Server VM 1.8.0_121-b15; Deflater: Intel; Inflater: Intel; Picard version: 2.17.0-SNAPSHOT
[Tue Jul 17 20:45:39 CDT 2018] picard.sam.SortSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=257425408
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. CIGAR covers 442 bases but the sequence is 0 read bases ; File BC01_minimap.sam; Line 518
Line: 726b2822-4c7f-4812-8e6f-e21f4629d92b  272 chr17   1661134 0   6S4M1D75M1D15M1I29M2D12M1I8M1D5M2I13M1I11M2I71M1D11M1D81M1D29M2D28M1I13M1I8M1I2M3D1M1D10M   *   0   0   *   *   NM:i:33 ms:i:678    AS:i:678    nn:i:0  tp:A:S  cm:i:31 s1:i:265    dv:f:0.0533
    at htsjdk.samtools.SAMLineParser.reportErrorParsingLine(SAMLineParser.java:457)
    at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:355)
    at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:268)
    at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:255)
    at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:228)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:576)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:548)
    at picard.sam.SortSam.doWork(SortSam.java:100)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)
ADD REPLYlink modified 17 months ago • written 17 months ago by tarek.mohamed260
1

That read (flag 272) is not a primary alignment, therefore the sequence is not reported (twice).

Just use samtools for sorting, and you can do it simultaneously with alignment and avoid intermediate files:

minimap2 -t 8 -a yourgenome.fa yourreads.fastq.gz | samtools sort -@8 -o youralignment.bam

In my example I used 8 threads for alignment and sorting, which you'll have to adapt for your system.

ADD REPLYlink modified 17 months ago • written 17 months ago by WouterDeCoster42k

I am trying to call SNPs and indels, but all the tools I found are compatible with bam files generated with specific aligners. Is there a universal SNP caller that an work with minimap2. I do not want to use nanopolish since it works only with fast5 files. I am currently testing nanosv, is there any other suggestions?

ADD REPLYlink modified 17 months ago • written 17 months ago by tarek.mohamed260

@WouterDeCoster Sorry if this sounds silly, second command "samtools sort -@8 -o youralignment.bam" doesn't seem to work. It instead says samtools sort [options] <in.bam> <out.prefix>, whereas I have input file in .sam format. I have also tried with "minimap2 -ax map-ont -L -t 8". Is there something I missed/misunderstood? Thanks!

ADD REPLYlink modified 11 months ago • written 11 months ago by zeeefa90

Make sure you have a reasonably recent version of samtools. It seems you have an older version, which does not support the syntax I described above.

ADD REPLYlink written 11 months ago by WouterDeCoster42k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 756 users visited in the last hour