Question: Long DNA reads aligned
1
gravatar for tarek.mohamed
5 months ago by
tarek.mohamed190
tarek.mohamed190 wrote:

Hi

What are the best long reads aligners? I have dna sequencing data generated by minion. Which aligners should I use, I need to detect SNPs and indels downstream

Thanks

nanopore • 259 views
ADD COMMENTlink modified 5 months ago by WouterDeCoster35k • written 5 months ago by tarek.mohamed190
2
gravatar for WouterDeCoster
5 months ago by
Belgium
WouterDeCoster35k wrote:

Minimap2 is the currently recommended aligner for long read DNA and cDNA/RNA sequencing. If you are specifically interested in large structural variation you should also try ngmlr.

ADD COMMENTlink written 5 months ago by WouterDeCoster35k

I used Minimap2 to align my minion reads to a reference genome. I then tried to sort it using picard but I got an error. Why there is no sequence for this read? $ java -jar /Users/tarekmagdyshehatamohamed/miniconda3/envs/bioinfo/share/picard-2.17.0-0/picard.jar SortSam I=BC01_minimap.sam O=BC01_aln_sorted.bam SORT_ORDER=coordinate

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. CIGAR covers 442 bases but the sequence is 0 read bases ; File BC01_minimap.sam; Line 518
Line: 726b2822-4c7f-4812-8e6f-e21f4629d92b  272 chr17   1661134 0   6S4M1D75M1D15M1I29M2D12M1I8M1D5M2I13M1I11M2I71M1D11M1D81M1D29M2D28M1I13M1I8M1I2M3D1M1D10M   *   0   0   *   *   NM:i:33 ms:i:678    AS:i:678    nn:i:0  tp:A:S  cm:i:31 s1:i:265    dv:f:0.0533

Here is the complete error message

20:45:39.629 INFO  NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/Users/tarekmagdyshehatamohamed/miniconda3/envs/bioinfo/share/picard-2.17.0-0/picard.jar!/com/intel/gkl/native/libgkl_compression.dylib
[Tue Jul 17 20:45:39 CDT 2018] SortSam INPUT=BC01_minimap.sam OUTPUT=BC01_aln_sorted.bam SORT_ORDER=coordinate    VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Tue Jul 17 20:45:39 CDT 2018] Executing as tarekmagdyshehatamohamed@FSMD25VN00GJ1GN on Mac OS X 10.13.5 x86_64; OpenJDK 64-Bit Server VM 1.8.0_121-b15; Deflater: Intel; Inflater: Intel; Picard version: 2.17.0-SNAPSHOT
[Tue Jul 17 20:45:39 CDT 2018] picard.sam.SortSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=257425408
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. CIGAR covers 442 bases but the sequence is 0 read bases ; File BC01_minimap.sam; Line 518
Line: 726b2822-4c7f-4812-8e6f-e21f4629d92b  272 chr17   1661134 0   6S4M1D75M1D15M1I29M2D12M1I8M1D5M2I13M1I11M2I71M1D11M1D81M1D29M2D28M1I13M1I8M1I2M3D1M1D10M   *   0   0   *   *   NM:i:33 ms:i:678    AS:i:678    nn:i:0  tp:A:S  cm:i:31 s1:i:265    dv:f:0.0533
    at htsjdk.samtools.SAMLineParser.reportErrorParsingLine(SAMLineParser.java:457)
    at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:355)
    at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:268)
    at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:255)
    at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:228)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:576)
    at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:548)
    at picard.sam.SortSam.doWork(SortSam.java:100)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
    at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:98)
    at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:108)
ADD REPLYlink modified 4 months ago • written 4 months ago by tarek.mohamed190
1

That read (flag 272) is not a primary alignment, therefore the sequence is not reported (twice).

Just use samtools for sorting, and you can do it simultaneously with alignment and avoid intermediate files:

minimap2 -t 8 -a yourgenome.fa yourreads.fastq.gz | samtools sort -@8 -o youralignment.bam

In my example I used 8 threads for alignment and sorting, which you'll have to adapt for your system.

ADD REPLYlink modified 4 months ago • written 4 months ago by WouterDeCoster35k

I am trying to call SNPs and indels, but all the tools I found are compatible with bam files generated with specific aligners. Is there a universal SNP caller that an work with minimap2. I do not want to use nanopolish since it works only with fast5 files. I am currently testing nanosv, is there any other suggestions?

ADD REPLYlink modified 4 months ago • written 4 months ago by tarek.mohamed190
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1156 users visited in the last hour