How to generate paired-end FASTQ files from a reference sequence?
1
0
Entering edit mode
2.1 years ago

I have a reference sequence NC_008253.fa. I want to obtain 2 paired end FASTQ files.

maq.pl easyrun -d outdir NC_008253.fa out1.fq out2.fq

Traceback:

-- CMD: /usr/bin/maq fasta2bfa /home/melissachua/NC_008253.fa outdir/ref.bfa 2> /dev/null
read() on closed filehandle $fh at /usr/bin/maq.pl line 833.
** Cannot guess the format of file '/home/melissachua/out1.fq'. at /usr/bin/maq.pl line 107.

Data:

>gi|110640213|ref|NC_008253.1| Escherichia coli 536, complete genome
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
TTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGACTTAGGTCACTAAATACTTTAACCAA
TATAGGCATAGCGCACAGACAGATAAAAATTACAGAGTACACAACATCCATGAAACGCATTAGCACCACC
ATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAG
CCCGCACCTGACAGTGCGGGCTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAG
TTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGGGTTGCCGATATTCTGGAAAGCAATGCCA
GGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCATCTGGTAGCGATGATTGA
AAAAACCATTAGCGGTCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTCTG
ACGGGACTCGCCGCCGCCCAGCCGGGATTTCCGCTGGCACAATTGAAAACTTTCGTCGACCAGGAATTTG
CCCAAATAAAACATGTCCTGCATGGCATCAGTTTGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCT
GATTTGCCGTGGCGAGAAAATGTCGATCGCCATTATGGCCGGCGTGTTAGAAGCGCGTGGTCACAACGTT
ACCGTTATCGATCCGGTCGAAAAACTGCTGGCAGTGGGTCATTACCTCGAATCTACCGTTGATATTGCTG
AATCCACCCGCCGTATTGCGGCAAGCCGCATTCCGGCTGACCACATGGTGCTGATGGCTGGTTTCACTGC
CGGTAATGAAAAAGGCGAGCTGGTGGTTCTGGGACGCAACGGTTCCGACTACTCCGCTGCGGTGCTGGCG
GCCTGTTTACGCGCCGATTGTTGCGAGATCTGGACGGATGTTGACGGTGTTTATACCTGCGATCCGCGTC
AGGTGCCCGATGCGAGGTTGTTGAAGTCGATGTCCTATCAGGAAGCGATGGAGCTTTCTTACTTCGGCGC
TAAAGTTCTTCACCCCCGCACCATTACCCCCATCGCCCAGTTCCAGATCCCTTGCCTGATTAAAAATACC
GGAAATCCCCAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGGGCA
TTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCCGGCCCGGGGATGAAAGGGATGGTTGGCATGGC
GGCGCGCGTCTTTGCAGCGATGTCACGCGCCCGTATTTCCGTGGTGCTGATTACGCAATCATCTTCCGAA
TACAGTATCAGTTTCTGCGTTCCGCAAAGCGACTGTGTGCGAGCTGAACGGGCAATGCAGGAAGAGTTCT
ACCTGGAACTGAAAGAAGGCTTACTGGAGCCGTTGGCGGTGACGGAACGGCTGGCCATTATCTCGGTGGT
AGGTGATGGTATGCGCACCTTACGTGGGATCTCGGCGAAATTCTTTGCCGCGCTGGCCCGCGCCAATATC
AACATTGTCGCCATTGCTCAGGGATCTTCTGAACGCTCAATCTCTGTCGTGGTCAATAACGATGATGCGA
CCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGG
CGTCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAAAGCTGGTTGAAGAATAAACAT
ATCGACTTACGTGTCTGCGGTGTTGCTAACTCGAAGGCACTGCTCACCAATGTACATGGCCTTAATCTGG
AAAACTGGCAGGAAGAACTGGCGCAAGCCAAAGAGCCGTTTAATCTCGGGCGCTTAATTCGCCTCGTGAA
MAQ • 820 views
ADD COMMENT
0
Entering edit mode

use wgsim in samtools https://github.com/samtools/samtools

on a side note , nobody uses maq anymore.

ADD REPLY
0
Entering edit mode

How do I use wgsim? I tried:

wgsim NC_008253.fa out1.fq out2.fq /dev/null

but not sure if I did it correctly. Also, should I be using the index file NC_008253.fa.fai instead?

ADD REPLY
0
Entering edit mode

but not sure if I did it correctly.

Usage:   wgsim [options] <in.ref.fa> <out.read1.fq> <out.read2.fq>

Options: -e FLOAT      base error rate [0.020]
         -d INT        outer distance between the two ends [500]
         -s INT        standard deviation [50]
         -N INT        number of read pairs [1000000]
         -1 INT        length of the first read [70]
         -2 INT        length of the second read [70]
         -r FLOAT      rate of mutations [0.0010]
         -R FLOAT      fraction of indels [0.15]
         -X FLOAT      probability an indel is extended [0.30]
         -S INT        seed for random generator [0, use the current time]
         -A FLOAT      discard if the fraction of ambiguous bases higher than FLOAT [0.05]
         -h            haplotype mode
ADD REPLY
0
Entering edit mode
ADD REPLY
0
Entering edit mode
2.1 years ago
GenoMax 141k

As noted elsewhere you should use a newer/current tool to simulate these reads. Besides ART (detailed here: How do I install MAQ software on Ubuntu? ) you can use randomreads.sh from BBMap suite (see paired-end reads in BBMap randomreads.sh )

ADD COMMENT

Login before adding your answer.

Traffic: 2810 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6