Hi, I would like to ask if you can recommend me tools that I can use to simulate whole genome sequences using a reference genome which will also give me list of the variants just like wgsim?
I was able to use wgsim, but when I tried aligning the paired-reads, it's not aligning together.
reads=100000000
for i in $(seq 1 10); do
base="sim_${i}"
seed=$((100 + i)) # different seed for each run (arbitrary choice)
echo "Running $base (seed=$seed)..."
"$wgsim" "$ref" "${base}.R1.fq" "${base}.R2.fq" -1 70 -2 70 -N $reads -S $seed -e 0.0001 > "${base}.out.log" 2>&1
echo "$base finished (log: ${base}.out.log)"
done
of maybe I am using wgsim wrong? I hope you can help me. Thank you!
It's a little unclear what you mean by "I was able to use wgsim, but when I tried aligning the paired-reads, it's not aligning together."
Sorry about that. It's when I align the paired reads together like the left and righ, they should align with common sequence at some ends, making one longer consensus sequence. I was able to align the pairs with an actual short read pairs.