3 months ago
Jjbox ▴ 40
Does anyone know how to subsample reads from a BAM file? The command below gives the number of reads in this BAM file. I want to keep about 100,000,000 of its 122,441,229 reads.
seqtk provides a similar function for FASTQ files, as in the command below. Is there a BAM equivalent of seqtk?
./seqtk sample -s101 /data/long_read/lr_consoritum/pcb/ENCFF563QZR.fastq 1844630 > ENCFF563QZR_sub.fq
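One option that should work directly on BAM is the -s option of samtools view, which takes a value of the form SEED.FRACTION rather than a read count. Below is a minimal sketch, assuming samtools is installed; in.bam and sub.bam are placeholder file names, and the seed 101 is simply reused from the seqtk command. The fraction is the target read count divided by the total from the question.

```shell
# Fraction of reads to keep: target / total (both numbers from the question).
total=122441229
target=100000000
frac=$(awk -v t="$target" -v n="$total" 'BEGIN { printf "%.4f", t / n }')

# samtools view -s expects SEED.FRACTION: the integer part is the random seed,
# the fractional part is the proportion of reads to keep.
seed=101                      # assumption: same seed as the seqtk example
sub_arg="${seed}${frac#0}"    # drop the leading 0 of the fraction
echo "samtools view -b -s $sub_arg in.bam > sub.bam"

# Run only if samtools and the placeholder input file are actually present.
if command -v samtools >/dev/null 2>&1 && [ -f in.bam ]; then
    samtools view -b -s "$sub_arg" in.bam > sub.bam
fi
```

The sampling is probabilistic, so the output will hold roughly, not exactly, 100,000,000 reads. Reads are selected by name, so both mates of a pair are kept or dropped together.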
You can probably use BBTools for this as well (untested):
If you have paired-end data then