TopHat2 RNA-seq
5.1 years ago
dimitrischat ▴ 180

Hi all. I have a FASTQ file with 28,020,920 total sequences, but I only want to run, for example, 10,000 of them. Which option does that? I can't seem to find it in the manual.

Not sure why you want to do that, but you could use reformat.sh from the BBMap suite to sample 10000 reads into a new file and then use that.

reformat.sh in=reads.fq.gz out=sampled.fq.gz sample=10000

Thank you. But can't you do it using TopHat?

You can check the manual, but I don't think TopHat has an option to sample a fraction of reads.

Hi dimitrischat,

It's worth noting that TopHat2 has been, essentially, deprecated by the developers, who recommend using HISAT2 instead. Unless you have a very specific reason to adopt TopHat2 in your pipeline, it's probably best to follow their advice.

4.9 years ago
Oskar ▴ 20

Hi Dimitris - I guess you want a small sample of your reads for testing and debugging purposes. If so, you can create a file containing a small number of reads, for example by taking the first 100k reads from “myreads” and storing them in “Test100k”.
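
A minimal sketch of one way to do this, assuming an uncompressed FASTQ file with exactly four lines per read (no wrapped sequence lines). The helper name `first_reads` and the `.fastq` extensions are made up for illustration:

```shell
# Hypothetical helper: copy the first N reads of an uncompressed FASTQ file.
# Assumes every record is exactly four lines (header, sequence, "+", qualities).
first_reads() {
    n_reads=$1      # number of reads to keep
    in_fastq=$2     # input FASTQ (uncompressed)
    out_fastq=$3    # output FASTQ
    head -n "$((n_reads * 4))" "$in_fastq" > "$out_fastq"
}

# Example with the file names from the post (the extensions are assumed):
# first_reads 100000 myreads.fastq Test100k.fastq
```

Note that the first reads in a file are not a random sample. That is usually fine for testing and debugging, but if you need random sampling, the reformat.sh approach is the better fit.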

Hope it helps!