Trying to select a certain range of sequences in Geneious
15 months ago
enrico104 • 0

Hello everyone,

I'm currently working with NGS data in Geneious v9.1. Each sample contains a large number of sequences, so working on them is very time-consuming and hard on my PC. I'm therefore trying to select a limited number of sequences, starting from the first one (e.g. if there are 100 million sequences, I want to work only on the first 5 million). I've read about normalization in the Geneious manual, but its result is different from what I'm trying to achieve. I've also thought about running "De Novo Assemble" with "Use X % of data" checked, but I don't know if that's the most efficient way to do it.

Thanks in advance to everyone that'll help.

Geneious • 740 views

Your best bet is to contact Geneious support for this, since it is commercial software and not many people here are likely to have access to it.


I've already asked, and I was only told about normalization. So I suppose this kind of operation could/should be done using another program. I'll keep searching, but in the meantime, if someone knows how to do this on FASTQ files, I'd be very thankful.


Subsampling can be done using reformat.sh from the BBMap suite (command-line Java). To get 5 million reads, run the following:

reformat.sh -Xmx4g in=fastq.gz out=sampled.fastq.gz samplereadstarget=5000000

From the reformat.sh help: samplereadstarget=0 (srt) Exact number of OUTPUT reads (or pairs) desired.
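To make the idea of "an exact number of output reads" concrete, here is a minimal Python sketch of one common way to draw a fixed-size random sample in a single pass (reservoir sampling). This is only an illustration and is not claimed to be how reformat.sh works internally. The function name `sample_fastq_records` and the `seed` parameter are made up for this example, and it assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
import random
from itertools import islice


def sample_fastq_records(lines, k, seed=0):
    """Return k randomly chosen FASTQ records from an iterable of lines.

    Assumes each record is exactly 4 lines. Uses reservoir sampling, so the
    input is read once and only k records are held in memory at a time.
    """
    rng = random.Random(seed)
    reservoir = []
    it = iter(lines)
    seen = 0
    while True:
        record = tuple(islice(it, 4))
        if len(record) < 4:
            break  # end of input (or a truncated final record)
        seen += 1
        if len(reservoir) < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Keep the new record with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = record
    return reservoir
```

Note that the sampled records are not returned in their original file order, and for millions of reads the reservoir itself can take a lot of memory in Python, so for real data the dedicated tools are the better choice.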

Other programs that can do this: seqtk sample and seqkit sample.
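The tools above subsample reads. If you literally want the first N reads, as in the original question, a minimal Python sketch could look like the one below. It assumes plain 4-line FASTQ records, and the function name `head_fastq` and the file names in the usage note are placeholders invented for this example. (The reformat.sh help also lists a `reads=` option for limiting the number of input reads processed; check the help text of your installed version before relying on it.)

```python
import gzip
import itertools


def _open(path, mode):
    # Open gzip-compressed files transparently, based on the file extension.
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def head_fastq(in_path, out_path, n_reads):
    """Copy the first n_reads records of a FASTQ file; return the number written."""
    written = 0
    with _open(in_path, "rt") as src, _open(out_path, "wt") as dst:
        while written < n_reads:
            # One FASTQ record is assumed to be exactly 4 lines.
            record = list(itertools.islice(src, 4))
            if len(record) < 4:
                break  # input ended before n_reads records were found
            dst.writelines(record)
            written += 1
    return written
```

For example, `head_fastq("reads.fastq.gz", "first5M.fastq.gz", 5_000_000)` would write the first 5 million records. For paired-end data, run it on both files with the same N so the pairs stay in sync. Keep in mind that the first reads in a file are not necessarily representative of the whole run, which is one reason subsampling is usually preferred.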

