reformat.srun won't work
3.0 years ago

I have a FASTQ file from ENA that combines R1 and R2 in the same file, and I need to separate them. I followed the BBMap instructions, but this keeps happening. I've tried redownloading, and I've tried downloading the fastq.gz and unzipping it. I don't know what else to do.

(base) -bash-4.2$ head SRR10870267.fastq
@SRR10870267.1 NB500947:631:HV5VGBGX7:1:11101:7183:1061/3
GATACACTCTAGAAGAATGGATCGANAGTAAAAGCCCCCACACCTTCATCTGCAGTGG
+
//AA/6///<//6////<///////#</6//6/<6//A////E///<//<</A/EE//
@SRR10870267.2 NB500947:631:HV5VGBGX7:1:11101:16909:1061/3
GATCATCCAGGATTCAGAATACTAANAAAGATCAGTCTCCATGTCAGATGGTATCTAT
+
//AA/<<A/6//</A//////AE6/#<////////<<<A//</E////EAEA/EA/E<
@SRR10870267.3 NB500947:631:HV5VGBGX7:1:11101:7843:1062/3
CCACAGACACATCAAAGAATGAGTAACTCAACAAAGAATACGACTGTTACCTGTAAAG
(base) -bash-4.2$ mv SRR10870267.fastq SRR10870267_S1_L001_R1_001.fastq
(base) -bash-4.2$ mv SRR10870268.fastq SRR10870268_S1_L001_R1_001.fastq
(base) -bash-4.2$ ls
LoopSTARAlign.srun  Project3.log  SRR10870267_S1_L001_R1_001.fastq  STARAlign.srun      cellranger-6.0.1  decompress.srun             irradiated  noinjuryp3.srun  refdata-gex-GRCh38-2020-A  sample_script.srun
Project1_Output     RMATS.srun    SRR10870268_S1_L001_R1_001.fastq  STARTestRun_Output  decompress.log    filtered_feature_bc_matrix  miniconda3  normal           refdata-gex-mm10-2020-A
(base) -bash-4.2$ sbatch noinjuryp3.srun
Submitted batch job 2872249
(base) -bash-4.2$ less Project3.log
(base) -bash-4.2$ reformat.sh in = SRR10870268_S1_L001_R1_001.fastq  out1 = SRR10870268_S1_L001_R1_001.fastq out2 = SRR10870268_S1_L001_R2_001.fastq
java -ea -Xmx200m -cp /network/rit/lab/bioinformaticslab/BGonzalez/miniconda3/opt/bbmap-38.22-0/current/ jgi.ReformatReads in = SRR10870268_S1_L001_R1_001.fastq out1 = SRR10870268_S1_L001_R1_001.fastq out2 = SRR10870268_S1_L001_R2_001.fastq
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
        at shared.PreParser.<init>(PreParser.java:71)
        at shared.PreParser.<init>(PreParser.java:30)
        at jgi.ReformatReads.<init>(ReformatReads.java:55)
        at jgi.ReformatReads.main(ReformatReads.java:45)
(base) -bash-4.2$
BBMAP fastq reformat.srun cellranger • 787 views
3.0 years ago
Mensur Dlakic ★ 27k

You are using the same name SRR10870268_S1_L001_R1_001.fastq as your in and out1 parameters. Basically, you are overwriting the file from which you are reading.
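
One way to guard against this is to check the paths before launching the job. The following is only a minimal sketch: `paths_are_distinct` is a hypothetical helper name, not part of BBMap, and it compares the path strings literally (it would not catch two different spellings of the same file).

```shell
# Hypothetical helper (not part of BBMap): succeed only if the input path
# differs from every output path given after it.
paths_are_distinct() {
    pad_input="$1"
    shift
    for pad_out in "$@"; do
        if [ "$pad_input" = "$pad_out" ]; then
            echo "refusing to overwrite input file: $pad_out" >&2
            return 1
        fi
    done
    return 0
}
```

It could be chained in front of the real command, e.g. `paths_are_distinct SRR10870268.fastq SRR10870268_R1.fastq SRR10870268_R2.fastq && reformat.sh ...`, so that reformat.sh only runs when the names differ.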

3.0 years ago
GenoMax 141k

Correct command in this case should be

reformat.sh in=SRR10870268.fastq out1=SRR10870268_R1.fastq out2=SRR10870268_R2.fastq

assuming the initial file really does contain interleaved reads, which would be rather unusual.
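
Note that the command above has no spaces around `=`, while the one in the question was typed as `in = ...`. The shell splits on whitespace, so `in = file` reaches the program as three separate arguments rather than one `key=value` token. That is a plausible cause of the ArrayIndexOutOfBoundsException in PreParser, though I have not checked it against that BBMap version. A small sketch of the splitting itself (`count_args` and `reads.fastq` are made-up names for illustration):

```shell
# Hypothetical helper: print how many arguments the shell passed in.
count_args() {
    echo "$#"
}

# With spaces, the shell passes three separate words: "in", "=", "reads.fastq".
count_args in = reads.fastq

# Without spaces, it passes a single key=value word.
count_args in=reads.fastq
```

The first call prints 3 and the second prints 1.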

EDIT: I checked the SRA accession. This is a 10x dataset, so if you need the FASTQ files you are likely best off working from the SRA submission. The submitters have not provided the original BAM file, so you will either need to check what is in the file you can download for free, or pay to download the original R1, R2 and I1 files that are available via cloud SRA.
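
To check what is in the downloaded file, it can help to look only at the read headers. This is a rough sketch, assuming plain four-line FASTQ records (no wrapped sequence lines); `fastq_headers` is a hypothetical helper name. In an interleaved file, consecutive headers would normally come in pairs sharing the same read name. The headers in the question (.1, .2, .3, each with different coordinates) do not look like pairs.

```shell
# Hypothetical helper: print the header line of the first N FASTQ records.
# $1 = FASTQ path, $2 = number of records to show (default 4).
# Assumes each record is exactly four lines, so headers are lines 1, 5, 9, ...
fastq_headers() {
    awk -v n="${2:-4}" 'NR % 4 == 1 { print; if (++seen >= n) exit }' "$1"
}
```

For example, `fastq_headers SRR10870267.fastq 6` would list the first six read names for a quick visual check.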
