How to specify Illumina Single End data in the MaSuRCA Assembler config file
8.7 years ago
ajaybioinfo ▴ 80

Dear All,

Hi, I am new to MaSuRCA and want to perform a hybrid de novo assembly of a plant genome.

I have successfully completed the assembly on the provided test data set.

But here I have 2 paired-end Illumina runs and 3 single-end Illumina runs. The tutorial clearly explains how to specify paired-end data, but it does not mention how to specify single-end Illumina data in the config file.

Can anyone guide me on how to specify the single-end Illumina data in the configuration file?

Thanks a lot

next-gen Assembly • 4.9k views

Dear Sukhdeep,

Sorry for the late reply. I have tried your suggested option, but it doesn't work. It gives the error below:

Command:

/root/Desktop/DesktopBKP/MaSuRCA-2.2.2/bin/./masurca 2as-config-file1-25-4.txt


Error:

missing forward file for PE library H3 at /root/Desktop/DesktopBKP/MaSuRCA-2.2.2/bin/./masurca line 305, <FILE> line 16)


My Configuration File

PARAMETERS
CA_PARAMETERS= ovlMerSize=30 cgwErrorRate=0.25 merylMemory=8192 ovlMemory=4GB
KMER_COUNT_THRESHOLD = 1
GRAPH_KMER_SIZE=auto
DO_HOMOPOLYMER_TRIM=1
JF_SIZE=50000000000
END

DATA
PE= GA 525 60 /nrcpb1/ajay/wheat-genome2as-2014/illumina/raw-masurca/2AS_GA_s_1_1_sequence.fastq /nrcpb1/ajay/wheat-genome2as-2014/illumina/raw-masurca/2AS_GA_s_1_2_sequence.fastq
PE= H1 525 60 /nrcpb1/ajay/wheat-genome2as-2014/illumina/raw-masurca/2AS_HS_s_1_1_sequence.fastq /nrcpb1/ajay/wheat-genome2as-2014/illumina/raw-masurca/2AS_HS_s_1_2_sequence.fastq
PE= H2 525 60 /nrcpb1/ajay/wheat-genome2as-2014/illumina/raw-masurca/2AS_HS_s_2_1_sequence.fastq /nrcpb1/ajay/wheat-genome2as-2014/illumina/raw-masurca/2AS_HS_s_2_2_sequence.fastq

PE= H3 525 60 /nrcpb1/ajay/wheat-genome2as-2014/illumina/s_2_2_sequence.fastq # (Illumina Single End file 1)
PE= H4 525 60 /nrcpb1/ajay/wheat-genome2as-2014/illumina/s_3_2_sequence.fastq # (Illumina Single End file 2)

OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/G73MYLX.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/G7DSIN1.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/G7QNE8J.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/G7SIGDD.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/G92D85N.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/HD4EOM.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/HDJUW5E.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/HDRDHMJ.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/HDWU88M.frg
OTHER=/nrcpb1/ajay/wheat-genome2as-2014/454-CABOG/HG1SIK2.frg

END


Please tell me where I am going wrong in the configuration file. I am using MaSuRCA 2.2.2; I have also tried 2.2.1 but get the same error.

Does MaSuRCA support Illumina single-end data?

Thanks

8.7 years ago

Specifying just one file and replacing pe with se should work.

DATA
END

PARAMETERS
GRAPH_KMER_SIZE= kmer_size
JF_SIZE= int          #jellyfish hash size - 10x the genome
END


Dear Sukhdeep

Thanks for your suggestion. It works now, and I am able to generate the assemble.sh file from the configuration file.

One more thing: I was not giving the exact path of the SE fastq file, which is why it was not creating the assemble.sh script. Once again, thanks for your support.

Cheers
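
Since the problem here turned out to be a wrong path to a read file, a quick pre-flight check can catch that before running masurca. The following is only a minimal Python sketch and is not part of MaSuRCA. The function name `missing_read_files` is hypothetical, and it assumes that library lines in the DATA section have the form `KEY= prefix number number file [file]` as in the configuration posted above, that any other `KEY=` line lists only file paths, and that `#` starts a comment.

```python
import os


def missing_read_files(config_text, exists=os.path.exists):
    """Return (library, path) pairs for listed read files that do not exist.

    Assumption (not taken from MaSuRCA itself): PE/SE lines look like
    'PE= prefix num num file [file]'; other DATA lines list only paths.
    """
    missing = []
    in_data = False
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if line == "DATA":
            in_data = True
            continue
        if line == "END":
            in_data = False
            continue
        if not in_data or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        fields = value.split()
        if key in ("PE", "SE"):
            if len(fields) < 4:
                # No read file at all after the prefix and the two numbers.
                missing.append((key, "<no read file listed>"))
                continue
            prefix, paths = fields[0], fields[3:]
        else:
            prefix, paths = key, fields
        for path in paths:
            if not exists(path):
                missing.append((prefix, path))
    return missing
```

Calling it on the text of a config file, for example `missing_read_files(open("config.txt").read())` where `config.txt` is a placeholder name, returns an empty list when every listed file is found. The `exists` argument is only there so the check can be tried without real files.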


I don't understand your configuration. Can't we use 2 SE libraries?


This did not work for me (version 11152016).

Also, I believe the two numbers after the library tag are not the average/std dev of the read length but the average/std dev of the fragment length (i.e. the distance spanned by the paired-end reads).
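
To make that reading of the fields explicit, here is a small Python sketch that splits one library line into named parts. It is an illustration only, not MaSuRCA code: the names `Library` and `parse_library_line` are hypothetical, and it assumes the field order seen in the configuration earlier in this thread (tag, two numbers, forward file, optional reverse file), with the two numbers taken to be the fragment length mean and standard deviation.

```python
from typing import NamedTuple, Optional


class Library(NamedTuple):
    prefix: str             # library tag, e.g. "GA"
    fragment_mean: float    # assumed: average fragment length
    fragment_stdev: float   # assumed: std dev of the fragment length
    forward: str            # first (or only) read file
    reverse: Optional[str]  # None when only one read file is listed


def parse_library_line(line: str) -> Library:
    """Split a 'KEY= tag mean stdev forward [reverse]' line into fields."""
    body = line.split("#", 1)[0]   # ignore a trailing comment
    _, value = body.split("=", 1)  # drop the keyword before '='
    fields = value.split()
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}: {line!r}")
    prefix, mean, stdev, forward = fields[:4]
    reverse = fields[4] if len(fields) == 5 else None
    return Library(prefix, float(mean), float(stdev), forward, reverse)
```

With this layout, a line such as `PE= GA 525 60 fwd.fastq rev.fastq` would be read as a library whose fragments average 525 bases with a standard deviation of 60, independent of how long the individual reads are.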