I tried to use the flux simulation tool to generate simulated RNA-seq data.
I passed the following parameter file to the flux-simulation shell script:
    ## File locations
    REF_FILE_NAME cElegansAnnotation.gtf
    GEN_DIR chromFa

    ## Library preparation
    # Expression
    NB_MOLECULES 5000000
    TSS_MEAN 50
    POLYA_SCALE 100
    POLYA_SHAPE 1.5
    # Fragmentation
    FRAG_SUBSTRATE RNA
    FRAG_METHOD UR
    FRAG_UR_ETA 350
    # Reverse Transcription
    RTRANSCRIPTION YES
    RT_MOTIF default
    # Amplification
    PCR_DISTRIBUTION default
    GC_MEAN NaN
    PCR_PROBABILITY 0.05
    # Size Filtering
    FILTERING NO

    ## Sequencing
    READ_NUMBER 1000000
    READ_LENGTH 100
    PAIRED_END YES
    # create a fastq file
    FASTA YES
According to these parameters (READ_LENGTH 100), flux-simulation should have produced reads of length 100. However, when I look at the output, I see that reads shorter than 100 bases are also included in my FASTA file:
    >chrI:47472-49416W:Y48G1C.12:3:651:351:594:A/1
    CGTCGAAATTAGTGATATTTTTATCGGGAATCGGTCCGTGTGGTTCTCCGGTGAATATTCGATTCGTTGTGGAGACACGAGATCGCTGGGGTCCAAGGAC
    >chrI:47472-49416W:Y48G1C.12:4:651:394:467:S/1
    TACGCGACAAAAATGGGAAACCGAATCGCGTTTTTTGGCTTCAAGTACAAGTTATTCAGAATCATCAAAATGGG
    >chrI:47472-49416W:Y48G1C.12:4:651:394:467:A/2
    CCCATTTTGATGATTCTGAATAACTTGTACTTGAAGCCAAAAAACGCGATTCGGTTTCCCATTTTTGTCGCGTA
    >chrI:47472-49416W:Y48G1C.12:5:651:126:142:S/2
    AGTTGTAAAAGCGGATT
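To see how widespread the problem is, the read lengths in the output can be tallied. Below is a minimal Python sketch, assuming the output is a plain FASTA file in which every record starts with a `>` header line. The function name `read_length_histogram` is hypothetical and not part of the simulator, and the in-memory sample simply reuses two of the records shown above.

```python
from collections import Counter
from io import StringIO


def read_length_histogram(handle):
    """Count sequence lengths in a FASTA stream (multi-line records allowed)."""
    lengths = Counter()
    current = None  # length of the record being read; None before the first header
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths[current] += 1
            current = 0
        elif current is not None:
            current += len(line)
    # Close the last record in the stream.
    if current is not None:
        lengths[current] += 1
    return lengths


# Two records copied from the output above, used as a small in-memory example.
sample = StringIO(
    ">chrI:47472-49416W:Y48G1C.12:3:651:351:594:A/1\n"
    "CGTCGAAATTAGTGATATTTTTATCGGGAATCGGTCCGTGTGGTTCTCCGGTGAATATTCGATTCG"
    "TTGTGGAGACACGAGATCGCTGGGGTCCAAGGAC\n"
    ">chrI:47472-49416W:Y48G1C.12:5:651:126:142:S/2\n"
    "AGTTGTAAAAGCGGATT\n"
)

expected = 100  # READ_LENGTH from the parameter file
hist = read_length_histogram(sample)
short = sum(n for length, n in hist.items() if length < expected)
print(f"{short} of {sum(hist.values())} reads are shorter than {expected}")
```

For the real output, the same function can be called on an open file handle, for example `with open("reads.fasta") as fh: hist = read_length_histogram(fh)`, where `reads.fasta` is a placeholder file name.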
Can anyone help me fix this so that all reads have length 100?