Question: Generating simulated RNA-Seq data using flux simulation
stackunderflow (Turkey) wrote 4.6 years ago:

I have tried to use the flux simulation tool to generate simulated RNA-seq data.

I passed the following parameter file to the flux-simulation shell script:

## File locations

REF_FILE_NAME   cElegansAnnotation.gtf
GEN_DIR         chromFa

## Library preparation
# Expression
NB_MOLECULES    5000000
TSS_MEAN    50
POLYA_SCALE     100
POLYA_SHAPE     1.5

# Fragmentation
FRAG_SUBSTRATE  RNA
FRAG_METHOD UR
FRAG_UR_ETA     350

# Reverse Transcription
RTRANSCRIPTION  YES
RT_MOTIF    default

# Amplification 
PCR_DISTRIBUTION default
GC_MEAN      NaN
PCR_PROBABILITY  0.05

# Size Filtering
FILTERING   NO


## Sequencing
READ_NUMBER 1000000
READ_LENGTH 100
PAIRED_END  YES

# create a fastq file
FASTA           YES

According to these parameters, flux-simulation should produce reads of length 100. However, when I look at the output, my FASTA file also contains reads shorter than 100 nt:

>chrI:47472-49416W:Y48G1C.12:3:651:351:594:A/1
CGTCGAAATTAGTGATATTTTTATCGGGAATCGGTCCGTGTGGTTCTCCGGTGAATATTCGATTCGTTGTGGAGACACGAGATCGCTGGGGTCCAAGGAC
>chrI:47472-49416W:Y48G1C.12:4:651:394:467:S/1
TACGCGACAAAAATGGGAAACCGAATCGCGTTTTTTGGCTTCAAGTACAAGTTATTCAGAATCATCAAAATGGG
>chrI:47472-49416W:Y48G1C.12:4:651:394:467:A/2
CCCATTTTGATGATTCTGAATAACTTGTACTTGAAGCCAAAAAACGCGATTCGGTTTCCCATTTTTGTCGCGTA
>chrI:47472-49416W:Y48G1C.12:5:651:126:142:S/2
AGTTGTAAAAGCGGATT

Can anyone help me fix this?
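
To see how many reads are affected, the lengths can be tallied directly from the FASTA output. Below is a minimal Python sketch; the function names are illustrative and not part of the simulator. It also treats the two numbers before the S/A field in each header as fragment start and end, which is an assumption based only on the records shown here. For the short reads above those numbers span exactly the read length (394 to 467 is 74 positions, 126 to 142 is 17), which would suggest the fragments themselves are shorter than READ_LENGTH.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fragment_span(header):
    """Span between the two numbers before the S/A field.

    ASSUMPTION: these are fragment start/end coordinates; this is inferred
    from the example records in this thread, not from the documentation.
    """
    fields = header.split(":")
    return int(fields[-2]) - int(fields[-3]) + 1


def short_reads(lines, expected_length):
    """Return the records whose sequence is shorter than expected_length."""
    return [(h, s) for h, s in read_fasta(lines) if len(s) < expected_length]


# Two of the records posted above
example = """\
>chrI:47472-49416W:Y48G1C.12:4:651:394:467:S/1
TACGCGACAAAAATGGGAAACCGAATCGCGTTTTTTGGCTTCAAGTACAAGTTATTCAGAATCATCAAAATGGG
>chrI:47472-49416W:Y48G1C.12:5:651:126:142:S/2
AGTTGTAAAAGCGGATT
""".splitlines()

for header, seq in short_reads(example, expected_length=100):
    # Print read length next to the assumed fragment span for comparison
    print(header, len(seq), fragment_span(header))
```

If the read length and the span agree for every short read in the real output, the short reads most likely come from short fragments rather than from the sequencing settings.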

sequencing rna-seq

Have you compared your settings to one of the examples? Presumably the RT_MIN setting would prevent what you're seeing.

written 4.6 years ago by Devon Ryan

Thanks for the reply. Yes, I prepared my parameter file based on the link you provided. I did not set RT_MIN in my file because its default value appears to be 500 according to that link. Isn't that right?

written 4.6 years ago by stackunderflow

It's unclear whether that's the default or the value they specified. In either case, give it a try.

written 4.6 years ago by Devon Ryan
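
One way to try this without relying on defaults is to check whether the parameter file already sets RT_MIN and add it explicitly if not. The Python sketch below assumes the file uses the simple "KEY value" layout shown in the question, with "#" starting a comment; the helper names are hypothetical, and the value 500 is just the number mentioned in this thread. Whether RT_MIN actually removes the short reads is only the presumption stated above.

```python
def parse_parameters(text):
    """Parse 'KEY value' lines, skipping blank lines and '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on any run of spaces or tabs
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


def with_explicit(text, key, value):
    """Return text with 'key<TAB>value' appended unless key is already set."""
    if key in parse_parameters(text):
        return text
    return text.rstrip("\n") + f"\n{key}\t{value}\n"


# Excerpt of the parameter file from the question
par_text = """\
# Reverse Transcription
RTRANSCRIPTION  YES
RT_MOTIF    default

## Sequencing
READ_NUMBER 1000000
READ_LENGTH 100
PAIRED_END  YES
"""

updated = with_explicit(par_text, "RT_MIN", 500)  # 500: value discussed in this thread
print(updated)
```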

Hi stack,

I hope you have resolved this by now, but I have only just started. I am also using flux to simulate Illumina paired-end RNA-seq data. Can you tell me how to separate the left and right ends? And which one is sense and which is antisense: A/1 or S/2? I am also getting some reads shorter than 35 nt. Do you know how to solve this?

written 18 months ago by k.kathirvel93210
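
On separating the two ends: in the records posted in the question, the header ends in a letter, a slash, and a number (A/1, S/1, A/2, S/2), and both letters occur with both numbers, so the letter and the number look like independent fields. A minimal Python sketch that splits records on the number after the final "/" follows. It assumes /1 and /2 mark the two mates and that S and A stand for sense and antisense, as the question above supposes; neither is confirmed in this thread, and the function names are illustrative.

```python
def split_mates(records):
    """Group (header, sequence) records by the mate number after the last '/'."""
    mates = {"1": [], "2": []}
    for header, seq in records:
        _, _, mate = header.rpartition("/")
        if mate in mates:  # records without a /1 or /2 suffix are skipped
            mates[mate].append((header, seq))
    return mates


def strand_flag(header):
    """Return the letter just before the '/', e.g. 'S' or 'A'.

    ASSUMPTION: 'S' = sense, 'A' = antisense, as supposed in this thread.
    """
    return header.rpartition("/")[0].rsplit(":", 1)[-1]


# Records from the question
records = [
    ("chrI:47472-49416W:Y48G1C.12:4:651:394:467:S/1",
     "TACGCGACAAAAATGGGAAACCGAATCGCGTTTTTTGGCTTCAAGTACAAGTTATTCAGAATCATCAAAATGGG"),
    ("chrI:47472-49416W:Y48G1C.12:4:651:394:467:A/2",
     "CCCATTTTGATGATTCTGAATAACTTGTACTTGAAGCCAAAAAACGCGATTCGGTTTCCCATTTTTGTCGCGTA"),
    ("chrI:47472-49416W:Y48G1C.12:5:651:126:142:S/2",
     "AGTTGTAAAAGCGGATT"),
]

mates = split_mates(records)
for mate, group in mates.items():
    for header, seq in group:
        print(mate, strand_flag(header), header)
```

Writing each group to its own file gives the usual pair of left and right read files. Note that the third record has no /1 partner among the lines shown, so pairing should be checked before passing the files to an aligner.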