Generating simulated RNA-Seq data with Flux Simulator
8.9 years ago

I am trying to generate simulated RNA-seq data with the Flux Simulator tool.

I passed the following parameter file to the flux-simulator shell script:

## File locations

REF_FILE_NAME   cElegansAnnotation.gtf
GEN_DIR         chromFa

## Library preparation
# Expression
NB_MOLECULES    5000000
TSS_MEAN    50
POLYA_SCALE     100
POLYA_SHAPE     1.5

# Fragmentation
FRAG_SUBSTRATE  RNA
FRAG_METHOD UR
FRAG_UR_ETA     350

# Reverse Transcription
RTRANSCRIPTION  YES
RT_MOTIF    default

# Amplification 
PCR_DISTRIBUTION default
GC_MEAN      NaN
PCR_PROBABILITY  0.05

# Size Filtering
FILTERING   NO


## Sequencing
READ_NUMBER 1000000
READ_LENGTH 100
PAIRED_END  YES

# create a fastq file
FASTA           YES

According to these parameters, Flux Simulator should produce reads of length 100. However, when I look at the output, my FASTA file also contains reads shorter than 100:

>chrI:47472-49416W:Y48G1C.12:3:651:351:594:A/1
CGTCGAAATTAGTGATATTTTTATCGGGAATCGGTCCGTGTGGTTCTCCGGTGAATATTCGATTCGTTGTGGAGACACGAGATCGCTGGGGTCCAAGGAC
>chrI:47472-49416W:Y48G1C.12:4:651:394:467:S/1
TACGCGACAAAAATGGGAAACCGAATCGCGTTTTTTGGCTTCAAGTACAAGTTATTCAGAATCATCAAAATGGG
>chrI:47472-49416W:Y48G1C.12:4:651:394:467:A/2
CCCATTTTGATGATTCTGAATAACTTGTACTTGAAGCCAAAAAACGCGATTCGGTTTCCCATTTTTGTCGCGTA
>chrI:47472-49416W:Y48G1C.12:5:651:126:142:S/2
AGTTGTAAAAGCGGATT
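
One way to look into this is to compare each read's length with the coordinates embedded in its ID. The sketch below assumes that the two numbers before the `S`/`A` flag are the fragment's start and end positions (so `394:467` would be a 74 nt fragment, matching the 74 nt reads above). That reading of the ID is inferred from these four records only, and the function names are made up for illustration. If the assumption holds, every read should be as long as `min(READ_LENGTH, fragment length)`, which would suggest that short reads come from fragments shorter than the requested read length.

```python
import re

# ASSUMED ID layout, inferred from the example records in this thread:
#   ...:<fragment_start>:<fragment_end>:<S|A>/<mate>
HEADER_RE = re.compile(r":(\d+):(\d+):([SA])/([12])$")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fragment_span(header):
    """Return the fragment length implied by the ID (end - start + 1)."""
    match = HEADER_RE.search(header)
    if match is None:
        raise ValueError("unexpected read ID: " + header)
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


def check_reads(text, read_length=100):
    """Return (header, read_len, fragment_len, consistent) for every read.

    'consistent' is True when the read is exactly as long as
    min(read_length, fragment_len).
    """
    rows = []
    for header, seq in parse_fasta(text):
        span = fragment_span(header)
        rows.append((header, len(seq), span, len(seq) == min(read_length, span)))
    return rows
```

To use it, read the simulator's FASTA output into a string, pass it to `check_reads`, and list the rows whose last field is `False`. If there are none, the short reads are explained by short fragments rather than by a problem with `READ_LENGTH`.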

Can anyone help me fix this?


Have you compared your settings to one of the examples? Presumably the RT_MIN setting would prevent what you're seeing.


Thanks for the reply. Yes, I prepared my parameter file based on the link you provided. I did not set RT_MIN in my file because its default value appears to be 500 according to that link. Isn't that right?


It's unclear whether that's the default or the value they specified. In either case, give it a try.


Hi stack,

I expect you have resolved this by now, but I am just getting started. I am using Flux Simulator to simulate Illumina paired-end RNA-seq data. How can I separate the left and right mates? Which reads are sense and which are antisense: A/1 or S/2? I am also getting some reads shorter than 35 nt. Can you help?
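
For splitting the mates, a minimal sketch is shown below. It assumes the read IDs end in `:<S|A>/<1|2>` as in the records quoted in the question, that `/1` and `/2` mark the two mates of a pair, and that both mates share everything before the strand flag. In those quoted records both `A/1` and `S/1` occur, so the `S`/`A` flag appears to be independent of the mate number; whether it means sense/antisense is an assumption to check against the Flux Simulator documentation. `split_mates` and `min_length` are illustrative names, not part of the tool.

```python
import re

# ASSUMED ID suffix, based on the records quoted in this thread: ":<S|A>/<1|2>"
MATE_RE = re.compile(r":([SA])/([12])$")


def split_mates(records, min_length=0):
    """Split (header, sequence) records into mate-1 and mate-2 lists.

    A pair is kept only if both mates are present and both sequences are
    at least min_length long. Each kept entry is (header, sequence, strand).
    """
    pairs = {}
    for header, seq in records:
        match = MATE_RE.search(header)
        if match is None:
            raise ValueError("unexpected read ID: " + header)
        strand, mate = match.group(1), match.group(2)
        # Everything before the strand flag identifies the fragment.
        key = header[:match.start()]
        pairs.setdefault(key, {})[mate] = (header, seq, strand)

    left, right = [], []
    for mates in pairs.values():
        if "1" not in mates or "2" not in mates:
            continue  # drop reads whose partner is missing
        if len(mates["1"][1]) < min_length or len(mates["2"][1]) < min_length:
            continue  # drop the whole pair if either mate is too short
        left.append(mates["1"])
        right.append(mates["2"])
    return left, right
```

Calling `split_mates(records, min_length=35)` would return two equally long lists in matching order, which can then be written to separate left and right files. Dropping whole pairs rather than single reads keeps the two files in sync.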
