6.5 years ago
mra8187
Dear all, I have done an RNA-seq project and have a question about Cufflinks and the related programs that work with it, such as cuffmerge.
After running the Cufflinks package we get output files such as cds.diff and gene_exp.diff, which contain these columns:
test_id, gene_id, gene, locus, sample_1, sample_2, status, value_1, value_2, log2(fold_change), test_stat, p_value, q_value, significant
XLOC_000302, XLOC_000302, -, 1:9748739-9749918, D, Q, OK, 1.35346, 25.6511, 4.2443, 4.96161, 5.00E-05, 0.000162672, yes
My question is: how can I find the sequences of these differentially expressed genes?
The sequences of these genes are really important to me.
Thanks all,
Mohamadreza
See the section on "Extracting transcript sequences" here.
In this command: gffread -w transcripts.fa -g /path/to/genome.fa transcripts.gtf
Is transcripts.fa my raw RNA-seq data?
Is transcripts.gtf a GTF file that I download from the internet, or the file that I get from Cufflinks?
And how do I get the output file?
Thanks for your answers.
-w filename is the output file, containing the spliced exons for each transcript.
transcripts.gtf is the file that has the XLOC id you are interested in. If you only want one XLOC id, you could make a subset file.
reference_sequence.fa is the reference sequence in FASTA format. Index the genome sequence before you proceed. Example code:
$ samtools faidx reference_sequence.fa
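If you want more than one XLOC id, the subset step could be scripted roughly like this. It is only a sketch under some assumptions: the file names gene_exp.diff, merged.gtf and genome.fa are placeholders for your own files, the .diff file is assumed to be tab-delimited with "significant" as column 14 and gene_id as column 2 (as in the header you posted), and the GTF is assumed to carry the same XLOC ids in its gene_id attribute. The helper names significant_ids and subset_gtf are made up for this example; only the gffread call is the command already quoted above.

```shell
#!/bin/sh
set -eu

# Placeholder file names (assumptions); point these at your own files.
DIFF=gene_exp.diff
GTF=merged.gtf
GENOME=genome.fa

# Print the unique gene_ids (column 2) of rows marked significant (column 14).
# Assumes a tab-delimited file with one header line.
significant_ids() {
  awk -F'\t' 'NR > 1 && $14 == "yes" { print $2 }' "$1" | sort -u
}

# Keep only the GTF lines whose gene_id attribute is listed in the ids file.
# Usage: subset_gtf ids.txt annotation.gtf
subset_gtf() {
  awk 'NR == FNR { keep[$1]; next }
       match($0, /gene_id "[^"]+"/) {
         # strip the leading "gene_id " plus quote and the closing quote
         id = substr($0, RSTART + 9, RLENGTH - 10)
         if (id in keep) print
       }' "$1" "$2"
}

# Run the pipeline only when the inputs and gffread are actually available.
if [ -f "$DIFF" ] && [ -f "$GTF" ] && [ -f "$GENOME" ] && command -v gffread >/dev/null 2>&1; then
  significant_ids "$DIFF" > sig_ids.txt
  subset_gtf sig_ids.txt "$GTF" > sig.gtf
  # Same gffread call as above, applied to the subset GTF.
  gffread -w sig_transcripts.fa -g "$GENOME" sig.gtf
fi
```

The resulting sig_transcripts.fa should then hold the spliced transcript sequences for the significant genes only. For a single gene you can skip the first step and put just that one XLOC id in sig_ids.txt.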
Try this on Linux: