                    7.5 years ago
        mra8187
        
    
Dear all, I have done an RNA-seq project and have some questions about Cufflinks and the related programs that work with it, such as Cuffmerge.
After running the Cufflinks package we get files such as cds.diff, gene_exp.diff and so on, which contain these columns:
test_id, gene_id, gene, locus, sample_1, sample_2, status, value_1, value_2, log2(fold_change), test_stat, p_value, q_value, significant
XLOC_000302, XLOC_000302, -, 1:9748739-9749918, D, Q, OK, 1.35346, 25.6511, 4.2443, 4.96161, 5.00E-05, 0.000162672, yes
My question is: how can I find the sequences of these differentially expressed genes?
The sequences of these genes are really important to me.
Thanks, all.
mohamadreza
See the section on "Extracting transcript sequences" here.
In this command: gffread -w transcripts.fa -g /path/to/genome.fa transcripts.gtf
Is transcripts.fa my raw RNA-seq data?
Is transcripts.gtf the GTF file that I downloaded from the internet, or the file that I got from Cufflinks?
And how can I get the output file?
Thanks for your answers.
-w filename is the output file with spliced exons for each transcript.
transcripts.gtf is the file that has the XLOC id you are interested in. If you only want one XLOC id you could make a subset file.
reference_sequence.fa is the reference sequence in FASTA format. Index the genome sequence before you proceed. Example code:
$ samtools faidx reference_sequence.fa
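To make the "subset file" step concrete, here is a rough sketch of one way to do it. It assumes the Cuffdiff table is tab-delimited with a header line (as Cuffdiff writes it, not the comma-separated paste shown in the question) and that the GTF is the one produced by Cuffmerge, with gene_id "XLOC_..." attributes. The function names and the file names in the usage comments are made up for illustration.

```shell
# Print the gene_id (column 2) of every row whose last column
# ("significant") is "yes". Assumes a tab-delimited file with a header.
significant_ids() {
    awk -F'\t' 'NR > 1 && $NF == "yes" { print $2 }' "$1" | sort -u
}

# Keep only the GTF records whose gene_id attribute is in the ID list.
# $1 = file with one gene ID per line, $2 = GTF file.
subset_gtf() {
    awk -F'\t' '
        NR == FNR { keep[$1]; next }          # first file: load the IDs
        match($9, /gene_id "[^"]+"/) {        # second file: GTF attributes
            id = substr($9, RSTART + 9, RLENGTH - 10)
            if (id in keep) print
        }' "$1" "$2"
}

# Possible usage (hypothetical file names):
#   significant_ids gene_exp.diff > sig_ids.txt
#   subset_gtf sig_ids.txt merged.gtf > sig.gtf
#   gffread -w sig_transcripts.fa -g reference_sequence.fa sig.gtf
```

The last commented command is the same gffread call discussed above, just pointed at the reduced GTF, so sig_transcripts.fa would hold only the spliced transcript sequences of the significant genes.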
try in linux: