Question: How to make Cufflinks produce the nucleotide sequence of a gene plus 5000 upstream nucleotides
3.9 years ago, Michel Edwar wrote:


I have a BAM file for one chromosome. I want to use Cufflinks, or something similar from the terminal, to get the actual genes on the chromosome, each complete with 5000 nucleotides upstream, so that I can check the promoter regions for TF binding sites. I already use the following, but it only gives me the transcripts of the genes.

cufflinks  --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 file1.bam 

gtf2bed --do-not-sort < transcripts.gtf > transcripts.bed

bedtools getfasta  -fi chr.fa  -bed transcripts.bed -fo results.fasta 



Tags: bam, promoter, cufflinks, upstream
3.9 years ago, Manvendra Singh (Berlin, Germany) wrote:

I would not run Cufflinks for this; I would do the following.

Modify your GTF file as follows (this is a simple example; you can include other columns as well):

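# Columns are read as GTF: $1 = chrom, $4 = start, $5 = end, $7 = strand
awk -F "\t" -v OFS="\t" '!/^#/ {
    if ($7 == "+") { s = $4 - 5000; e = $5 }   # + strand: upstream lies before the start
    else           { s = $4; e = $5 + 5000 }   # - strand: upstream lies after the end
    if (s < 1) s = 1                           # clamp at the chromosome start
    print $1, s, e
}' your_gtf_file > your_modified_bedfile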


Now either convert your modified file back into GTF (just replace the 4th and 5th columns of the GTF with the 2nd and 3rd columns from this file) and run featureCounts, or run bamcov with this BED file as input.
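The column replacement described above can also be done in one pass over the GTF itself. This is only a minimal sketch: the function name `extend_gtf_upstream` and the file names are hypothetical, it assumes a tab-separated GTF with the strand in column 7, and it does not clip the extended end to the chromosome length.

```shell
# Hypothetical helper: extend every GTF feature upstream by a flank (default 5000 bp).
# Assumes GTF columns: $4 = start, $5 = end (1-based, inclusive), $7 = strand.
extend_gtf_upstream() {
    awk -F '\t' -v OFS='\t' -v flank="${2:-5000}" '
        /^#/      { print; next }                            # keep header lines unchanged
        $7 == "+" { $4 = $4 - flank; if ($4 < 1) $4 = 1 }    # + strand: move the start left
        $7 == "-" { $5 = $5 + flank }                        # - strand: move the end right
                  { print }                                  # all other columns pass through
    ' "$1"
}

# Example call (file names are placeholders):
# extend_gtf_upstream transcripts.gtf 5000 > transcripts_upstream.gtf
```

Features with an unknown strand (".") are passed through unchanged, since "upstream" is not defined for them.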

After this, you need to normalize the data by the total number of mappable reads.
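As a rough illustration of that normalization, the sketch below scales raw counts to reads per million. The per-million scaling, the function name `normalize_rpm`, and the simplified two-column input (region, raw count) are all assumptions; real featureCounts output has more columns and would need to be cut down first.

```shell
# Hypothetical helper: scale raw counts to reads per million mapped reads.
# $1 = tab-separated file "region<TAB>raw_count" (assumed layout)
# $2 = total number of mapped reads in the BAM file
normalize_rpm() {
    awk -F '\t' -v total="$2" '
        { printf "%s\t%.3f\n", $1, $2 / total * 1e6 }   # count / total * 1,000,000
    ' "$1"
}

# Example call (file names are placeholders; the total could come from
# "samtools view -c -F 4 file1.bam", which counts reads that are not unmapped):
# normalize_rpm counts.tsv 2000000 > counts_rpm.tsv
```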


