Question: How to make Cufflinks produce the nucleotide sequence of the gene + 5000 upstream nucleotides
Michel Edwar (Sweden) wrote, 3.7 years ago:

Hello,

I have a BAM file of a chromosome. I want to use Cufflinks or something similar from the terminal to get the actual genes on the chromosome, complete with the 5000 nucleotides upstream, so I can check the promoter regions for TF binding sites. I already use the following, but it only gives me the transcripts of the genes.

cufflinks  --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 file1.bam 

gtf2bed --do-not-sort < transcripts.gtf > transcripts.bed

bedtools getfasta  -fi chr.fa  -bed transcripts.bed -fo results.fasta 

Regards

 

Tags: bam, promoter, cufflinks, upstream
Manvendra Singh (Berlin, Germany) wrote, 3.7 years ago:

I would not run Cufflinks for this; I would do the following.

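Modify your GTF file as follows (this is a simple example; you can include other columns as well). The column numbers below are GTF columns (start = 4, end = 5, strand = 7), so run it on the GTF file rather than on the converted BED file:

awk -F'\t' '{ if ($7=="+") 
           print $1, ($4>5000 ? $4-5000 : 1), $5;   # + strand: upstream lies before the start; clamp at 1
       else 
           print $1, $4, $5+5000;                   # - strand: upstream lies after the end
}' OFS="\t" your_gtf_file > your_modified_bedfile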
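For the original goal (the gene sequence plus 5000 nt upstream), the same strand-aware extension can be sketched as a small shell function that writes BED6, which `bedtools getfasta` can then read. This is only a sketch under a few assumptions: `extend_upstream` is a hypothetical helper name, only `transcript` rows of the Cufflinks GTF are used, unstranded (`.`) transcripts are treated like `+`, and the output file name in the usage comment is made up.

```shell
#!/bin/sh
# extend_upstream GTF [FLANK]: hypothetical helper, not part of Cufflinks or bedtools.
# Reads a GTF, keeps "transcript" rows, and prints BED6 intervals covering
# the transcript plus FLANK bases upstream (strand-aware).
extend_upstream() {
    awk -F'\t' -v OFS='\t' -v flank="${2:-5000}" '
        /^#/ || $3 != "transcript" { next }      # skip comments and exon rows
        {
            start = $4 - 1                       # GTF is 1-based, BED is 0-based
            end   = $5
            if ($7 == "-") end += flank          # minus strand: upstream lies past the end
            else           start -= flank        # "+" (and, by assumption, ".") extend before the start
            if (start < 0) start = 0             # do not run off the chromosome start
            id = "transcript"
            if (match($9, /transcript_id "[^"]+"/))
                id = substr($9, RSTART + 15, RLENGTH - 16)
            print $1, start, end, id, 0, $7
        }' "$1"
}

# Usage (input file names taken from the question, output name is an assumption):
#   extend_upstream transcripts.gtf 5000 > transcripts_plus_upstream.bed
#   bedtools getfasta -s -name -fi chr.fa -bed transcripts_plus_upstream.bed -fo results.fasta
```

The `-s` option makes `bedtools getfasta` reverse-complement minus-strand intervals, so the upstream region ends up at the 5' end of every sequence. Minus-strand intervals are not clipped at the chromosome end here; `bedtools slop -s -l 5000 -r 0` with a genome file is an alternative that handles both ends.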

 

Now either convert your modified file back into GTF (just replace the 4th and 5th columns of the GTF with the 2nd and 3rd columns from this file) and run featureCounts, or run bamcov, providing the modified BED file as input.

After this, you would need to normalize the data by the total number of mappable reads.
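One common way to do this normalization is counts per million mapped reads (CPM); the answer does not name a specific scheme, so treat the scaling factor as an assumption. A minimal sketch, where `cpm_normalize` is a hypothetical helper and the counts file is assumed to be a two-column, tab-separated table (region name, raw count):

```shell
#!/bin/sh
# cpm_normalize COUNTS TOTAL: hypothetical helper.
# COUNTS: tab-separated file with region name and raw read count.
# TOTAL:  number of mapped reads in the BAM file.
cpm_normalize() {
    awk -F'\t' -v total="$2" '
        total <= 0 { exit 1 }                                # refuse a zero or negative total
        { printf "%s\t%.3f\n", $1, $2 / total * 1000000 }    # count per million mapped reads
    ' "$1"
}

# Usage (counts.tsv is an assumed file name):
#   total=$(samtools view -c -F 260 file1.bam)   # primary, mapped alignments only
#   cpm_normalize counts.tsv "$total" > counts.cpm.tsv
```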

 

Hope this helps.
