How to make Cufflinks produce the nucleotide sequence of the gene + 5000 upstream nucleotides
6.7 years ago
Michel Edwar ▴ 70

Hello,

I have a BAM file for one chromosome. I want to use Cufflinks or something similar from the terminal to get the actual genes on the chromosome, complete with 5000 nucleotides upstream, so I can check the promoter regions for TF binding sites. I already use the following, but it only gives me the transcripts of the genes.

cufflinks  --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 file1.bam

gtf2bed --do-not-sort < transcripts.gtf > transcripts.bed

bedtools getfasta  -fi chr.fa  -bed transcripts.bed -fo results.fasta

Regards

bam cufflinks upstream promoter • 1.8k views
6.7 years ago
Manvendra Singh ★ 2.2k

I would not run Cufflinks for this; I would do the following.

Modify your GTF file as follows (this is a simple example; you can include other columns as well):

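awk -F'\t' 'BEGIN { OFS = "\t" }
{
  # GTF columns: $1 = chromosome, $4 = start, $5 = end, $7 = strand
  s = $4; e = $5
  if ($7 == "+") { s -= 5000 } else { e += 5000 }   # extend on the upstream side only
  if (s < 1) s = 1                                  # do not run past the chromosome start
  print $1, s, e
}' your_gtf_file > your_modified_bedfile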
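
To make the strand logic concrete, here is a minimal sketch on a two-record toy GTF. The file names (toy.gtf, toy_upstream.bed), the records, and the tx1/tx2 names are made up for illustration. It assumes standard GTF columns (1 = chromosome, 4 = start, 5 = end, 7 = strand), writes 0-based BED starts, keeps the strand in column 6, and clamps negative starts to 0.

```shell
# Toy GTF: one + strand and one - strand transcript (hypothetical records).
printf 'chr1\tCufflinks\ttranscript\t6001\t7000\t.\t+\t.\tgene_id "g1";\n'  > toy.gtf
printf 'chr1\tCufflinks\ttranscript\t3001\t4000\t.\t-\t.\tgene_id "g2";\n' >> toy.gtf

upstream=5000

awk -F'\t' -v up="$upstream" 'BEGIN { OFS = "\t" }
$3 == "transcript" {
    start = $4 - 1                 # GTF is 1-based, BED starts are 0-based
    end   = $5
    # Upstream is before the start on +, after the end on -.
    if ($7 == "+") { start -= up } else { end += up }
    if (start < 0) start = 0       # stay inside the chromosome
    print $1, start, end, "tx" NR, 0, $7
}' toy.gtf > toy_upstream.bed

cat toy_upstream.bed
```

A six-column BED like this can be given to bedtools getfasta as in the question; its -s option reverse-complements features on the minus strand, so the upstream part comes first in each sequence. The end coordinate is not clamped here, which would need the chromosome length.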


Now either convert your modified file back into GTF (just replace the 4th and 5th columns of the GTF with the 2nd and 3rd columns from this file) and run featureCounts, or run bamcov with the converted BED file as input.
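
If the goal is a GTF for featureCounts, one possible shortcut is to extend columns 4 and 5 directly in the GTF instead of going through BED. A rough sketch, assuming a tab-separated GTF; the file names toy_in.gtf and toy_extended.gtf and the records are hypothetical. Note that it extends every record independently, so for multi-exon transcripts you may prefer to apply it to transcript-level records only.

```shell
# Toy GTF with three records (hypothetical data).
printf 'chr1\tCufflinks\texon\t6001\t7000\t.\t+\t.\tgene_id "g1";\n'  > toy_in.gtf
printf 'chr1\tCufflinks\texon\t3001\t4000\t.\t-\t.\tgene_id "g2";\n' >> toy_in.gtf
printf 'chr1\tCufflinks\texon\t101\t900\t.\t+\t.\tgene_id "g3";\n'   >> toy_in.gtf

awk -F'\t' -v up=5000 'BEGIN { OFS = "\t" }
/^#/ { print; next }               # pass header/comment lines through unchanged
{
    # Extend on the upstream side: lower the start on +, raise the end on -.
    if ($7 == "+") { $4 = $4 - up } else { $5 = $5 + up }
    if ($4 < 1) $4 = 1             # GTF coordinates are 1-based
    print                          # all nine GTF columns are kept
}' toy_in.gtf > toy_extended.gtf

cut -f1,4,5,7 toy_extended.gtf
```

featureCounts counts exon features by default (its -t option selects the feature type), so the feature type in column 3 should match whatever you ask it to count.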

After this you would need to normalize the data by the total number of mappable reads.
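
One common way to do that normalization is counts per million (CPM): the raw count divided by the total number of mapped reads, times 10^6. This is only a sketch of that idea, not necessarily the scheme intended above; the counts, the file names, and the total of 2,000,000 reads are invented for illustration.

```shell
# Hypothetical per-region read counts: region name, raw count.
printf 'tx1\t150\ntx2\t50\n' > toy_counts.tsv

# Assumed total number of mapped reads. In practice this could come from
# something like: samtools view -c -F 4 file1.bam
# (that counts alignment records without the "unmapped" flag).
total_mapped=2000000

awk -F'\t' -v total="$total_mapped" '{
    cpm = $2 / total * 1e6         # counts per million mapped reads
    printf "%s\t%d\t%.2f\n", $1, $2, cpm
}' toy_counts.tsv > toy_cpm.tsv

cat toy_cpm.tsv
```

Because the extended regions differ in length, you may also want to divide by region length if you compare regions with each other rather than the same region across samples.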

hth