Genome index problem
2.1 years ago
DanK ▴ 10

Hi,

I have the E. coli genome and a dataset containing all the probed TSSs for this genome.

I want to extract 400 nt downstream and 100 nt upstream of each TSS. How should I handle out-of-bounds genome positions, for example if the first TSS is at position +50?

Should I take the sequence from the end of the genome in this case?

Thank you in advance!

E.coli positions R Genome • 899 views

For what purpose do you want these sequences? If you want to examine 100 nt in front of each TSS, but as you mention, there is a TSS at a genomic coordinate of 50, then yes, for that TSS you can only take 50 nt. What difference does it make if some of your sequences are smaller than your chosen size because they may be close to the chromosome end?

On the other hand, E. coli has a circular genome, so why not take all the sequence you need for a TSS near the genomic start coordinate by wrapping around the origin (from genome length - 50 to +50)?


Thank you for the reply!

I want to make an alignment and generate motifs and WebLogo plots. If my sequences do not all have the same length, that will be a problem for WebLogo, right?

So I think the biologically correct approach is to take the remaining nucleotides from the end or the start of the genome, respectively.
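
For anyone landing here later, the wrap-around idea can be sketched with modular arithmetic. This is a minimal sketch in Python rather than R, and it rests on assumptions that are not stated in the thread: TSS coordinates are 1-based, the TSS itself counts as the first "downstream" base, and for minus-strand TSSs the window is reverse-complemented. The function name `circular_window` and its parameters are hypothetical, chosen only for illustration.

```python
# Complement table; characters not listed (e.g. N) are left unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def circular_window(genome, tss, upstream=100, downstream=400, strand="+"):
    """Return a fixed-length window around a TSS on a circular genome.

    Assumptions (not from the thread): `tss` is 1-based, the TSS base is
    the first of the `downstream` bases, and minus-strand windows are
    returned reverse-complemented so they read 5'->3' like the others.
    """
    n = len(genome)
    length = upstream + downstream
    if length > n:
        raise ValueError("window is longer than the genome")

    i = tss - 1  # convert to a 0-based index
    if strand == "+":
        # Walk forward from (TSS - upstream); % n wraps past either end.
        indices = [(i - upstream + k) % n for k in range(length)]
        return "".join(genome[j] for j in indices)
    if strand == "-":
        # Upstream lies at higher coordinates: walk backward and complement.
        indices = [(i + upstream - k) % n for k in range(length)]
        return "".join(genome[j] for j in indices).translate(COMPLEMENT)
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_genome = "ACGTACGTAA"  # toy sequence, not real E. coli data
    print(circular_window(toy_genome, tss=2, upstream=3, downstream=4))
```

Because every window has exactly `upstream + downstream` bases, the resulting sequences have equal length, which is what WebLogo needs. For a linear chromosome this wrapping would not be appropriate.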


Hi DanK, why did you delete the post?


Hi, because I found the answer and because nobody had replied.


Please add the answer you found as an answer here and accept it - that's the way professional/scientific forums work. Someone else may run into the same problem you had and your solution may be helpful to them. Imagine if everyone on a site like StackOverflow decided to delete their question because they found a solution - a lot of knowledge would not be available to the larger community and many would be at a loss.
