Question

How can we determine the positions of insertion sequences (IS) or transposons (Tn), and how can we plot/map them on a graph?

3.7 years ago

rickyalfaray ▴ 20

Hi everyone,

I am confused about how to plot/map the many insertion sequences (IS) and transposons (Tn), i.e., mobile genetic elements (MGE), that we have found. I have already detected many ISs in around 2000 bacterial WGS of H. pylori using the tool ISEScan, and I know the position of each one within its genome. After running it, I got many files (.fna, .faa, .gff, .csv) containing information on the ISs and Tn for each sample strain. I would like to make a graph (any graph is OK) that can give us information about:

1. where they are located relative to a reference or representative genome, so I can know which genes are most often affected by these ISs or Tn.
2. which genes are affected because they are near each IS or Tn in each genome.
3. how often (frequency) they are found at that position/gene (e.g., IS605 found near gene A in 1500 out of 2000 WGS; if the IS type cannot be specified, that is also OK, so it would just be: IS/Tn found near gene A in 1500 out of 2000 WGS, while the other 500 WGS don't have any IS/Tn near it).

My plan is something like the attached picture (which I got from this paper), except that I want position on the x-axis and p or frequency on the y-axis, but I don't really understand how to make it.

Number 1 is already done. First, I BLASTed all the IS and Tn sequences against a reference genome using NCBI BLAST or similar tools, and then I retrieved the position of every IS and Tn relative to the reference genome. Then I made the graph using ggplot in R. In addition, to find out which genes were affected by an IS or Tn, I listed every neighboring gene within 1000 bp upstream and downstream of each IS or Tn. However, I'm afraid my method is not 100% correct, so I need suggestions.
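To make the neighbor-gene step concrete, here is a minimal Python sketch of what I mean by "within 1000 bp before and after" an element. The `Feature` type, the `genes_near` function, and all coordinates are hypothetical and made up for illustration; they are not ISEScan's or BLAST's output format, and the real coordinates would have to be parsed from the .gff files and the reference annotation.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def genes_near(is_element, genes, flank=1000):
    """Return names of genes overlapping the IS/Tn extended by `flank` bp on each side."""
    window_start = max(1, is_element.start - flank)
    window_end = is_element.end + flank
    # Two intervals overlap when each one starts before the other ends.
    return [g.name for g in genes if g.start <= window_end and g.end >= window_start]


# Made-up coordinates for illustration only.
genes = [
    Feature("geneA", 100, 900),
    Feature("geneB", 2500, 3400),
    Feature("geneC", 9000, 9800),
]
is605 = Feature("IS605", 1500, 3000)  # hypothetical placement

print(genes_near(is605, genes))  # ['geneA', 'geneB']
```

This treats the genome as linear, so an element close to the origin of a circular chromosome would miss neighbors on the other side of the coordinate wrap-around.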

I still don't know how to do aims number 2 and 3. It would be difficult to do them one by one, since I have more than 2000 genomes.
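For aim 3, the counting I have in mind would look roughly like the Python sketch below. It assumes that I already have, for each genome, a list of the genes near an IS/Tn, and that the same gene carries the same name in every genome (e.g., after ortholog matching), which is exactly the part I don't know how to do at scale. The function name `gene_hit_frequency` and the strain and gene names are hypothetical.

```python
from collections import Counter


def gene_hit_frequency(hits_per_genome):
    """For each gene, return (number of genomes, fraction of genomes) with an IS/Tn near it."""
    counts = Counter()
    for genes in hits_per_genome.values():
        counts.update(set(genes))  # set(): count each gene at most once per genome
    total = len(hits_per_genome)
    return {gene: (n, n / total) for gene, n in counts.items()}


# Toy data (hypothetical strain and gene names).
hits_per_genome = {
    "strain1": ["geneA", "geneB"],
    "strain2": ["geneA"],
    "strain3": [],                  # no IS/Tn near any gene
    "strain4": ["geneA", "geneA"],  # two elements near the same gene
}

freq = gene_hit_frequency(hits_per_genome)
print(freq["geneA"])  # (3, 0.75)
```

The resulting per-gene fractions could then be plotted against each gene's position in the reference genome, which would give the position-versus-frequency graph described above.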

Could anyone please give me suggestions?

What tool(s) or step(s) would be suitable for this purpose?

Thank you.

Sincerely,

Ricky

sequence mapping genome graph plot • 616 views
