Hi everyone!
I have a question and would really appreciate your help.
I have a list of genomic segments and would like to find out the names of the genes they contain. The dataset, which is a text file, has the following format:
Segment Count First End
0 258 1_1960674 1_2013259
1 85 1_3057480 1_3257840
2 185 1_3340901 1_3783903
3 215 1_209363247 1_209995470
In this dataset, the first column is the segment number, the second column is the number of SNPs in the segment, and the third and fourth columns are the SNPs with the smallest and largest position in each segment. Note that the values in the third and fourth columns combine the chromosome number and the position, joined by an underscore (for example, 1_1960674 is position 1960674 on chromosome 1). Now, how can I find out the gene names?
Thank you very much
Hi ari_sh70, I've changed the 'tag' of your post to Question, as the 'Tutorial' one is reserved for tutorials where people explain or showcase the use of a tool or pipeline.
dear ari_sh70
there's a few shortcomings to your question:
Ok sure, thanks for the tips
My apologies, I just realized you do show a few "lines" within that one line, but it's really hard to read...
You are totally right! This is my first time writing a post here. Thanks again for telling me how to do that!
what exactly do you mean by that? are you looking to find the genes that are located in that region?
Yes, that is exactly what I am looking for...
BEDTools (more specifically bedtools intersect) will be your friend.
With a little reformatting of those columns, and given that you have a gff (or bed) file of the gene annotation, this should be pretty straightforward.
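To give an idea of what the reformatting could look like, here is a minimal Python sketch that turns the table from the question into BED-style rows (chromosome, start, end, name). The function name segments_to_bed and the segment_N labels are made up for illustration, and I am assuming the SNP positions are 1-based and that both ends of a segment sit on the same chromosome.

```python
def segments_to_bed(lines):
    """Convert 'Segment Count First End' rows into BED-style tuples.

    Assumptions (not confirmed by the original post):
    - positions are 1-based, so start is shifted by -1 for BED (0-based, half-open)
    - 'First' and 'End' of a segment are on the same chromosome
    """
    bed_rows = []
    for line in lines:
        fields = line.split()
        # Skip empty lines and the header line
        if not fields or fields[0] == "Segment":
            continue
        segment, _count, first, last = fields
        # Split on the last underscore only, in case a chromosome name contains one
        chrom_first, pos_first = first.rsplit("_", 1)
        chrom_last, pos_last = last.rsplit("_", 1)
        if chrom_first != chrom_last:
            raise ValueError(f"segment {segment} spans two chromosomes")
        start = int(pos_first) - 1  # 1-based inclusive -> 0-based start
        end = int(pos_last)         # BED end is exclusive, so it stays as-is
        bed_rows.append((chrom_first, start, end, f"segment_{segment}"))
    return bed_rows


# The example rows from the question
example = [
    "Segment Count First End",
    "0 258 1_1960674 1_2013259",
    "1 85 1_3057480 1_3257840",
    "2 185 1_3340901 1_3783903",
    "3 215 1_209363247 1_209995470",
]

rows = segments_to_bed(example)
for chrom, start, end, name in rows:
    # BED files are tab-separated
    print(f"{chrom}\t{start}\t{end}\t{name}")
```

If you save that output as, say, segments.bed (hypothetical file name), something along the lines of bedtools intersect -a segments.bed -b genes.gff -wa -wb should then report the annotation features overlapping each segment. Do check that the chromosome names match between the two files (e.g. "1" versus "chr1"), otherwise nothing will overlap.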
Thank you very much for your answer. Could you tell me a bit more about it, please?
Sure, but can you first confirm that you have an annotation of the genes in gff or bed format?
Thank you. Firstly, I would like to apologize for my delayed response; I am a new user and the system did not let me reply any more yesterday. I want to annotate the SNPs based on their location, as given in the data in my post.
No worries.
see my answer below