I'm new to sequence analysis, so I might have posted a redundant question. If so, please refer me to the correct place. From my initial search, I couldn't find answers to my question.
Is there a computational way (I prefer to use R, but if there's a more useful tool, please feel free to suggest it!) to find the corresponding nucleotide locations (e.g. 123682765-123683049) for a list of chromosomal regions (e.g. 5p14-15, 5q13-15, 5q31-32, etc.)?
My input data would be: 5p14-15, 5q13-15, 5q31-32, etc.
I'd like to get the result as a data frame: the 1st column listing the nucleotide start location and the 2nd column listing the nucleotide end location.
Also, I'm doing ATAC-seq analysis, and if you know of any beginner-friendly learning materials/videos, I'd love to learn more.
Thank you.
Hi, thank you so much for your link and the name "CytoBand"! It's really useful. I was able to find related questions such as the following:
Cytogenic Location To Genome Coordinates In R
Genomic coordinates for Cytogenetic bands with R
I started using RStudio's Terminal, but I'm unfamiliar with it. Are there step-by-step instructions somewhere?
Should I use the code below? Is this what you mean by \d+[pq]? Or
Should hg19 be hg38 (which is what we used)? Also, the above command gives "mysql: command not found".
Thank you again for your help.
These questions are a lot more basic than the methods question you asked at the beginning. It looks like you're going to need to install mysql and learn a bit of R (and some regular expressions), and I cannot help you with that. If you're not familiar with mysql and R, please involve someone near you who can.
Thank you for your reply. I'm familiar with R and regular expressions in a statistical context and have learned the basics of SQL and HiveQL, but not mysql and the terminal in a genomics context. I definitely want to learn more about this process. What kind of class would you suggest for it? I have limited access to people who are knowledgeable about what I want to learn. Thank you.
If you know R, regex and SQL in any context, you can apply them here. Only the data is different, so you should be fine.
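To make the regex-plus-lookup idea concrete, here is a rough sketch in Python (the same logic should carry over to R with a regex and a data frame). It assumes a cytoBand-style table with columns chromosome, start, end and band name has already been obtained for the right assembly (e.g. hg38). The function names and the TOY_BANDS rows are made up for illustration; the coordinates are placeholders, not real genome positions.

```python
import re

# Region strings such as "5p14-15", "5q31" or "Xq28":
# chromosome, arm (p/q), first band, optional last band.
REGION_RE = re.compile(r"^(\d+|[XY])([pq])(\d+)(?:-(\d+))?$")
# Band names as they appear in a cytoBand-style table, e.g. "p15.33" or "q13.2".
BAND_RE = re.compile(r"^([pq])(\d+)")


def parse_region(region):
    """Split '5p14-15' into ('chr5', 'p', 14, 15)."""
    m = REGION_RE.match(region.strip())
    if m is None:
        raise ValueError(f"unrecognised region: {region!r}")
    chrom, arm, first, last = m.groups()
    lo, hi = sorted((int(first), int(last or first)))
    return "chr" + chrom, arm, lo, hi


def region_to_coords(region, bands):
    """Return (chrom, start, end) spanning every band inside the region.

    bands is an iterable of (chrom, start, end, name) rows.
    Returns None when no band matches.
    """
    chrom, arm, lo, hi = parse_region(region)
    hits = []
    for b_chrom, start, end, name in bands:
        m = BAND_RE.match(name)
        if (b_chrom == chrom and m and m.group(1) == arm
                and lo <= int(m.group(2)) <= hi):
            hits.append((start, end))
    if not hits:
        return None
    return chrom, min(s for s, _ in hits), max(e for _, e in hits)


# Placeholder rows only: these coordinates are invented, NOT real hg19/hg38 values.
TOY_BANDS = [
    ("chr5", 0, 100, "p15.3"),
    ("chr5", 100, 250, "p15.1"),
    ("chr5", 250, 400, "p14.2"),
    ("chr5", 400, 500, "p13.3"),
    ("chr5", 500, 600, "q11.1"),
]

if __name__ == "__main__":
    for region in ["5p14-15", "5p13"]:
        print(region, region_to_coords(region, TOY_BANDS))
```

With a real cytoBand table swapped in for TOY_BANDS, each input region yields one start/end pair, which is the two-column result asked for in the question. Whether the start coordinates in your table are 0-based or 1-based depends on where the table came from, so check that before comparing against other data.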