I have a question about writing a for loop in R, and I would be very grateful for any ideas. I'm working with NGS data. I have calculated r2 values to estimate linkage disequilibrium, and now I want to calculate LD decay for every single SNP in each contig.
These are the first 3 rows of my data:
scaffold94_798049_802097 999 NA scaffold94_798049_802097 999 NA 1
scaffold94_798049_802097 999 NA scaffold94_798049_802097 1029 NA 1
scaffold94_798049_50222 2011 NA scaffold94_798049_802097 1029 NA 1
The first and third columns are contig names. How can I write a loop that keeps only the rows where the names in the first and third columns are identical (meaning that both SNPs are located on the same contig)?
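
To make the goal concrete, here is a minimal sketch of the filtering step written as a vectorized comparison rather than an explicit loop. The data frame `ld` and its column names (`contig_a`, `pos_a`, `contig_b`, `pos_b`, `r2`) are hypothetical and only for illustration; the one thing taken from the description above is that the contig names sit in columns 1 and 3.

```r
# Toy data frame standing in for the real table. All column names are
# hypothetical; only the layout (contig names in columns 1 and 3) is assumed.
ld <- data.frame(
  contig_a = c("scaffold94_798049_802097",
               "scaffold94_798049_802097",
               "scaffold94_798049_50222"),
  pos_a    = c(999, 999, 2011),
  contig_b = c("scaffold94_798049_802097",
               "scaffold94_798049_802097",
               "scaffold94_798049_802097"),
  pos_b    = c(999, 1029, 1029),
  r2       = c(1, 1, 1),
  stringsAsFactors = FALSE
)

# Compare the two contig columns element-wise. Converting to character first
# avoids an error if the columns were read in as factors with different levels.
same <- as.character(ld[[1]]) == as.character(ld[[3]])

# which() keeps the TRUE rows and silently drops rows where the comparison
# is NA (for example, a missing contig name).
same_contig <- ld[which(same), ]
```

With the toy data, `same_contig` keeps the first two rows and drops the third, whose two SNPs lie on different contigs. On the real table, the same two lines should work after replacing `ld` with the data frame you read in, provided the contig names really are in columns 1 and 3.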