Syntax error while building a linear regression model
2.6 years ago

Hello, I am trying to calculate a correlation coefficient with a Python script, but it gives me a syntax error.

Basically, I have some data and I want to see how the values correlate with each other.

I cannot figure out how to fix the syntax error I am getting.

My code looks like this:

%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20.0, 10.0)

#READING data
data = pd.read_csv ('benchmarking.csv')
print (data.shape)
data.head()
#Collecting X and Y
X = data['logAUC'].values
Y = data['RMSD'].values

#Mean X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)

print (mean_x, mean_y)

#Total number of values
n = len(X)

# Using the formula to calculate b1 and b0

numer = 0
denom = 0

for i in range(m):
    numer += (X[i] - mean_x * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
b1 = numer/denom
b0 = mean_y - (b1 * mean_x)

print (b1, b0)


This is the error I get:

 denom += (X[i] - mean_x) ** 2
^
SyntaxError: invalid syntax


My input data looks like this:

   Protein name       logAUC        RMSD
0   Metaloellastase    47.96         0.61
1   FGF1               23.44         0.72
2   FKBP1A             38.98         1.16
3   UDP                15.45         0.58
4   MDM2               18.91         1.42

pymol correlation coefficient linear regression
The line starting with numer += ... is missing a closing parenthesis after mean_x. The error message is misleading: Python carries on to the next line looking for the closing parenthesis, so the error is reported on the denom ... line instead.
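A corrected version of the loop might look like the sketch below. It assumes the five sample rows shown in the question stand in for benchmarking.csv, and it restores the closing parenthesis. It also loops over n rather than m, since m is never defined in the script and would raise a NameError once the syntax error is fixed. The np.corrcoef call is an addition for the correlation coefficient the question asks about and is not part of the original code.

```python
import numpy as np

# Sample values copied from the five rows shown in the question;
# in the real script these would be read from benchmarking.csv.
X = np.array([47.96, 23.44, 38.98, 15.45, 18.91])  # logAUC
Y = np.array([0.61, 0.72, 1.16, 0.58, 1.42])       # RMSD

mean_x = np.mean(X)
mean_y = np.mean(Y)

# Total number of values
n = len(X)

numer = 0.0
denom = 0.0
for i in range(n):  # n, not m: m is never defined in the script
    numer += (X[i] - mean_x) * (Y[i] - mean_y)  # closing parenthesis restored
    denom += (X[i] - mean_x) ** 2

b1 = numer / denom         # slope
b0 = mean_y - b1 * mean_x  # intercept

# Pearson correlation coefficient between X and Y
r = np.corrcoef(X, Y)[0, 1]

print(b1, b0, r)
```

With only five points the numbers mean little, but the slope and intercept should agree with what np.polyfit(X, Y, 1) returns for the same data.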

