Scoring

class DOPE_Score(mol)[source]

This class contains methods to calculate DOPE scores without having to save and load PDB files for every structure. Atoms in a biobox coordinate tensor are mapped directly to the coordinates in the modeller model.

Parameters:

mol (biobox.Molecule) – One example frame to gain access to the topology. Mol will also be used to save a temporary pdb file that will be reloaded in modeller to create the initial modeller Model.

get_all_dope(coords, refine=False)[source]

Expects an array of frames. Returns an array of DOPE score values, one per frame.

Parameters:
  • coords (numpy.ndarray) – shape [B, N, 3]

  • refine (bool) – (default: False) If True, relax the structures using a maximum of 50 steps of Conjugate Gradient descent

Returns:

float array shape [B]

Return type:

np.ndarray
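The shape contract described above (a [B, N, 3] batch in, a float array of shape [B] out) can be illustrated without modeller installed. This is a minimal sketch only: `toy_score` is a hypothetical stand-in for the per-frame DOPE calculation and `score_all` is an illustrative helper, neither of which is part of this library.

```python
import numpy as np


def toy_score(frame):
    """Hypothetical stand-in for a per-frame score; NOT the DOPE potential."""
    # Sum of the distances of every atom from the centroid of the frame.
    centroid = frame.mean(axis=0)
    return float(np.linalg.norm(frame - centroid, axis=1).sum())


def score_all(coords, score_fn=toy_score):
    """Apply score_fn to every frame of a [B, N, 3] array; return shape [B]."""
    coords = np.asarray(coords, dtype=float)
    if coords.ndim != 3 or coords.shape[-1] != 3:
        raise ValueError(f"expected shape [B, N, 3], got {coords.shape}")
    return np.array([score_fn(frame) for frame in coords], dtype=float)


batch = np.zeros((4, 10, 3))  # 4 frames, 10 atoms each
scores = score_all(batch)     # one score per frame
```

Validating the shape up front gives a clearer error than one raised deep inside a scoring call when a single [N, 3] frame is passed by mistake.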

get_dope(frame, refine=False)[source]

Get the DOPE score. Injects coordinates into modeller and uses mdl.build(build_method='INTERNAL_COORDINATES', initialize_xyz=False) to reconstruct missing atoms. If an error is raised by modeller, or at any other stage, a fixed large value of 1e10 is returned.

Parameters:
  • frame (numpy.ndarray) – shape [N, 3]

  • refine (bool) – (default: False) If True, relax the structures using a maximum of 50 steps of Conjugate Gradient descent

Returns:

DOPE score as calculated by modeller. If an error is raised, 1e10 is returned instead.

Return type:

float
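Because failures are reported as the sentinel value 1e10 rather than as an exception, callers usually need to filter for it. The sketch below shows the general fallback pattern under stated assumptions: `safe_score` and `broken_scorer` are hypothetical names used for illustration, and the real method may handle errors differently internally.

```python
FAILED_SCORE = 1e10  # sentinel value documented above for failed calculations


def safe_score(score_fn, frame):
    """Call score_fn(frame); return FAILED_SCORE if it raises any exception."""
    try:
        return float(score_fn(frame))
    except Exception:
        return FAILED_SCORE


def broken_scorer(frame):
    # Hypothetical scorer that always fails, to exercise the fallback path.
    raise RuntimeError("simulated scoring failure")


ok = safe_score(lambda frame: len(frame), [[0.0, 0.0, 0.0]])
failed = safe_score(broken_scorer, [[0.0, 0.0, 0.0]])

# Callers can drop failed frames by comparing against the sentinel.
valid = [s for s in (ok, failed) if s < FAILED_SCORE]
```

Filtering on the sentinel before averaging avoids a single failed frame dominating a mean score.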

class Parallel_DOPE_Score(mol, processes=-1, context='spawn', **kwargs)[source]

A multiprocessing class to get modeller DOPE scores. A typical use case would look like:

score_class = Parallel_DOPE_Score(mol, **kwargs)
results = []
for frame in coordinates_array:
    results.append(score_class.get_score(frame))
# ... DOPE scores are calculated asynchronously in the background
# to retrieve the results
results = np.array([r.get() for r in results])
Parameters:
  • mol (biobox.Molecule) – biobox molecule containing one example frame of the protein to be analysed. This will be passed to the DOPE_Score class instance in each worker process.

  • processes (int) – (default: -1) Number of processes to pass to multiprocessing.pool. This controls the number of worker processes created.

  • **kwargs – additional kwargs will be passed to multiprocessing.pool during initialisation.

get_score(coords, **kwargs)[source]
Parameters:

coords (np.array) – numpy array of shape (N, 3)
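The `r.get()` calls in the usage pattern above suggest that get_score returns a handle that behaves like `multiprocessing.pool.AsyncResult`; that is an inference from this page, not something it states. The submit-then-collect pattern can be sketched with the standard library alone. Note the assumptions: `toy_score` is a hypothetical stand-in for the scoring call, and a `ThreadPool` is used instead of a spawned process pool so the snippet runs without a `__main__` guard.

```python
from multiprocessing.pool import ThreadPool


def toy_score(frame):
    """Hypothetical stand-in for an expensive per-frame score."""
    return float(sum(sum(atom) for atom in frame))


# 5 frames of 3 atoms each; frame i has every atom at (i, 0, 0).
frames = [[[float(i), 0.0, 0.0]] * 3 for i in range(5)]

with ThreadPool(processes=2) as pool:
    # apply_async returns immediately with an AsyncResult handle.
    handles = [pool.apply_async(toy_score, (frame,)) for frame in frames]
    # get() blocks until the corresponding result is ready.
    results = [h.get() for h in handles]
```

Submitting every frame before calling get() is what lets the work overlap; calling get() immediately after each submission would serialise it.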

class Ramachandran_Score(mol, threshold=0.001)[source]

This class contains methods that use iotbx/mmtbx to calculate the quality of phi and psi values in a protein.

Parameters:
  • mol (biobox.Molecule) – One example frame to gain access to the topology. Mol will also be used to save a temporary pdb file that will be reloaded to create the initial iotbx Model.

  • threshold (float) – (default: 1e-3) Threshold used to determine similarity between the biobox.Molecule coordinates and the iotbx model coordinates. Used to verify that the iotbx model was created successfully.

get_score(coords, as_ratio=False)[source]

Given coords (corresponding to self.mol), calculates Ramachandran scores using the cctbx ramalyze module. Returns the number of torsion angles that fall within the favored, allowed, and outlier regions, followed by the total number of torsion angles analysed.

Parameters:

coords (numpy.ndarray) – shape (N, 3)

Returns:

(favored, allowed, outliers, total)

Return type:

tuple of ints
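The exact behaviour of as_ratio is not described on this page. As a hedged illustration of how the (favored, allowed, outliers, total) counts might be turned into fractions by hand, the sketch below uses a hypothetical helper, `counts_to_ratios`, which is not part of the library; the zero-total guard is likewise an assumption.

```python
def counts_to_ratios(favored, allowed, outliers, total):
    """Convert Ramachandran region counts into fractions of the total."""
    if total == 0:
        # No torsion angles analysed; avoid dividing by zero (assumed choice).
        return (0.0, 0.0, 0.0)
    return (favored / total, allowed / total, outliers / total)


# Example counts chosen for illustration only, not real results.
ratios = counts_to_ratios(90, 8, 2, 100)
```

Working with fractions makes scores comparable between proteins with different numbers of residues.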

class Parallel_Ramachandran_Score(mol, processes=-1)[source]

A multiprocessing class to get Ramachandran scores. A typical use case would look like:

score_class = Parallel_Ramachandran_Score(mol)
results = []
for frame in coordinates_array:
    results.append(score_class.get_score(frame))
    # Ramachandran scores will be calculated asynchronously in background
...
# to retrieve the results
results = np.array([r.get() for r in results])
favored = results[:,0]
allowed = results[:,1]
outliers = results[:,2]
total = results[:,3]
Parameters:
  • mol (biobox.Molecule) – biobox molecule containing one example frame of the protein to be analysed. This will be passed to the Ramachandran_Score instance in each worker process.

  • processes (int) – (default: -1) Number of processes to pass to multiprocessing.pool. This controls the number of worker processes created.

get_score(coords, **kwargs)[source]
Parameters:

coords – numpy array of shape (N, 3)