
Table 2 Description of each Characterizer contained in PCP-ML

From: PCP-ML: Protein characterization package for machine learning

| Name of characterizer | Brief description of functionality provided |
| --- | --- |
| AtchleyFactors | Characterizes five major aspects of an amino acid with real-number values. The values were obtained from a statistical analysis of amino acids with respect to polarity, secondary structure, molecular size, amino acid composition, and charge. These values were reported in [24]. |
| InterfaceContactPotentials | Characterizes the contact potential between two residues. These contact potentials come from a statistical analysis of contacts in protein interfaces. They were reported in [18]. |
| BetaContactPotentials | Characterizes the contact potential for two residues in two beta sheets. These values come from a study of contact potentials of residues in cross-strand pairings in beta sheets. They were reported in [22]. |
| SSComposition | Determines the percentage of each secondary structure (SS) type in a string representing the secondary structure of the entire protein. |
| SAComposition | Determines the percentage of solvent accessibility from a string representing the solvent accessibility of the entire protein. |
| AAComposition | Determines the percentage of each amino acid in a protein sequence. |
| Hydrophobicity | Characterizes the hydrophobicity of a residue. These values come from a study on hydrophobicity and helical propensity in [23]. |
| CalculateR | Calculates the Pearson correlation coefficient for the elements of two feature vectors. |
| CalculateCosine | Calculates the cosine between two feature vectors. |
| ScaledOrderedMean | Calculates the nth ordered mean for the amino acid, secondary structure, or solvent accessibility string. |
| CalculateEntropy | Calculates the Shannon entropy for a vector of probabilities. |
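
To make the purely computational characterizers concrete, the following is a minimal Python sketch of the arithmetic behind the composition, CalculateR, CalculateCosine, and CalculateEntropy entries. It is not PCP-ML's own code or API: the function names `composition`, `pearson_r`, `cosine`, and `shannon_entropy` are hypothetical, and the choices to report composition on a 0 to 100 scale and entropy in bits (log base 2) are assumptions, since the table does not specify them.

```python
import math
from collections import Counter


def composition(sequence):
    """Percentage of each symbol in a string.

    The same counting applies to an amino acid sequence, a secondary
    structure string, or a solvent accessibility string.
    """
    if not sequence:
        return {}
    counts = Counter(sequence)
    total = len(sequence)
    return {symbol: 100.0 * n / total for symbol, n in counts.items()}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def cosine(x, y):
    """Cosine of the angle between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def shannon_entropy(probabilities):
    """Shannon entropy in bits (assumed base 2); zero terms are skipped."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)
```

For example, `composition("AACD")` reports 50% for `A` and 25% each for `C` and `D`, and a uniform two-outcome distribution `[0.5, 0.5]` has an entropy of 1 bit under the base-2 assumption.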