Figure 1From: Machine learning on normalized protein sequencesWorkflow of the applied procedure. The protein sequences are first encoded as vectors of numerical descriptor values, e.g. with the hydropathy values of Kyte and Doolittle [30]. These vectors are normalized to a fixed length by applying the described interpolation methods. Finally, the normalized encoded sequences are used as input for the random forests in the classification.Back to article page