LabelSpreading

seqlearner.SemiSupervisedLearner.label_spreading(kernel, gamma, n_neighbors, alpha, max_iter, tol, n_jobs)

LabelSpreading model for semi-supervised learning. This model is similar to the basic Label Propagation algorithm, but it uses an affinity matrix based on the normalized graph Laplacian and applies soft clamping across the labels.
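To make the description above concrete, the following is a minimal NumPy sketch of the general label spreading technique. It is not seqlearner's implementation. The function name `label_spreading_sketch`, the demo data, the default values, and the convention of marking unlabeled points with -1 are all assumptions made for illustration. It builds an RBF affinity matrix, normalizes it symmetrically, and iterates the soft-clamped update F = alpha * S * F + (1 - alpha) * Y0.

```python
import numpy as np

def label_spreading_sketch(X, y, gamma=20.0, alpha=0.2, max_iter=30, tol=1e-3):
    """Illustrative label spreading; y uses -1 for unlabeled points (assumed convention)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y[y != -1])

    # RBF affinity matrix with a zeroed diagonal (no self-loops).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-gamma * sq_dists)
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d[d == 0] = 1.0  # guard against isolated points
    d_inv_sqrt = 1.0 / np.sqrt(d)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot initial labels; rows of unlabeled points stay all-zero.
    Y0 = np.zeros((X.shape[0], classes.size))
    for j, c in enumerate(classes):
        Y0[y == c, j] = 1.0

    # Soft clamping: each step mixes neighbor information (weight alpha)
    # with the initial labels (weight 1 - alpha).
    F = Y0.copy()
    for _ in range(max_iter):
        F_next = alpha * S @ F + (1.0 - alpha) * Y0
        converged = np.abs(F_next - F).sum() < tol
        F = F_next
        if converged:
            break
    return classes[F.argmax(axis=1)]

# Hypothetical demo data: two well-separated 1-D clusters, one label each.
X_demo = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
y_demo = [0, -1, -1, 1, -1, -1]
pred = label_spreading_sketch(X_demo, y_demo)
print(pred)
```

Because alpha is strictly less than 1, labeled points keep part of their initial label at every step rather than being fixed outright, which is what "soft clamping" refers to.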

Arguments

  • kernel: String or callable, Identifier of the kernel function to use, or the kernel function itself. Only the strings 'rbf' and 'knn' are valid identifiers. A callable should take two inputs, each of shape [n_samples, n_features], and return a weight matrix of shape [n_samples, n_samples].
  • gamma: Float, Parameter for rbf kernel
  • n_neighbors: Positive integer, Parameter for knn kernel
  • alpha: Float, Clamping factor
  • max_iter: Positive integer, Maximum number of iterations allowed
  • tol: Float, Convergence tolerance: threshold to consider the system at steady state
  • n_jobs: Positive integer, The number of parallel jobs to run
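
The argument names above match those of scikit-learn's `LabelSpreading` estimator. As a hedged illustration of how they fit together, the sketch below calls scikit-learn directly. Whether seqlearner forwards these arguments to scikit-learn unchanged is an assumption, and the toy data and chosen values are hypothetical. Unlabeled samples are marked with -1, as scikit-learn expects.

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

# Hypothetical toy data: two 1-D clusters, one labeled point per cluster.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y = np.array([0, -1, -1, 1, -1, -1])  # -1 marks unlabeled samples

model = LabelSpreading(
    kernel="rbf",    # 'rbf' or 'knn'
    gamma=20,        # used only by the rbf kernel
    n_neighbors=7,   # used only by the knn kernel
    alpha=0.2,       # clamping factor
    max_iter=30,     # maximum number of iterations
    tol=1e-3,        # convergence tolerance
    n_jobs=1,        # number of parallel jobs
)
model.fit(X, y)

# Labels inferred for every training sample, including the unlabeled ones.
inferred = model.transduction_
print(inferred)
```

With `kernel="rbf"` the `n_neighbors` value is ignored, and with `kernel="knn"` the `gamma` value is ignored.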

Example: predict the unlabeled sequences

from sklearn.model_selection import train_test_split
from seqlearner import MultiTaskLearner
labeled_path = "../data/labeled.csv"
unlabeled_path = "../data/unlabeled.csv"
mtl = MultiTaskLearner(labeled_path, unlabeled_path)
encoding = mtl.embed(word_length=5)
# train_test_split returns (X_train, X_test, y_train, y_test), in that order
X, X_t, y, y_t = train_test_split(mtl.sequences, mtl.labels, test_size=0.33)
score = mtl.semi_supervised_learner(X, y, X_t, y_t, ssl="label_spreading")

See Also