2.3 Definitions

To avoid confusion through ambiguity, here are some definitions of terms used in this book:

  • An Algorithm is a set of rules that a machine follows to achieve a particular goal 2. An algorithm can be seen as a recipe that defines the inputs, the output, and all the steps required to get from the inputs to the output. Cooking recipes are algorithms in which the ingredients are the inputs, the cooked meal is the output, and the preparation and cooking steps are the algorithm instructions.
  • A Machine learning algorithm is a set of rules that a machine follows to learn how to achieve a particular goal. The output of a machine learning algorithm is a machine learning model. Machine learning algorithms are also called “learners” or “inducers” (e.g. “tree inducer”).
  • A (Machine learning) Model is the outcome of a machine learning algorithm. This can be, for example, the set of weights of a linear model, or the weights of a neural network together with the information about its architecture. Other names for the rather unspecific word “model” are “predictor” or, depending on the task it solves, “classifier” or “regression model”.
  • Dataset: A table containing the data from which the machine learns. The dataset contains the features and the target. When used for inducing a model, the dataset is called training data.
  • Features: The information used for prediction, classification, or clustering. A feature is one column in the dataset. Throughout the book, the features are assumed to be interpretable, meaning it’s easy to understand what they stand for. Images are an exception: each input feature is a pixel, and interpretation often works by graying out larger parts of the image.
  • Target: The thing the machine learns to predict.
  • (Machine learning) Task: The combination of a dataset with features and a target. Depending on the type of the target, the task can be classification, regression, survival analysis, clustering, or outlier detection.
  • Prediction: The machine learning model’s “guess” of what the target value should be, based on the given features.
  • Instance: One row in the dataset. Other names for ‘instance’ are: (data) point, example, observation.
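
To make the terms above concrete, here is a minimal sketch in plain Python. It is not taken from the book: the dataset values, the column names `size_sqm` and `rent`, and the helper functions `fit_linear_learner` and `predict` are all hypothetical, chosen only for illustration. The learner is ordinary least squares for a single feature, which is one simple example of a machine learning algorithm.

```python
# Dataset: a table in which each row is an instance and each column is
# either a feature or the target. The values are made up for illustration.
dataset = [
    {"size_sqm": 50.0, "rent": 500.0},
    {"size_sqm": 70.0, "rent": 700.0},
    {"size_sqm": 90.0, "rent": 900.0},
    {"size_sqm": 110.0, "rent": 1100.0},
]
FEATURE = "size_sqm"  # the feature column (hypothetical name)
TARGET = "rent"       # the target column (hypothetical name)


def fit_linear_learner(training_data, feature, target):
    """Machine learning algorithm (learner): least squares for one feature.

    Takes training data and returns the model, here a set of weights.
    """
    xs = [row[feature] for row in training_data]
    ys = [row[target] for row in training_data]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return {"intercept": intercept, "slope": slope}


def predict(model, instance, feature):
    """Prediction: the model's guess of the target value for one instance."""
    return model["intercept"] + model["slope"] * instance[feature]


# Inducing a model from the dataset, which is then called training data.
model = fit_linear_learner(dataset, FEATURE, TARGET)

# A new instance has features but no known target value.
new_instance = {"size_sqm": 80.0}
prediction = predict(model, new_instance, FEATURE)
print(model, prediction)
```

Because the target here is a number, the task in this sketch is regression. With a categorical target, the same roles (dataset, features, target, learner, model, prediction) would apply, but the task would be classification.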

  1. “Definition of Algorithm.” 2017. https://www.merriam-webster.com/dictionary/algorithm.