Batch learning

  1. involves training the ML model on the entire dataset at once
  2. the model is trained in a single pass or in multiple passes (epochs) over the data
  3. dataset availability is crucial; the entire dataset must be present before training starts
  4. requires intensive compute and memory resources
  5. typically achieves high performance because the model is trained on all available data
  6. usually used when the dataset is small enough to fit into memory

e.g. House price prediction using linear regression
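
A minimal sketch of batch learning, assuming scikit-learn and a synthetic house-price dataset (the features and prices below are made up for illustration): the whole training set sits in memory and the model is fitted in one go.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1_000
area = rng.uniform(50, 250, size=n)          # floor area (synthetic)
bedrooms = rng.integers(1, 6, size=n)        # bedroom count (synthetic)
price = 3_000 * area + 15_000 * bedrooms + rng.normal(0, 20_000, size=n)

X = np.column_stack([area, bedrooms])
X_train, X_test, y_train, y_test = train_test_split(X, price, random_state=0)

model = LinearRegression()
model.fit(X_train, y_train)                  # entire training set processed at once
print("R^2 on held-out data:", model.score(X_test, y_test))
```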

Online / Incremental / Sequential learning

  1. the model is trained by feeding data instances one at a time or in small mini-batches
  2. faster per-update training because only a small chunk of data is processed at a time
  3. highly adaptive, since the model can adjust to new data patterns over time
  4. accuracy may not be its strongest feature initially, but it can improve over time
  5. useful for real-time data or data whose distribution changes over time (concept drift)
  6. can be used to train a system on a huge dataset that cannot fit in the machine's memory (out-of-core learning); see the chunked-training sketch after the library note below
  7. garbage in, garbage out: if bad data is fed to the system, the model learns from it just the same

e.g. spam filtering model
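
A minimal sketch of incremental spam-filter training, assuming scikit-learn: HashingVectorizer is stateless, so each mini-batch of messages can be transformed and passed to SGDClassifier.partial_fit without revisiting earlier data. The example messages, labels, and mini-batch stream are hypothetical.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
clf = SGDClassifier()                        # linear classifier trained with SGD

# Hypothetical stream of (texts, labels) mini-batches; 1 = spam, 0 = ham.
stream = [
    (["win money now", "meeting at 10am"], [1, 0]),
    (["cheap pills online", "project update attached"], [1, 0]),
]

for texts, labels in stream:
    X_batch = vectorizer.transform(texts)    # stateless transform, no refitting needed
    clf.partial_fit(X_batch, labels, classes=[0, 1])  # classes required on the first call

print(clf.predict(vectorizer.transform(["free money offer"])))
```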

Libraries that support online / streaming learning include scikit-multiflow and Jubatus.
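
For the out-of-core case mentioned in the list above, one common pattern (sketched here with pandas and scikit-learn rather than the libraries just named) is to read an oversized file in chunks and update the model per chunk with partial_fit; the file name and column names below are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import SGDRegressor

model = SGDRegressor()

# chunksize makes read_csv yield DataFrames one chunk at a time instead of
# loading the whole file into memory at once.
for chunk in pd.read_csv("huge_dataset.csv", chunksize=10_000):
    X = chunk[["feature_1", "feature_2"]].to_numpy()   # hypothetical column names
    y = chunk["target"].to_numpy()
    model.partial_fit(X, y)                            # incremental update per chunk
```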


| Feature | Batch Learning | Online Learning |
| --- | --- | --- |
| Data Handling | Processes the whole dataset at once (or in large passes over it). | Feeds data incrementally, one instance or one small mini-batch at a time. |
| Training Frequency | Retrained on a fixed schedule (e.g. daily, weekly, or monthly). | Trained continuously as new data arrives. |
| Initial Dataset | Requires the whole dataset to be available before training. | Starts from an initial dataset and is updated over time with new data. |
| Adaptability | Limited; the model cannot incorporate new data until it is retrained. | Very flexible; the model adjusts as soon as new data is introduced. |
| Resource Consumption | High during the training phase, since all samples are processed together. | Lower at any given moment; resource usage is spread out over time. |
| Model Performance | Tends to reach high accuracy when trained on enough data. | Converges quickly but may need tuning to match batch-level accuracy. |
| Concept Drift Handling | Struggles when the data distribution shifts between training runs. | Handles concept drift well, adapting to new distributions as they appear. |
| Update Mechanism | Must be retrained from scratch to incorporate updates. | Updated incrementally with each new data instance or mini-batch. |
| Deployment | Deployed after training and frozen until the next retraining cycle. | Effectively in deployment and training at the same time, improving continuously. |
| Use Case Suitability | Suited to settings with stable, unchanging data distributions. | Suited to fast-changing systems where the data evolves frequently. |
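
As a small illustration of the concept-drift row above, the synthetic sketch below (assuming scikit-learn) changes the underlying slope halfway through the stream; the online model's coefficient tracks the new relationship after the drift.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
model = SGDRegressor(learning_rate="constant", eta0=0.01)

for step in range(200):
    slope = 2.0 if step < 100 else -3.0               # the data distribution drifts here
    X = rng.normal(size=(32, 1))
    y = slope * X[:, 0] + rng.normal(0, 0.1, size=32)
    model.partial_fit(X, y)                           # incremental update per mini-batch
    if step in (99, 199):
        print(f"step {step}: learned slope = {model.coef_[0]:.2f}")
```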