e.g. house price prediction using linear regression
e.g. spam filtering model
Scikit-Multiflow, Jubatus
Feature | Batch Learning | Online Learning |
---|---|---|
Data Handling | Processes the whole dataset at once or in large chunks. | Feeds data incrementally, one instance or one small batch at a time. |
Training Frequency | On a fixed, periodic schedule (e.g. daily, weekly, or monthly). | Continuous; the model is updated as new data becomes available. |
Initial Dataset | Requires the whole dataset to be available before training begins. | Starts from an initial set of examples and is then updated over time with new examples. |
Adaptability | Less adaptable to new incoming data; the model must be retrained from time to time. | Highly adaptable; the model updates immediately when new data is introduced. |
Resource Consumption | Requires high computational resources during the training phase, since all samples are processed together. | Usually demands less at any given moment; resource usage is spread over time. |
Model Performance | Can achieve high accuracy when trained with enough data. | Can converge quickly, but in some cases needs careful tuning to maintain accuracy. |
Concept Drift Handling | May struggle when the data distribution shifts between consecutive training phases. | Handles concept drift well, since it keeps adapting to new incoming distributions. |
Update Mechanism | Must typically be retrained from scratch to incorporate an update. | Updated incrementally with each new data instance. |
Deployment | The model is deployed after training and cannot change until it is retrained. | The model is deployed and trained at the same time, so it keeps improving while in use. |
Use Case Suitability | Suited to settings where behavior is fixed and data distributions are stable. | Suited to fast-paced systems where the data changes frequently. |
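
The contrast in the table can be made concrete with a small sketch. The example below is illustrative only and uses nothing beyond the Python standard library: `fit_batch`, `OnlineLinearModel`, `learn_one`, `make_stream`, and `mse` are hypothetical names, the data is synthetic, and plain stochastic gradient descent stands in for whatever update rule a real online learner would use. A batch model is fitted once on the full pre-drift dataset, while an online model takes one update step per incoming instance and so keeps adapting after the distribution changes.

```python
import random


def fit_batch(xs, ys):
    """Batch learning: closed-form least squares for y = w*x + b,
    computed over the whole dataset at once."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


class OnlineLinearModel:
    """Online learning: one gradient step per incoming instance."""

    def __init__(self, lr=0.05):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr  # assumed learning rate, chosen for this toy data

    def predict(self, x):
        return self.w * x + self.b

    def learn_one(self, x, y):
        # Squared-error gradient step using only this single instance.
        error = self.predict(x) - y
        self.w -= self.lr * error * x
        self.b -= self.lr * error


def make_stream(n, slope, intercept, rng):
    """Synthetic stream: y = slope*x + intercept plus small Gaussian noise."""
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        yield x, slope * x + intercept + rng.gauss(0.0, 0.01)


def mse(predict, data):
    """Mean squared error of a prediction function over (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
before = list(make_stream(2000, 2.0, 1.0, rng))   # original distribution
after = list(make_stream(2000, -1.0, 0.5, rng))   # distribution after drift

# Batch model: trained once on the pre-drift data, then frozen.
batch_w, batch_b = fit_batch([x for x, _ in before], [y for _, y in before])

# Online model: sees the same data one instance at a time, including
# the post-drift instances, and updates after every one of them.
online = OnlineLinearModel()
for x, y in before + after:
    online.learn_one(x, y)

batch_mse_after = mse(lambda x: batch_w * x + batch_b, after)
online_mse_after = mse(online.predict, after)

print(f"batch  model: w={batch_w:.2f}, b={batch_b:.2f}")
print(f"online model: w={online.w:.2f}, b={online.b:.2f}")
print(f"MSE on post-drift data: batch={batch_mse_after:.3f}, "
      f"online={online_mse_after:.5f}")
```

In this toy setting the frozen batch model keeps the pre-drift parameters and fits the post-drift data poorly until it is retrained, whereas the online model ends up close to the new slope and intercept. Note that the online error here is measured on instances the model has already seen, so it shows adaptation rather than a fair benchmark.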