4.1 Training and Testing

When we develop a supervised machine learning model, whether the task is regression or classification, training and testing are two of the most important parts of the process.


The process of training puts the "learning" in ML. During training, the algorithm adjusts the model until it fits the pattern in the data as closely as it can, so that the fitted model can then be used to make predictions. In the case of regression, this is when we fit a curve to the data. Training data is the input to the ML algorithm: it is what the algorithm learns from. Since we're working on supervised learning, the training data includes the correct value of the predicted characteristic for every example.
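To make the idea of fitting concrete, here is a minimal sketch of training a regression model in plain Python. The data (hours studied and exam scores), the variable names, and the `fit_line` helper are all made up for illustration, and fitting a straight line by least squares is just one simple choice of algorithm.

```python
# Hypothetical training data: each input (hours studied) comes with
# its correct output (exam score), because this is supervised learning.
train_hours = [1.0, 2.0, 3.0, 4.0, 5.0]
train_scores = [52.0, 55.0, 55.0, 59.0, 60.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training" is this step: the algorithm picks the line that best
# matches the training data.
slope, intercept = fit_line(train_hours, train_scores)
print(f"learned model: score = {slope:.2f} * hours + {intercept:.2f}")
```

The two numbers that come out of `fit_line` are everything the model "learned" from the training data.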


Testing, as the name suggests, evaluates the accuracy of our machine learning model, giving us an indication of whether we got the model right. We'll discuss methods of evaluation in another chapter, but the important part of testing is the testing data: data that was not used to train the model. We run the testing data through the trained model and then compare each prediction to the actual value. This is a supervised learning problem, so the testing data also contains the correct outputs. Don't worry if this is too abstract right now; there are more notebooks in this chapter that will give you a better intuition.
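As a rough sketch of what testing looks like in code, the example below runs a few testing examples through a model and compares each prediction to the correct value. Everything here is an assumption made for illustration: the model parameters stand in for the result of an earlier training step, the testing data is invented, and the average absolute error is just one simple way to summarize how far off the predictions are.

```python
# Assumed result of an earlier training step (hypothetical values).
slope, intercept = 2.0, 50.0


def predict(hours):
    """Predict an exam score from hours studied using the trained line."""
    return slope * hours + intercept


# Hypothetical testing data, kept aside and never used for training.
# It also contains the correct outputs, so we can check the predictions.
test_hours = [6.0, 7.0, 8.0]
test_scores = [63.0, 63.0, 67.0]

# Run the testing data through the model...
predictions = [predict(h) for h in test_hours]

# ...then compare each prediction to the actual value.
errors = [abs(p - actual) for p, actual in zip(predictions, test_scores)]
mean_absolute_error = sum(errors) / len(errors)

for h, p, actual in zip(test_hours, predictions, test_scores):
    print(f"hours={h}: predicted {p}, actual {actual}")
print(f"average error: {mean_absolute_error}")
```

A small average error on data the model has never seen suggests that it learned a pattern that carries over to new examples.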



Copyright © 2021 Code 4 Tomorrow. All rights reserved. The code in this course is licensed under the MIT License. If you would like to use content from any of our courses, you must obtain our explicit written permission and provide credit. Please contact classes@code4tomorrow.org for inquiries.