Recognizing Handwritten Digits Using Scikit-learn In Python

Aditya Kamat
Feb 21, 2021

Handwritten digit recognition is the ability of a computer to recognize digits written by hand. It is a hard task for a machine because handwritten digits are imperfect and vary widely from one writer to another. A digit recognizer addresses this by taking an image of a digit as input and identifying which digit the image shows.

Recognizing handwritten text is a problem that dates back to the first automatic machines that needed to recognize individual characters in handwritten documents. Think, for example, of the ZIP codes on letters at the post office and the automation needed to recognize these five digits. Perfect recognition of these codes is necessary in order to sort mail automatically and efficiently. Here we are going to analyze the digits dataset of the scikit-learn library using Jupyter Notebook.

First, we import the required libraries and load the dataset.
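A minimal sketch of this step, assuming the dataset is loaded with `datasets.load_digits()` and stored in a variable named `digits` (the name the rest of this post refers to):

```python
# Import the datasets module from scikit-learn.
from sklearn import datasets

# Load the Digits dataset bundled with scikit-learn.
digits = datasets.load_digits()

# The flattened images live in digits.data: one row of 64 pixels per sample.
print(digits.data.shape)
```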

We have now loaded the Digits dataset into the notebook. Next, we read its DESCR attribute, which contains a detailed description of the dataset.
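Reading the description could look like this (a short sketch that reloads the dataset so it runs on its own):

```python
from sklearn import datasets

digits = datasets.load_digits()

# DESCR is a plain string describing the dataset: number of samples,
# number of attributes, value range, origin of the data, and so on.
print(digits.DESCR)
```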

The images of the handwritten digits are contained in the digits.images array. Each element of this array is an image represented by an 8x8 matrix of numerical values that correspond to a grayscale from white, with a value of 0, to black, with a value of 16.
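To see what one of these matrices looks like, we can print the first element of the array. This is only an illustrative sketch; the variable name `first_image` is my own:

```python
from sklearn import datasets

digits = datasets.load_digits()

# digits.images has one 8x8 matrix per sample.
print(digits.images.shape)

# Look at the raw pixel intensities of the first image.
first_image = digits.images[0]
print(first_image)
```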

Now we will use the Matplotlib library to display one of these images.
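One way to do this is with `imshow`. The reversed gray colormap (`gray_r`) and the choice of the first image are assumptions made for this sketch, so that low values appear white and high values appear black:

```python
import matplotlib.pyplot as plt
from sklearn import datasets

digits = datasets.load_digits()

# Pick the first image of the dataset for display.
image = digits.images[0]

# gray_r maps low values to white and high values to black.
plt.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
plt.show()
```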

The numerical values represented by the images, i.e. the targets, are contained in the digits.target array.
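A quick look at the targets and their count, as a small sketch:

```python
from sklearn import datasets

digits = datasets.load_digits()

# Each entry is the digit (0 to 9) shown in the image at the same index.
print(digits.target)

# Number of labeled samples in the dataset.
print(digits.target.size)
```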

As we can see, this dataset contains 1797 elements. We will use the first 1791 as the training set and the remaining 6 as the validation set. We can look at these six handwritten digits in detail using the Matplotlib library:
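The six validation digits can be drawn in a small grid. The 2x3 layout and the variable names here are choices made for this sketch:

```python
import matplotlib.pyplot as plt
from sklearn import datasets

digits = datasets.load_digits()

# The last six images form the validation set.
validation_images = digits.images[1791:]

# Draw them in a 2x3 grid, one digit per cell.
fig, axes = plt.subplots(2, 3)
for ax, image in zip(axes.ravel(), validation_images):
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
    ax.axis("off")
plt.show()
```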

Training the SVM estimator:
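A sketch of the training step with scikit-learn's `svm.SVC`. The hyperparameters `gamma=0.001` and `C=100.0` are assumed values commonly used with this dataset, not necessarily the ones used for the results in this post:

```python
from sklearn import datasets, svm

digits = datasets.load_digits()

# Support vector classifier; gamma and C are assumed values.
svc = svm.SVC(gamma=0.001, C=100.0)

# Train on the first 1791 samples (flattened 64-pixel rows and their labels).
svc.fit(digits.data[:1791], digits.target[:1791])
```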

We will now test the estimator by having it interpret the six digits of the validation set and then comparing its predictions with the actual digits.
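The comparison could be done as follows. The estimator is retrained here so the snippet runs on its own, again with assumed hyperparameters:

```python
from sklearn import datasets, svm

digits = datasets.load_digits()

# Same training step as before (gamma and C are assumed values).
svc = svm.SVC(gamma=0.001, C=100.0)
svc.fit(digits.data[:1791], digits.target[:1791])

# Predict the six validation digits and show them next to the true labels.
predicted = svc.predict(digits.data[1791:])
actual = digits.target[1791:]
print("predicted:", predicted)
print("actual:   ", actual)
```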

As we can see, the predicted values match the target values. Hence the estimator correctly recognizes all six handwritten digits of the validation set.

Now we will use a KNN classifier and measure the accuracy of its predictions.
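A possible version of this step with `KNeighborsClassifier`. The 75/25 split, `random_state=0` and `n_neighbors=5` are assumptions for this sketch, so the accuracy it prints may differ from the figure reported in this post:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()

# Hold out 25% of the samples for testing (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# k-nearest neighbors classifier with an assumed k of 5.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# score() returns the fraction of test samples classified correctly.
accuracy = knn.score(X_test, y_test)
print(f"KNN accuracy: {accuracy:.4f}")
```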

Now let us consider 2 more cases to check our model further:

Last case:

Now let's build a confusion matrix and a classification report.
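Both can be produced with `sklearn.metrics`. This sketch evaluates a KNN classifier on an assumed 75/25 split; the choice of model, split and parameters is illustrative only:

```python
from sklearn import datasets
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# Train the classifier and predict the held-out samples.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Rows are true digits, columns are predicted digits.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-digit precision, recall and F1 score.
report = classification_report(y_test, y_pred)
print(report)
```

Off-diagonal entries in the matrix show which digits get confused with one another.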

Conclusion: We successfully imported the dataset and built a model using scikit-learn. We were able to train the model and make predictions with it.

Across the 3 cases, we obtained an accuracy of 98.33%.

“I am thankful to mentors at https://internship.suvenconsultants.com for providing awesome problem statements and giving many of us a Coding Internship Experience. Thank you www.suvenconsultants.com”.
