Case Study:
AI/Machine Learning - Helmet Detection

We built a Helmet Detection App for a client, designed to be installed on an eBike. We delivered a working product that met their requirements and gave them the peace of mind they needed.

The App was designed for an eBike fleet manager in India who needed to protect the business from liability when riders irresponsibly ride without helmets. In India, cameras installed beside the roads check whether riders are wearing helmets. If a rider is not, a ticket is sent to the owner of that eBike. As the owner, the fleet operator receives the ticket (and the bill). Without direct evidence the rider may not be liable to pay the fine, but the fleet operator has to.

We built the App from scratch using TensorFlow, Google’s machine learning platform, and coded it in Python, one of the most widely used languages for machine learning. The App was built for Android, as it has a 70% market share and low-cost, IP-rated phones are available globally.
We collected the images by asking people we knew to take video of themselves while wearing a bike helmet and while not wearing a helmet. We then compiled the videos into two separate collections, one for each condition.

To convert each video into the images needed for our model, we used a program called ffmpeg. This program takes a video, splits it into individual frames, and saves those frames as image files. Each image was labelled as wearing or not wearing a helmet.
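
The frame extraction step can be sketched in Python as follows. This is a minimal sketch, not the project's actual script: the function names, the output file pattern and the sampling rate of 2 frames per second are all assumptions made for illustration. The ffmpeg options used (`-i` for the input and the `fps` video filter) are standard.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, output_dir, frames_per_second=2):
    """Build an ffmpeg command that saves frames of a video as JPEG files."""
    # ffmpeg expands %05d into a zero-padded frame counter (00001, 00002, ...).
    output_pattern = str(Path(output_dir) / "frame_%05d.jpg")
    return [
        "ffmpeg",
        "-i", str(video_path),              # input video
        "-vf", f"fps={frames_per_second}",  # how many frames to keep per second
        output_pattern,
    ]


def extract_frames(video_path, output_dir, frames_per_second=2):
    """Run ffmpeg; the output folder name doubles as the label for every frame."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_ffmpeg_command(video_path, output_dir, frames_per_second),
        check=True,  # raise an error if ffmpeg fails
    )
```

With one output folder per condition, a call such as `extract_frames("rider_with_helmet.mp4", "frames/helmet")` (hypothetical file names) labels every frame through the folder it is saved in.
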
In order to build a model and check its predictive power we split the images into two groups. We randomly selected 70% of the images to be used for training the model and the remaining 30% were used as the test set.
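
A random 70/30 split like the one described above could look like this. It is a sketch under assumptions: the file names, the labels and the fixed random seed are made up for the example.

```python
import random


def split_train_test(items, train_fraction=0.7, seed=42):
    """Shuffle a copy of the items and split it into training and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labelled frames: (file name, label) pairs.
frames = [
    (f"frame_{i:05d}.jpg", "helmet" if i % 2 == 0 else "no_helmet")
    for i in range(100)
]
train_set, test_set = split_train_test(frames)
print(len(train_set), len(test_set))  # 70 30
```

Because neighbouring frames of the same video look almost identical, a common refinement is to split by video rather than by frame, so that the test set only contains footage the model has never seen.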

Working with TensorFlow we built a model to predict whether a rider is wearing a helmet. TensorFlow is a software library for data analysis and machine learning. It is used to create and train neural networks, which are algorithms that can learn to recognize patterns in data. TensorFlow works by transforming a description of a neural network, written in Python code, into a mathematical function that can be executed on a computer. This function is called a “graph,” and it can be used to calculate the output of the network for any given set of inputs.

TensorFlow used the training data to optimize the model by adjusting the weights of the neurons in the network. This allowed it to learn how to best predict the output given the input data. It also used the testing data to verify that the model was able to accurately predict the output for new data sets.
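
To make "adjusting the weights" concrete, here is a toy illustration in plain Python. It is not the project's model: it trains a single artificial neuron on made-up two-number inputs, whereas a real image model has many layers and TensorFlow works out the weight updates automatically. The data, learning rate and number of passes are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, labels, learning_rate=0.5, epochs=500):
    """Fit one sigmoid neuron with gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # gradient of the loss with respect to z
            # Move each weight a small step in the direction that reduces the error.
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z) >= 0.5


# Made-up examples; label 1 stands for "helmet", 0 for "no helmet".
train_x = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.7]]
train_y = [1, 1, 0, 0]
test_x = [[0.85, 0.2], [0.15, 0.8]]  # held back, never used for training
test_y = [1, 0]

weights, bias = train_neuron(train_x, train_y)
correct = sum(predict(weights, bias, x) == bool(y) for x, y in zip(test_x, test_y))
print(f"test accuracy: {correct}/{len(test_y)}")
```

The same two roles appear as in the text: the training examples drive the weight updates, and the held-back test examples only measure how well the result generalizes.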

In order to adjust our model in TensorFlow, we relied on backpropagation. Backpropagation is a technique that works out how much each weight in the network contributed to the prediction error, so that the weights can be adjusted to reduce that error. Through it the network learns which features of our data are most important for predicting the output.
We programmed the model into our App so that it could check the rider every few seconds. This allowed us to field test the model. The App collected all the images during our field testing and labelled them as either wearing or not wearing a helmet.

At first the model produced many Type 1 and Type 2 errors (false positives and false negatives). By using the collected images, with correct labels, as training data we improved the model. The labelling was done manually at first, but to scale up our team used Amazon’s Mechanical Turk to label images correctly. We created a task on the site that asked workers to identify the objects in each image. We then used the results of those tasks to further train our machine learning model to label images automatically. The model's accuracy improved the more testing we did.
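
Counting the two error types from a batch of corrected labels can be sketched as below. The labels are made up, and treating "helmet" as the positive class is an assumption; with the opposite choice the two error types simply swap.

```python
def count_errors(predicted, actual, positive="helmet"):
    """Return (false positives, false negatives), i.e. Type 1 and Type 2 errors."""
    pairs = list(zip(predicted, actual))
    # Type 1: the model said "helmet" but the corrected label says otherwise.
    false_positives = sum(p == positive and a != positive for p, a in pairs)
    # Type 2: the model missed a helmet that was actually there.
    false_negatives = sum(p != positive and a == positive for p, a in pairs)
    return false_positives, false_negatives


# Hypothetical field-test results after manual labelling.
actual = ["helmet", "helmet", "no_helmet", "no_helmet", "helmet"]
predicted = ["helmet", "no_helmet", "helmet", "no_helmet", "helmet"]

type_1, type_2 = count_errors(predicted, actual)
print(type_1, type_2)  # 1 1
```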

Of concern were the many camera artifacts we discovered that would ruin an image and create errors. We found that the artifacts were caused by lens flare, overexposure, saturation, underexposure and noise. These artifacts exist in all camera images to some degree, but they were worse in our case because the camera was only a low-cost cellphone camera, pointed upwards towards the rider against a background that included the bright sky and overhead lighting. They are difficult to remove from the image.

Our workaround was to eliminate these images from the “data set”. In other words we would not count the result when we detected these artifacts. To do this we used multiple techniques:

  1. The App took multiple images in a row. Only if they all agreed on the result would we post it.
  2. The App assessed each image for quality. One metric for that is how much high-frequency data is in the image. An image with little high-frequency content is often blurred or otherwise low quality.
  3. We checked the image for saturation, especially in the lower or center part of the image where the face and helmet would appear.
  4. We developed a face recognition model to process the image and ensure it could see a face. If it could recognise a face then we knew the image quality was good enough to accurately determine a helmet (or not).
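
The first three checks can be sketched in plain Python on a grayscale image stored as rows of 0 to 255 values. This is an illustration under assumptions, not the App's code: the variance of a Laplacian filter is one common stand-in for "how much high-frequency data" an image has, and every threshold and function name here is hypothetical.

```python
from statistics import pvariance


def sharpness(gray):
    """Variance of a 4-neighbour Laplacian; low values suggest little fine detail."""
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, len(gray) - 1)
        for x in range(1, len(gray[0]) - 1)
    ]
    return pvariance(responses)


def saturated_fraction(gray, top_fraction_to_skip=0.33, white_level=250):
    """Share of near-white pixels below the top part of the image."""
    first_row = int(len(gray) * top_fraction_to_skip)
    region = [value for row in gray[first_row:] for value in row]
    return sum(value >= white_level for value in region) / len(region)


def image_is_usable(gray, min_sharpness=50.0, max_saturated=0.2):
    """Accept an image only if it is detailed enough and not washed out."""
    return sharpness(gray) >= min_sharpness and saturated_fraction(gray) <= max_saturated


def consensus(results):
    """Return the shared result if every frame agrees, otherwise None."""
    return results[0] if results and len(set(results)) == 1 else None


# Tiny synthetic images: a high-contrast checkerboard and a washed-out frame.
checker = [[200 if (x + y) % 2 else 0 for x in range(6)] for y in range(6)]
washed_out = [[255] * 6 for _ in range(6)]

print(image_is_usable(checker), image_is_usable(washed_out))
print(consensus(["helmet", "helmet", "helmet"]))
print(consensus(["helmet", "no_helmet", "helmet"]))
```

A result would only be posted when every frame in the burst passes `image_is_usable` and `consensus` returns a label rather than `None`.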

In the field, we expect the helmet detection to improve over time as we collect more training data. This will help overcome the problem of “overfitting”.

Overall the project was a great success and the App worked as expected.

If you have a similar problem that you need solved, contact us and we can discuss how we can help you.

Pete Cooper, CEO
