Machine Learning and Neural Networks

 

Vehicle tracking / Traffic monitoring yolov5+deepsort




For everyone coming here from YouTube, or anyone who just wants to use my repo for inference: this is a very old project I built to learn computer vision while I was still in school. It's been a long time, so I've forgotten most of what I did in this toy project. I adapted it from another repo and added a few lines of code for some constraints. You can use that repo as a reference. Thanks.

The detections generated by YOLOv5 are passed to the Deep SORT algorithm, which tracks the objects.

Code: https://github.com/duongcongnha/Vehicle-tracking
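As a rough illustration of how detections get handed from the detector to the tracker (not the linked repo's actual code), the sketch below loads a pretrained YOLOv5 model through torch.hub and feeds each frame's boxes to Deep SORT. The DeepSort class and its update_tracks call follow the deep-sort-realtime package's interface, which is an assumption here; the video file name is a placeholder.

```python
# Hedged sketch: YOLOv5 detections handed to a Deep SORT tracker.
# Assumes the deep-sort-realtime package; the linked repo wires things up differently.
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort

model = torch.hub.load("ultralytics/yolov5", "yolov5s")  # pretrained COCO detector
tracker = DeepSort(max_age=30)

cap = cv2.VideoCapture("traffic.mp4")  # placeholder video path
while True:
    ok, frame = cap.read()
    if not ok:
        break

    # YOLOv5 inference on an RGB image; results.xyxy[0] rows are [x1, y1, x2, y2, conf, cls]
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    detections = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        # Deep SORT expects ([left, top, width, height], confidence, class)
        detections.append(([x1, y1, x2 - x1, y2 - y1], conf, int(cls)))

    # Association step: each confirmed track carries a stable ID across frames
    for track in tracker.update_tracks(detections, frame=frame):
        if not track.is_confirmed():
            continue
        x1, y1, x2, y2 = map(int, track.to_ltrb())
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, str(track.track_id), (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

    cv2.imshow("tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```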

Video: https://youtu.be/wqctLW0Hb_0

Demo of vehicle tracking and speed estimation at the 2nd AI City Challenge Workshop in CVPR 2018




Our team from the University of Washington is the winner of Track 1 (Traffic Flow Analysis) at the 2nd AI City Challenge Workshop in CVPR 2018.

The datasets for the 2nd AI City Challenge (2018) are no longer available to the public. However, you may still access the datasets of the 3rd AI City Challenge (2019), or participate in the latest 4th AI City Challenge in CVPR 2020.

They provided a new city-scale dataset for multi-camera vehicle tracking as well as image-based re-identification, along with a new dataset for traffic anomaly detection. The scale of the datasets and the number of vehicles used for evaluation are both unprecedented. The AI City Challenge website is at https://www.aicitychallenge.org/.

Code: https://github.com/zhengthomastang/2018AICity_TeamUW

In the demo video, the estimated speed of each vehicle is shown in miles per hour.

Object Tracking with Opencv and Python

Code : https://pysource.com/2021/01/28/object-tracking-with-opencv-and-python/




The video tutorial on Object Tracking with OpenCV and Python explains the difference between object detection and object tracking and shows how to create an object tracker file for efficient tracking. It outlines the files required (a video and an object tracker file) and shows how to code the main.py file for real-time video display. The tutorial also covers removing small elements from the mask and defining a region of interest so the detector focuses only on the vehicles, and it concludes by assigning unique IDs to objects so they can be counted correctly. The method is presented as simple rather than perfect, but it serves as a good introduction to object tracking, and a more in-depth course on OpenCV and deep learning is recommended. A minimal code sketch of the pipeline follows the timestamped notes below.

  • 00:00:00 In this section, the speaker introduces the difference between object detection and object tracking. The former is detecting objects frame after frame while the latter is keeping track of objects throughout frames to trace their history. Object tracking is crucial in various applications such as counting people, tracking object trajectories, and counting cars passing on highways. The speaker then introduces the necessary files to start the project, including a video and an object tracker file. Next, the speaker starts to code the main.py file, beginning with importing the OpenCV library and creating a capture object to read frames from the video. Finally, the speaker shows how to display the video in real-time and create a key event to close the video.

  • 00:05:00 In this section, the speaker discusses object tracking using OpenCV and Python. They explain that for this specific method, a stable camera is required, meaning that the background does not change over time while only the objects in motion change. By using an object detector, the moving objects can be extracted from the stable camera. The detector creates a mask and the goal of the mask is to make everything that is not needed black, while the vehicles that need to be detected are white. The coordinates of the white elements are then extracted and the ones that are too small are removed. Then, using OpenCV's find contours function, the boundaries of the white objects are found and drawn on the frame in green.

  • 00:10:00 In this section, the video tutorial explains how to remove small elements from the mask by calculating the area of the contour and removing any contour with an area less than 100 pixels. Then, the tutorial discusses how to define a Region of Interest (ROI) to extract a portion of the image to focus only on the vehicles on that specific part of the road, making the detection process much easier. The tutorial outlines how to define the ROI by specifying the position, height, and width of the section that contains the vehicles we want to track. The tutorial concludes by explaining how to apply the mask to the ROI to perform object tracking only on that specific area.

  • 00:15:00 In this section of the video, the speaker applies the object detection algorithm to the ROI instead of the entire frame and shows how the detection has improved. However, there are still issues with false positives and the detection algorithm. To improve object detection, the speaker adjusts the size of the history and the var threshold, which affects the amount of false positives. The speaker then extracts the rectangle surrounding each object detected and draws it on the ROI in green. Lastly, the speaker removes the shadows from the image by only keeping the white elements in the image using a threshold operator, resulting in more accurate object detection.

  • 00:20:00 In this section, the video explains the process of object tracking using OpenCV and Python. The "tracker.py" file is imported and a "tracker" object is created for efficient tracking. The tracker collects the bounding boxes of detected objects, which are saved in an array named "detections"; each bounding box has its own x, y position, width, and height. Only contours with an area of at least 100 pixels are kept, so irrelevant objects are discarded. The detections array is passed to the tracker object frame by frame, and the tracker returns the history and a unique ID for each tracked object. Object detection is the prerequisite for tracking: an object can only be tracked once it has been detected in a frame.

  • 00:25:00 In this section, the video discusses the addition of unique IDs to the object tracking feature in OpenCV and Python. Each object now has a different ID, preventing mistakes from being made when counting objects. By using the IDs, objects can be counted correctly, and a line can be drawn to detect when objects cross it. The video ends by explaining that the method shown is simple, not perfect, but a good introduction to object tracking. For more in-depth knowledge on object detection and tracking, the creator recommends viewing their video course on OpenCV and deep learning.
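The steps above boil down to a short OpenCV loop. Below is a minimal sketch under the same assumptions the video makes (a stable camera, a MOG2 background subtractor, a 100-pixel area filter, a hand-picked region of interest); the ROI coordinates and the tracker class are placeholders rather than the tutorial's exact values and code.

```python
# Minimal sketch of the tutorial's pipeline (stable camera assumed).
# ROI coordinates and the tracker class are placeholders, not the video's exact code.
import cv2

cap = cv2.VideoCapture("highway.mp4")
# history and varThreshold control how aggressively the background model adapts
object_detector = cv2.createBackgroundSubtractorMOG2(history=100, varThreshold=40)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Region of interest: only look at the patch of road we care about (placeholder coords)
    roi = frame[340:720, 500:800]

    # 1. Background subtraction -> grayscale mask (shadows come out gray, not white)
    mask = object_detector.apply(roi)
    # 2. Threshold away the gray shadows, keeping only solid white blobs
    _, mask = cv2.threshold(mask, 254, 255, cv2.THRESH_BINARY)

    # 3. Contours of the white blobs; drop anything smaller than ~100 px
    contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    detections = []
    for cnt in contours:
        if cv2.contourArea(cnt) > 100:
            x, y, w, h = cv2.boundingRect(cnt)
            detections.append((x, y, w, h))
            cv2.rectangle(roi, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # 4. A tracker object (like the video's tracker.py) would take `detections` here
    #    and return (x, y, w, h, id) tuples with a stable ID per vehicle.

    cv2.imshow("roi", roi)
    cv2.imshow("mask", mask)
    if cv2.waitKey(30) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```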

Computer vision roadmap | How to become a computer vision engineer




This video provides a roadmap for becoming a computer vision engineer, dividing the learning process into four modules: Python and OpenCV, robotics and low-level programming, artificial intelligence, and advanced topics. The importance of hands-on projects involving physical devices such as robotics and Arduinos is emphasized, along with the need to have a basic understanding of mathematics. The speaker also discusses competitions and projects that can help aspiring computer vision engineers hone their skills. In addition, the importance of soft skills such as documentation and knowledge of software tools like Git and Docker is highlighted. Finally, it is recommended to obtain degrees in mathematics, data science, computer science, or engineering, although learning independently through projects and courses is also possible.

  • 00:00:00 In this section of the video, the speaker presents a computer vision roadmap that covers everything a person needs to learn to become a computer vision developer. The roadmap answers three main questions: what skills are required, what projects you can work on to acquire these skills, and what college degrees are available for learning them. The roadmap is divided into four modules, with the first module covering Python and OpenCV, which are essential for solving 70% of computer vision problems. The second module is about robotics and low-level programming, which includes working with C++ and electronics. The third and fourth modules cover artificial intelligence and advanced topics, respectively. The speaker suggests beginner projects for each module, such as working on a line-follower robot or a QR code reader to build knowledge and skills.

  • 00:05:00 In this section of the video, the speaker discusses the importance of working on projects that involve physical devices such as robotics and Arduinos. The speaker explains that programming for such devices requires a different approach because the same code can produce different results due to factors such as sensor noise and lighting changes. The third module the speaker mentions focuses on artificial intelligence and machine learning using the popular Python library Scikit-learn, which can be used for tasks such as image classification and object detection. Finally, the speaker also emphasizes the importance of having a basic understanding of mathematics and how it relates to computer vision.

  • 00:10:00 In this section, the speaker explains the importance of dividing the environment into different regions and how that can be used to detect situations like lane crossings, demonstrating with a simple image whose regions are separated by basic geometry. He emphasizes the need for computer vision engineers to be fluent in mathematics and recommends learning the basics of image classification and object detection, scikit-learn, and geometry before moving on to more advanced skills in machine learning, mathematics, and electronics. He suggests projects such as generative adversarial networks and implementing state-of-the-art papers in computer vision to learn advanced machine learning, while taking up advanced Arduino projects or participating in robotics competitions to improve electronics knowledge.

  • 00:15:00 In this section, the speaker discusses various competitions and projects that can help aspiring computer vision engineers hone their skills. These competitions involve building robots and agents that can perform complex tasks and compete with other teams. The speaker emphasizes the importance of learning the fundamentals of computer vision, including Python and OpenCV, which can help solve 70% of the available problems. The four modules mentioned are fundamentals, robotics, artificial intelligence, and advanced topics. While the advanced module is important, the speaker suggests focusing on the other modules first, as they cover the essential skills needed to become a successful computer vision engineer.

  • 00:20:00 In this section, the speaker discusses the various software related skills that a computer vision developer should have, besides having knowledge of computer vision, machine learning and Python. These skills include knowing the basics of tools like Git and Docker, as well as cloud providers like AWS or Google Cloud. Additional skills that are considered valuable include web or mobile development, knowledge of JavaScript, and knowing how to write proper documentation such as a readme file for a GitHub repository or documenting an entire process. While not all of these skills are a must, they can be beneficial in showing off a developer's work and solving various problems.

  • 00:25:00 In this section, the speaker emphasizes the importance of documentation for computer vision developers as it makes things simpler for everyone. They also suggest that every developer, not just computer vision developers, should master documentation as it is a crucial skill. The speaker then highlights the significance of being familiar with various software tools and encourages aspiring computer vision engineers to learn as much as they can. For those interested in taking a college degree, the speaker recommends degrees in mathematics, data science, computer science, electrical engineering, or any engineering degree, as they provide the foundation to approach problems and come up with solutions. However, the speaker notes that it is entirely possible to learn everything independently by taking projects and courses.

IMAGE CLASSIFICATION with Yolov8 custom dataset | Computer vision tutorial

Code: https://github.com/computervisioneng/image-classification-yolov8




The presenter explains how to choose the best model for the project, weighing the last model produced by the training process against the model with the highest accuracy; the decision depends on factors such as the data, the problem, the use case, and the training process. He also shows how to run inference on new samples by importing YOLO, specifying the absolute path to the model, and predicting on an image from the local computer. The results provide a lot of information, including the class probabilities and the names of the categories the image classifier was trained on. A short code sketch of the train-and-predict flow follows the timestamped notes below.

  • 00:00:00 In this section, the YouTuber demonstrates how to build an image classifier using YOLOv8 on a custom dataset. He starts by showing the dataset, which includes images of clouds, rain, shine, and sunrises. He also explains the file structure that YOLOv8 expects for data input. Felipe then proceeds to explain three different ways to train an image classifier using YOLOv8: from a Python script, from the command-line utility, and from a Google Colab notebook.

  • 00:05:00 In this section, the tutorial goes over the dependencies needed for training an image classification model using YOLOv8 and how to install them using the Python interpreter in PyCharm. The two main packages needed are Ultralytics and NumPy, with Ultralytics being the crucial one for training the model. The tutorial then goes over the code needed to train the model, which requires setting the absolute path to the data in the correct format. A test run is also done for only one epoch to ensure everything is working properly before continuing with further training.

  • 00:10:00 In this section, the speaker shows where the results from the training process are located and explains the folder structure within the runs folder. He then demonstrates how to train an image classifier using YOLOv8 through the command line using a specific utility and the necessary arguments. He proceeds to show the simplicity of this approach by training the model for just one epoch, and then shows where the results are saved in a new directory labeled "trained 13". Finally, the speaker discusses another way to train the model using Google Colab and provides a brief explanation of the process for setting it up.

  • 00:15:00 In this section, the presenter shows a notebook that has already been created for training YOLOv8 models, called "train.ipynb". Uploading datasets to Google Drive is an essential step, since users access their data from Google Colab by executing a cell that connects Colab with Google Drive. Knowing the path to the dataset in Google Drive is critical, since users need to enter this path into their notebooks to use their datasets properly. Lastly, the presenter provides simple commands that users can execute to install the libraries needed to train YOLOv8 models.

  • 00:20:00 In this section of the video, the instructor demonstrates how to train an image classifier using YOLOv8 and a custom dataset, and how to copy the results obtained in a Google Colab environment to Google Drive. They run the training process for one epoch, copy the results to the Google Drive, and then run the training process for 20 epochs. Finally, they show how to analyze whether the training process resulted in a good image classifier. The instructor emphasizes the importance of setting the data directory variable properly and suggests copying the results into Google Drive to simplify downloading and analyzing them.

  • 00:25:00 In this section, the results of the training process for a custom dataset using YOLOv8 are analyzed. The files saved on the local computer are examined, including a configuration file listing all the hyperparameters used during the training process and a results file that includes training and validation loss, as well as accuracy. A Python file that plots loss and accuracy across epochs is used to determine whether the loss is going down and the accuracy is going up throughout the training process. It is concluded that a simple but powerful approach is to check that the training loss and the validation loss are going down while the accuracy is going up.

  • 00:30:00 In this section, the speaker explains how evaluation accuracy looks when we increase the number of epochs, showing that the accuracy increases and then plateaus. Different parameters and a more customized training could be used to improve the process to obtain a better model. The "weights" directory is also explained, where two different files are saved: "last.pt" and "best.pt". The "last.pt" file is the model produced at the end of the last epoch of the training, while the "best.pt" file is the model produced at the end of the best training epoch. In practice, the last model would be used to obtain the final results.

  • 00:35:00 In this section, the speaker discusses how to choose the best model for the project and makes suggestions between choosing the last model produced from the training process or selecting the model with the highest accuracy. The decision depends on various factors, including data, problem, use case, and training process. The speaker also explains how to make all inferences and predict new samples by importing YOLO and specifying the absolute path to the model, using an image from the local computer and showing the results. The results provide a lot of information, including the probabilities and the names of the categories the image classifier was trained on.

  • 00:40:00 In this section of the video, the presenter explains how to use a dictionary to identify the classifications of an image using YOLOv8. The results of the image classifier are integers corresponding to different categories, and a dictionary is used to identify which category each integer represents. Additionally, the probability vector of all the different classes for a given image is obtained, and the presenter demonstrates how to identify the maximum probability and its corresponding category using numpy. The presenter concludes by showing how to classify an image using the YOLOv8 model and suggests subscribing to their channel for more computer vision engineering projects.
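For reference, training and prediction with the Ultralytics Python API look roughly like the sketch below. The dataset path, epoch count, image size, and test image are placeholders, and the way the top class is read out of results[0].probs is an assumption about the current Ultralytics API rather than code taken from the video.

```python
# Hedged sketch of YOLOv8 image classification with the Ultralytics API.
# Paths, epochs, image size and the probs handling are assumptions, not the video's exact code.
from ultralytics import YOLO

# Expected folder layout (one sub-folder per class):
#   weather_dataset/
#     train/cloudy/...  train/rain/...  train/shine/...  train/sunrise/...
#     val/cloudy/...    (and so on)
model = YOLO("yolov8n-cls.pt")                                   # pretrained classification model
model.train(data="/absolute/path/to/weather_dataset", epochs=20, imgsz=224)

# Inference with the trained weights (last.pt vs best.pt: see the discussion above)
model = YOLO("runs/classify/train/weights/last.pt")
results = model("/absolute/path/to/some_image.jpg")

names = results[0].names            # e.g. {0: 'cloudy', 1: 'rain', ...}
probs = results[0].probs            # per-class probabilities
print(names[int(probs.top1)], float(probs.top1conf))
```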
 

Train Yolov8 object detection on a custom dataset | Step by step guide | Computer vision tutorial

Code: https://github.com/computervisioneng/train-yolov8-custom-dataset-step-by-step-guide




This video tutorial provides a step-by-step guide to training an object detector using YOLOv8 on a custom dataset for computer vision applications. The tutorial covers important aspects such as data collection, annotation, formatting, and training, as well as the value of analyzing loss-function plots and real-life examples to evaluate the model's performance. It emphasizes using the CVAT tool for annotation, adhering to the required YOLOv8 data format, and creating a configuration YAML file, and it demonstrates using Google Colab and PyCharm for training and testing the object detector. A hedged sketch of the annotation format, the config YAML, and the training call follows the timestamped notes below.

The video demonstrates the performance of a YOLOv8 object detection model trained on an alpaca dataset. The speaker uses a Python script to load videos and execute predictions, showing almost perfect detection in some cases and stable detection in others, with some missed detections in one video. The speaker recommends analyzing the training and validation loss functions and deciding if there is room for improvement before training the model again to obtain a more powerful and better-performing model.

  • 00:00:00 In this section of the video, the presenter explains the importance of collecting data when training an object detector using YOLOv8, as well as the different ways one could collect data, such as taking their own photographs or using a publicly available source like the Open Images Dataset. The presenter uses an alpaca dataset for demonstration purposes, and viewers are shown examples of the images contained in the dataset. Additionally, the presenter mentions that they can provide viewers with scripts to download and format the data from the Open Images Dataset in the exact way needed for the tutorial.

  • 00:05:00 In this section of the video, the narrator discusses the process of data annotation using the CVAT tool, which is a labeling tool that can be used for object detection and computer vision-related annotation tasks. The narrator demonstrates how to use the CVAT tool to annotate a set of images for a custom dataset, explaining how to create a new project and task, upload images, and add labels for the objects to be detected. Although the narrator already has the annotations for the images used, they provide a step-by-step guide for those who need to annotate their data. The video also highlights the advantages and disadvantages of different annotation tools while emphasizing the benefits of using CVAT.

  • 00:10:00 In this section, the video creator shows how to label objects on an image dataset for object detection using the YOLOv8 algorithm. The creator uses images of alpacas and demonstrates how to enclose the object in a bounding box, allowing the algorithm to detect and identify the objects. They also explain the importance of annotating your own data to become familiar with the different situations that may arise during annotation and to gain clarity on the objects being detected. The creator notes that while data annotation can be done using external services, it is critical to annotate some of the images yourself for a better understanding of the data.

  • 00:15:00 In this section, the importance of criteria for annotating images is discussed, with the example of an alpaca dataset. The different ways of defining bounding boxes for objects are also shown, including fitting the box tightly to the object or having a more relaxed bounding box. The annotation tool CVAT is highlighted as a useful and easy-to-use web application for formatting and annotating data, emphasizing the significance of data annotation in training object detection algorithms.

  • 00:20:00 In this section of the video, the presenter explains the specific data format required for YOLOv8 in order to train an object detector. Every algorithm in machine learning has its own data input requirements, and YOLO needs the data in a very specific format. The presenter demonstrates how to export annotated data from CVAT and convert it into YOLO format. The YOLO format requires the annotations and corresponding images to have the same name with different file extensions, with each bounding box annotation containing a class number and four float numbers that define the center position, width, and height of the bounding box. It is emphasized that having data in the correct format is crucial for training an object detector.

  • 00:25:00 In this section, the speaker explains how to structure the data for training object detection using YOLOv8 on a custom dataset. The first step is to create two directories: one for images and one for labels, with specific directory names that cannot be changed. Within the images directory, all images that will be used for training should be located, and a train directory can be used for convenience. Within the labels directory, there should be annotation files for each image, following a standard format where each row specifies the class ID, X/Y position of the center of the bounding box, and the width and height of the bounding box. The speaker also explains how to use additional directories for validation or test data, or for different versions of the data.

  • 00:30:00 In this section, the video tutorial discusses the process of training your object detector with YOLO V8 using either Python or the command line utility. The YOLO V8 official repository offers the option to train with a few Python instructions or by running a command in the terminal. The tutorial then goes through a step-by-step guide on how to train an object detector using YOLO V8 in a PyCharm project, including importing the Ultralytics library and creating a configuration YAML file. The tutorial also highlights that users can choose between working in a local environment or Google Colab.

  • 00:35:00 In this section of the video, the speaker discusses the configuration file for training an object detector using YOLO V8 on a custom data set. They explain how to set the names of the different classes being detected, the absolute path to the directory containing the images and annotations, and the relative path to where the images are located. The speaker also mentions the two keywords, "train" and "val" which refer to the training and validation data respectively. They emphasize the importance of using an absolute path instead of a relative path to avoid any issues. The speaker then demonstrates how to execute the file and begins the training process for one epoch, emphasizing the need to wait until the process is completed. Finally, they show the output files generated after the training process is completed.

  • 00:40:00 In this section of the video tutorial, the speaker shows the batches of images used to train the YOLOv8 object detection algorithm, as well as similar images for validation. The speaker explains that they are conducting a shallow, dummy training run, and they show the files used for this purpose. The speaker then shows how to train the algorithm from the command line using the terminal, and also from a Google Colab notebook. The notebook consists of only five cells, and the speaker demonstrates how simple it is to upload data, mount Google Drive, and train the algorithm by just specifying the relative path to the location of the data.

  • 00:45:00 In this section, the video tutorial shows how to use Google Colab to train a YOLOv8 object detector on a custom dataset. The process involves saving a specific variable and executing the "pip install ultralytics" command in the Google Colab environment through the exclamation point command. The tutorial shows how to input the same lines of code used in the local environment and how to specify an absolute path to the Google Drive directory. Once the files are uploaded and the commands are executed, the training process is similar to that in the local environment. The tutorial instructs users to wait for the execution to complete and remember to execute the command that copies the results directory to the chosen directory in Google Colab.

  • 00:50:00 In this section, the video tutorial focuses on testing the performance of the Yolov8 object detector that was trained on a custom dataset. The first step in testing the model involves going to the directory where all the information regarding the training process was saved. The video then showcases different outputs and plots that can be used to analyze the performance of the model, including confusion matrices, F1 confidence curves, and loss function plots. The video recommends focusing on the loss plots, ensuring that all the losses are going down, as this is a simple yet powerful way to analyze the model's performance. Different situations in the loss function plots are also discussed, such as a flatlining loss function, which could indicate that the model has learned everything it needed to learn.

  • 00:55:00 In this section of the video, the presenter discusses analyzing the loss function to gain insights into the training process of an object detection model. If the loss function goes down, everything is going well, but if it goes up, there is a problem with the training. Looking at the plots and analyzing this information is more about research, but the presenter proposes a more visual way to evaluate the model's performance, which is to look at real examples and see how the model performs on real data. The presenter shows examples of the model detecting real-life alpacas in videos, which are completely unseen data to test the performance of the model. While the results are not perfect, the performance is good, and the presenter argues that this visual evaluation is more relevant to a freelancer working on a client's project since the client may not care about the plots and metrics, but rather want to see how the model performs on real data.
  • 01:00:00 In this section, the speaker demonstrates the performance of the YOLOv8 object detection model trained on an alpaca dataset on unseen data through a Python script that loads videos and executes predictions. The detections seem to work well, with an almost perfect detection in some cases, and a stable detection in others, except for some missed detections in one video. The speaker advises analyzing the training and validation loss functions, checking their downward trend, and the existence of room for improvement to decide whether to train the model again, which is what the speaker does to get a more powerful and better-performing model.
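To make the moving parts concrete, here is a hedged sketch of the three pieces the video walks through: a YOLO-format annotation line, a configuration YAML, and the training call. The paths, class name, and epoch count are placeholders, not the values used in the tutorial.

```python
# Hedged sketch of the YOLOv8 detection training setup described above.
# All paths, the class name and the epoch count are placeholders.
from ultralytics import YOLO

# Each image gets a .txt label file with one row per bounding box:
#   <class_id> <x_center> <y_center> <width> <height>   (all normalized to 0-1)
# e.g. data/labels/train/alpaca_001.txt could contain:
#   0 0.512 0.438 0.291 0.354

# config.yaml referenced below:
#   path: /absolute/path/to/data
#   train: images/train
#   val: images/val
#   names:
#     0: alpaca

model = YOLO("yolov8n.pt")          # pretrained detection checkpoint
model.train(data="config.yaml", epochs=100, imgsz=640)

# Command-line equivalent:
#   yolo detect train data=config.yaml model=yolov8n.pt epochs=100 imgsz=640
```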

Yolov8 object detection + deep sort object tracking | Computer vision tutorial

Code: https://github.com/computervisioneng/object-tracking-yolov8-deep-sort




This video tutorial demonstrates how to implement an object detection and tracking system using YOLOv8 and Deep SORT. The presenter goes through the process step by step, from setting up the environment and loading and detecting objects in a video using YOLOv8, to tracking the detected objects over time using Deep SORT. They also explain the importance of creating the correct file structure and installing the necessary dependencies. Once the detections are extracted from YOLOv8, Deep SORT tracking is applied to keep track of the detected objects over time. Finally, the presenter explains how to save the final output frames into a video file and encourages viewers to modify the detection threshold for better accuracy. A compressed code sketch of this loop follows the timestamped notes below.

  • 00:00:00 In this section of the video, the host introduces the project of working with object detection using YOLOv8 and object tracking using deep sort. The goal is to create a tracking algorithm that can keep track of people moving in a video and use bounding boxes of different colors to indicate objects being tracked. The host shows the YOLOv8 and deep sort repositories that will be used, as well as the video data that will be tested. Finally, the host introduces a file called "tracker.py," which is a wrapper around deep sort that simplifies the object tracking process by using high-level functions.

  • 00:05:00 In this section of the video, the presenter explains how to set up the environment for the tutorial by cloning the necessary repository, downloading the required files, and importing the necessary libraries. They explain the importance of creating the correct file structure and installing the correct dependencies to ensure compatibility with the code. The presenter then demonstrates how to load a video and read frames from it using OpenCV, and how to show the frames in a window.

  • 00:10:00 In this section, the presenter demonstrates how to use the YOLOv8 pre-trained model to produce detections on a video. Using Ultralytics, the pre-trained model is imported and the function is called on a frame of the video. The output detection results are printed, but since the results are in a custom object type, they need to be unwrapped and transformed into a list using a for loop. The final output detections are saved into a new object. Once the code is executed successfully, the results can be used as input data for object tracking algorithms.

  • 00:15:00 In this section of the video tutorial, the speaker demonstrates how to unwrap all of the important detection information from the YOLOv8 object detector and track the detected objects using Deep Sort object tracking. They show how to extract information such as the class ID, confidence value, and coordinates for each detection in order to pass it on to the tracking algorithm. The speaker also shows how to initialize and call the Deep Sort tracker with the detections and frames in order to keep track of the detected objects over time.

  • 00:20:00 In this section, the presenter explains the process of deep sort object tracking, which involves computing features on top of the frame and the detected objects. The algorithm crops the frame and extracts features, and to iterate through all the tracks, the presenter uses a loop to access the bounding box and the track ID. Each person or object detected in the video has an ID assigned that remains stable across all the frames. For every new ID detected, the presenter plots it with a different color using a list of random colors in the visualization.

  • 00:25:00 In this section of the video, the speaker addresses saving the video, with all its frames and detections, back to the local computer. A new cv2.VideoWriter object is created to save the video with a specified name and location, using an MP4 codec and the same frame rate as the original video. The speaker emphasizes that everything works properly, even though real-time performance is not achieved because many objects are being detected and the code is running on a CPU. Overall, the problem of object detection with YOLOv8 and object tracking with Deep SORT has been successfully solved.

  • 00:30:00 In this section, the speaker concludes the tutorial by reviewing the final steps needed to implement a successful object detection and tracking system using YOLOv8 and deep sort. They demonstrate how to save the output frames into an out.mp4 file and show its successful completion. Additionally, the speaker mentions the possible misdetections and encourages viewers to modify the detection threshold in order to improve the accuracy of the system for detecting objects such as birds. Overall, this tutorial serves as a helpful guide for anyone looking to implement an object detection and tracking system using computer vision techniques.
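A compressed sketch of that loop is below. The Tracker class stands in for the tutorial's tracker.py wrapper around Deep SORT, and its update/tracks interface is a guess at what such a wrapper exposes rather than the repository's confirmed API; the YOLOv8 calls and the cv2.VideoWriter usage follow the standard Ultralytics and OpenCV interfaces, and the file names and threshold are placeholders.

```python
# Hedged sketch of YOLOv8 detection + Deep SORT tracking with video output.
# `Tracker` stands in for the repo's tracker.py wrapper; its interface is assumed.
import random
import cv2
from ultralytics import YOLO
from tracker import Tracker  # hypothetical wrapper around Deep SORT

cap = cv2.VideoCapture("people.mp4")           # placeholder input video
ret, frame = cap.read()

# Writer with the same size and frame rate as the input video
out = cv2.VideoWriter("out.mp4", cv2.VideoWriter_fourcc(*"mp4v"),
                      cap.get(cv2.CAP_PROP_FPS),
                      (frame.shape[1], frame.shape[0]))

model = YOLO("yolov8n.pt")
tracker = Tracker()
colors = [(random.randint(0, 255), random.randint(0, 255), random.randint(0, 255))
          for _ in range(100)]
DETECTION_THRESHOLD = 0.5

while ret:
    # Unwrap the YOLOv8 results into plain [x1, y1, x2, y2, score, class_id] rows
    detections = []
    for box in model(frame)[0].boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        score, class_id = float(box.conf[0]), int(box.cls[0])
        if score > DETECTION_THRESHOLD:
            detections.append([x1, y1, x2, y2, score, class_id])

    # Deep SORT association: each returned track keeps a stable ID across frames
    tracker.update(frame, detections)
    for track in tracker.tracks:
        x1, y1, x2, y2 = map(int, track.bbox)
        cv2.rectangle(frame, (x1, y1), (x2, y2),
                      colors[track.track_id % len(colors)], 2)

    out.write(frame)
    ret, frame = cap.read()

cap.release()
out.release()
```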
YOLOv8: https://github.com/ultralytics/ultralytics

Image segmentation with Yolov8 custom dataset | Computer vision tutorial

Code: https://github.com/computervisioneng/image-segmentation-yolov8

Data: https://drive.google.com/drive/folders/1JvA2IvHBy2QOnAtPtoy4JbEEmtbz5dnK




This tutorial covers annotating images and creating binary masks for semantic segmentation, formatting and structuring the dataset for training with YOLOv8, training the model in Google Colab, analyzing the training results, and making predictions using a Python script. The presenter emphasizes the importance of structuring the file system correctly and of analyzing the loss function and validation metrics in order to train a good model on a YOLOv8 custom dataset. They also discuss the impact of flawed data on prediction quality and recommend using the predictions of the current model to help train a new and improved one. Overall, the tutorial provides a comprehensive guide to image segmentation with a YOLOv8 custom dataset. A short code sketch of the training and prediction steps follows the timestamped notes below.

  • 00:00:00 In this section of the video, the instructor shows the dataset that will be used for semantic segmentation: a set of duck images with corresponding binary masks. The binary mask has all-white pixels for an object in the image and all-black pixels for everything else. The instructor downloads this dataset from the Open Images Dataset (version 7) and shows the website cvat.ai in case one needs to annotate their own image data. Then the instructor shows how to use this tool to create a binary mask for an image from a single annotation.

  • 00:05:00 In this section of the tutorial, the presenter demonstrates how to annotate an image and create a binary mask for semantic segmentation. Using a duck image as an example, they guide the viewer through the process of drawing a polygon to create the mask around the duck's body, including a discussion on whether to annotate certain parts beneath the water as part of the duck or not. The presenter emphasizes that the mask does not have to be pixel-perfect, but it should be accurate enough to create a good mask. Finally, they show how to save and download the annotated data for use in creating a dataset.

  • 00:10:00 In this section, the speaker explains how to format and structure the dataset in order to train a semantic segmentation model with YoloV8. The first step is to download the required data using the Export Task Data Set option and selecting the segmentation mask. Once the data is downloaded, it needs to be organized into a directory structure containing the images, masks, and labels. The masks generated from the images need to be converted into a different file format through a Python script provided by the speaker. This file format is required by YoloV8 and contains the same information as the binary masks. The speaker also highlights the importance of structuring the file system in a particular manner so that YoloV8 can find and use the data.

  • 00:15:00 In this section, the presenter explains the directory structure needed for image segmentation with a YOLOv8 custom dataset. Two directories are required, "images" and "labels", each with two subdirectories called "train" and "val": the "train" subdirectories hold the data used to train the semantic segmentation model, the "val" subdirectories hold the data used to validate it, and the "labels" side contains the annotation files generated from the training masks. The presenter sets up the structure in PyCharm and proceeds to explain how to train the model using a pre-trained YOLOv8 model from Ultralytics, including the code implementation and configuration file editing.

  • 00:20:00 In this section, the speaker explains that the semantic segmentation model is trained in Google Colab because of the access to a free GPU, which makes the process much faster. The speaker recommends uploading the data, including directories and images, and the config.yaml file before beginning the process. Further, the speaker shows and explains how to execute every single cell in the Jupyter notebook and shares the location of their data on Google Drive, encouraging the user to do the same when uploading their data. Lastly, the speaker mentions installing Ultralytics so that YOLOv8 can be used in the notebook and emphasizes making sure Google Colab is running with a GPU by checking the runtime type.

  • 00:25:00 In this section of the tutorial, the speaker trains the model for 10 epochs and saves the results under the segment train 2 folder. The next step is to analyze the training results to validate that everything is working fine. The easiest way to do this is by copying all the content in the directory where the results are saved to Google Drive. Once the results are downloaded onto a local computer, the speaker recommends focusing on analyzing the loss function, segmentation loss, training loss, and validation loss plots to ensure the loss function is going down. This is a good sign that things are working well, while an increasing loss function indicates a serious problem.

  • 00:30:00 In this section, the speaker talks about analyzing the loss function and validation metrics to train a good model for Yolov8 custom dataset. It is a good sign if the training and validation loss is going down, indicating a trend of improving prediction quality. However, the speaker cautions that missing annotations and other flaws in the training data can affect the prediction quality, leading to unexpected trends in the loss function plot. Therefore, the speaker advises not to go crazy analyzing these plots and to include an analysis of the predictions to assess the overall performance of the model.

  • 00:35:00 In this section, the speaker discusses the importance of analyzing the validation and evaluation results of a model trained on a dataset that is not perfect and has errors. These errors can be addressed by using the predictions of the current model to help train a new and improved one. The speaker shows the directory where the weights of the model are saved, explaining that last.pt and best.pt represent the last and best-performing models during training, respectively. Finally, the speaker recommends using last.pt because it takes into account much more information than the best.pt model.

  • 00:40:00 In this section of the video tutorial on image segmentation with a YOLOv8 custom dataset, the presenter demonstrates how to make predictions using a Python script. He starts by importing the YOLO package from Ultralytics and defines the model using last.pt from the results directory. He then specifies the image path and imports cv2 to open and read the image. The presenter gets the model results by calling the model on the image and iterates through the results to obtain the predicted mask. He resizes the mask to the size of the original image and saves it as output.png. The presenter also shows how to overlay the mask on the original image to check that the prediction is good.

  • 00:45:00 In this section, the presenter demonstrates the result of the semantic segmentation model that they trained with YOLOv8. Using an overlaid image and transparency, they show that although the mask is not 100% perfect, it is a very good result considering the errors in the data. The presenter then summarizes the entire process from annotating the data, training the model, validating it to making predictions. Finally, they encourage viewers to subscribe to their channel and check out other videos that they have published.
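A compact sketch of the training call and the prediction/overlay step might look like the following. The paths and epoch count are placeholders, and the way the mask tensor is pulled out of results[0].masks follows the current Ultralytics API as far as I can tell, not necessarily the exact code shown in the video.

```python
# Hedged sketch of YOLOv8 segmentation training and prediction.
# Paths and epochs are placeholders; the masks access assumes the Ultralytics API.
import cv2
from ultralytics import YOLO

# Training: config.yaml points at images/{train,val} and labels/{train,val} as described above
model = YOLO("yolov8n-seg.pt")
model.train(data="config.yaml", epochs=10, imgsz=640)

# Prediction and mask overlay
model = YOLO("runs/segment/train/weights/last.pt")
img = cv2.imread("duck.jpg")                       # placeholder image path
H, W = img.shape[:2]

results = model(img)
for result in results:
    for mask in result.masks.data:
        # The mask comes back at the network's resolution; rescale it to the image size
        mask = (mask.cpu().numpy() * 255).astype("uint8")
        mask = cv2.resize(mask, (W, H))
        cv2.imwrite("output.png", mask)

        # Simple overlay to eyeball the prediction quality
        overlay = img.copy()
        overlay[mask > 127] = (0, 255, 0)
        cv2.imwrite("overlay.png", cv2.addWeighted(img, 0.6, overlay, 0.4, 0))
```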

Tensorflow Object Detection in 5 Hours with Python | Full Course with 3 Projects

Tensorflow Object Detection Python Course code: https://github.com/nicknochnack/TFODCourse

Tensorflow Object Detection React App: https://github.com/nicknochnack/TFODApp

Tensorflow Object Detection for Raspberry Pi: https://github.com/nicknochnack/TFODRPi





00:00:00 - 01:00:00 The "Tensorflow Object Detection in 5 Hours with Python" tutorial teaches beginners in object detection how to become proficient practitioners. The comprehensive course covers the installation and setup of TensorFlow Object Detection API, collecting and labeling images, training an object detection model, and exporting it to different formats like TensorFlow.js and TF Lite using "freezing" and conversion. The course is conducted using open-source tools like Python and TensorFlow Object Detection, which enable repurposing of the code to detect various objects. Projects demonstrated include using a USB microscope to detect defects in small parts, counting the number of objects passing through a webcam, and detecting objects in a video game. The instructor emphasizes the importance of setting up a virtual environment for TensorFlow Object Detection and collecting and labeling good images with different angles, lighting conditions, and no green screen.

01:00:00 - 02:00:00 This part of the video provides a step-by-step guide to quickly prototyping, labeling and preparing image data, and choosing a model for object detection with TensorFlow. The presenter emphasizes the importance of prototyping quickly to establish feasibility and determine the right sensor for the model. The video walks through the LabelImg labeling software and the process of creating annotation files for the images, explains the evaluation metrics used to judge model performance, and shows how to choose an appropriate model architecture for each use case by weighing the trade-offs between speed and accuracy. It concludes by demonstrating how to install TensorFlow, CUDA, and the other packages needed for object detection.

02:00:00 - 03:00:00 This part covers the training workflow step by step. It starts with installing NVIDIA cuDNN, TensorFlow, and the pre-trained model zoo, followed by creating a label map and generating TFRecords. Next comes setting up the configuration file for training, troubleshooting errors, evaluating the model with the TensorFlow Object Detection library, and running the evaluation script to generate performance metrics. TensorBoard is used to view and track metrics during training and evaluation. Finally, the trained model is loaded from a checkpoint, object detection is performed on an image, and the model is converted for use on various platforms. Performance tuning is discussed, and a demo shows how to improve performance by collecting more relevant training images for low-performing classes.
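As one concrete example of the label-map step mentioned above, the TensorFlow Object Detection API expects a small protobuf text file mapping class names to integer IDs. The snippet below writes a file in that format; the class names are placeholders for whatever gestures or defects are being detected, not necessarily the labels used in the course.

```python
# Hedged sketch: writing a TFOD label_map.pbtxt (class names are placeholders).
labels = [{"name": "ThumbsUp", "id": 1}, {"name": "ThumbsDown", "id": 2}]

with open("label_map.pbtxt", "w") as f:
    for label in labels:
        f.write("item {\n")
        f.write(f"    name: '{label['name']}'\n")
        f.write(f"    id: {label['id']}\n")
        f.write("}\n")
```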

03:00:00 - 04:00:00 The "Tensorflow Object Detection in 5 Hours with Python" video provides a comprehensive tutorial on building an object detection model in TensorFlow. The instructor shows how to label images, configure the pipeline, train and tune models, and test models in real time. Three projects are demonstrated, including using a microscope for object detection, building a TensorFlow.js web app, and performing Raspberry Pi sentiment detection. The video emphasizes the importance of image quality and tuning trade-offs for improving object detection performance. Viewers are encouraged to ask for clarification or provide feedback in the comments section or Discord server.

04:00:00 - 05:00:00 This video tutorial covers various aspects of Tensorflow object detection, including how to set and reset labels, labeling images for use in the model, updating the label map, training the model, evaluating and improving its accuracy, exporting and converting it to tensorflow.js, and running it on a Raspberry Pi. The instructor also provides guidance on debugging the app and manipulating the result array. Throughout the tutorial, the instructor emphasizes the importance of case sensitivity and shows how to organize and move image directories, create custom models, and label collected images. The tutorial concludes by encouraging viewers to share their ideas on running multiple models on different devices.

05:00:00 - 05:25:00 In this full course on TensorFlow object detection with Python, the instructor discusses common errors and solutions when training model applications like sentiment analysis on GPU machines. The course covers the building and installation of a Raspberry Pi detection model using TensorFlow object detection, installation of necessary dependencies like OpenCV Python, TensorFlow Lite runtime, and the detection of custom objects through adjusting the threshold and improving model performance by adding more images and training for longer. The course concludes by emphasizing the power of performance tuning and encouraging viewers to seek assistance when needed.


Part 1

  • 00:00:00 In this section, the instructor introduces the object detection course which is intended to take complete beginners in object detection to a proficient level as practitioners. The course is comprehensive and covers the installation and set up of TensorFlow object detection API, collecting and labeling of images, training the model, exporting detection model to different formats, and testing the models on various use cases. The course is all done using open-source tools, mainly Python and TensorFlow object detection, making it easy to repurpose the code to detect various objects. The instructor also explains the benefits of using the TensorFlow object detection API, which speeds up the process of building object detection models due to the various additional capabilities it provides.

  • 00:05:00 In this section, the instructor explains the difference between traditional programming and machine learning, where instead of defining explicit rules to detect objects, machine learning uses data and answers to train an object detection model to learn the rules. The model can detect objects in static images and real-time video feeds, and can be exported and converted into different formats like tensorflow.js and tf lite using freezing and conversion. Performance tuning is crucial to improving the accuracy of the model, and the instructor covers this topic in step six. Finally, the instructor demonstrates three projects, including using a USB microscope to detect defects in small parts, detecting objects in a video game, and counting the number of objects passing through a webcam.

  • 00:10:00 In this section, the instructor introduces the three projects that will be covered in the tutorial, which involve training an object detection model to identify defects in printed circuit boards, detecting hand gestures and integrating the model into a web app, and conducting real-time object detection using a Raspberry Pi. The instructor provides three sets of code on Github for each of the projects, which will be used to build the models step by step. The first step is to clone the code and create a virtual environment, which is explained as being like a separate room in a house. This section also includes helpful tips for following the tutorial, such as reviewing each step before moving on, taking notes, and asking for help if needed through the provided Discord server.

  • 00:15:00 In this section of the video, the instructor explains the importance of using a virtual environment while working with TensorFlow object detection models. He describes it as creating a new room in a house where all object detection dependencies are installed to ensure that they do not conflict with previously installed libraries. The instructor then talks about the dependencies required to work with TensorFlow, including Visual C++ Build Tools, and CUDA and CUDNN for GPU-based acceleration. He explains that these libraries are essential for faster training of object detection models and shows how to download and install them. Overall, this section provides an essential foundation necessary to set up a development environment for creating TensorFlow object detection models.

  • 00:20:00 In this section of the video tutorial, the instructor discusses the necessary requirements for installing TensorFlow Object Detection, including the specific versions of Python, GCC compiler, Bazel, CUDA, and cuDNN that are needed. The instructor emphasizes that matching the versions of TensorFlow, CUDA, and cuDNN is essential for utilizing GPUs during training. They provide links to installation guides for Windows and Linux/macOS and advise viewers to check their TensorFlow version and obtain appropriate CUDA and cuDNN versions. Additionally, the instructor mentions that they have created an error guide for common issues which occurs during installation or while running the object detection notebook. The guide includes solutions to different errors and can be found in the GitHub repository link provided.

  • 00:25:00 In this section, the instructor shows how to set up a virtual environment for TensorFlow object detection using Python. He indicates the steps to follow in order to create the environment, which involves cloning a repository with the necessary code and dependencies, creating a virtual environment using Python's venv package, and activating the environment. Creating a virtual environment is similar to creating a new room in a house, where one can isolate libraries and dependencies to work on TensorFlow object detection projects in an isolated environment.

  • 00:30:00 In this section, the video creator discusses the process of creating a virtual environment for TensorFlow Object Detection (TFOD) in order to keep the training, app, and Raspberry Pi code separate. The video creator goes through the steps to activate the virtual environment, including running a command on a Windows machine to activate the environment. There is a demonstration of updating the pip installer and then installing TensorFlow. The benefit of these updates is the ability to see real-time installations inside of the site packages folder. It is noted that sometimes installing TensorFlow requires multiple attempts, and the video creator has a verification script to ensure everything is correctly installed.

  • 00:35:00 In this section, the speaker explains how to install the necessary packages and set up a virtual environment with Jupyter Notebook using ipykernel. They also provide instructions for installing Anaconda Python and Jupyter Notebooks for those who are new to it. Then they explain how to install ipykernel and associate it with the Jupyter Notebook to enable a more efficient data science workflow. Lastly, the speaker shows how to validate the virtual environment and start collecting and labeling images as the next step towards building a TensorFlow object detection model.

  • 00:40:00 In this section, the instructor emphasizes the importance of collecting and labeling good images when training object detection models. To demonstrate this, he uses a Jupyter notebook to collect images from a webcam and the LabelImg library to label the images. The instructor shares best practices to keep in mind when labeling, such as keeping labels as tight as possible and including pictures of objects at different angles and under different lighting conditions. Another important best practice is to start with 10 to 20 pictures of each class, but the instructor starts with less and shows how to add additional pictures to retrain the model if the performance is suboptimal. Finally, the instructor explains that providing the model with a variety of images without a green screen behind the object will make it more generalized and better able to detect objects.

  • 00:45:00 In this section, the instructor guides the viewers through the image collection component of the TensorFlow object detection tutorial. The tutorial begins by installing OpenCV, a popular computer vision library used to access the webcam and collect images. The instructor emphasizes the importance of installing the dependencies inside the virtual environment to maintain control and ensure smooth operation. The code then imports the necessary dependencies, including uuid, os, and time, which are used to generate unique image names, build file paths that work across operating systems, and pause between captures taken at different angles. The tutorial then defines the different labels to collect and explains that the labels can be changed based on the desired objects to detect. The labels are stored in a list, and the instructor explains that the number of images collected for each label is defined by the variable "number of images".

  • 00:50:00 In this section, the instructor discusses setting up the folder structure for object detection, the importance of having a good number of images when getting started, and how to capture images using a webcam or other capture device. The folder structure is created using a specific file path that should be followed throughout the tutorial. The code loops through all the labels, connects to the webcam, and captures images for each label. The instructor emphasizes the importance of correctly specifying the video capture device number for the webcam, as errors can arise if this is incorrect.

  • 00:55:00 In this section, the instructor explains the code for collecting images for object detection. The code loops through the labels and captures frames using the webcam. It creates a new image and places it inside the respective label folder with a unique identifier name. The cv2.imwrite command is used to write the image to a file and show it on the screen. The code also allows the user to hit the q key to escape at any point and closes everything down when finished. The instructor emphasizes that it's essential to capture images in different positions for building a generalized model.
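A minimal sketch of that collection loop, assuming a single webcam on device 0; the label names, image count, and folder layout are illustrative and should be adapted to your own project:

```python
import os
import time
import uuid
import cv2

# Illustrative labels and paths; adjust to your own project.
labels = ['thumbsup', 'thumbsdown', 'thankyou', 'livelong']
number_of_images = 5
images_path = os.path.join('Tensorflow', 'workspace', 'images', 'collectedimages')

for label in labels:
    os.makedirs(os.path.join(images_path, label), exist_ok=True)

    cap = cv2.VideoCapture(0)            # 0 = default webcam; change if you have several devices
    print(f'Collecting images for {label}')
    time.sleep(5)                         # time to get into position

    for img_num in range(number_of_images):
        ret, frame = cap.read()
        if not ret:
            break
        # Unique file name per capture so repeated runs never overwrite earlier images.
        img_name = os.path.join(images_path, label, f'{label}.{uuid.uuid1()}.jpg')
        cv2.imwrite(img_name, frame)
        cv2.imshow('frame', frame)
        time.sleep(2)                     # pause so each capture can be a different pose

        if cv2.waitKey(1) & 0xFF == ord('q'):   # press q to stop early
            break

    cap.release()
cv2.destroyAllWindows()
```

Using uuid-based file names is what makes each image name unique, so collection can be re-run as many times as needed without clobbering earlier captures.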

Part 2

  • 01:00:00 In this section, the speaker emphasizes the importance of prototyping as quickly as possible for data science projects to establish the feasibility and possibility of the project. To illustrate, the speaker gives an example of detecting bent LEDs, which would require a hyper-accurate model or a sensor such as a microscope. Prototyping quickly with a MobileNet model and a webcam can show poor performance, which indicates the need for a different technique or sensor. Prototyping quickly helps establish a baseline for improvement, and the speaker proceeds to train the model with the collected images for thumbs up detection, while emphasizing the importance of deleting low-quality images and capturing more good-quality images for better detection.

  • 01:05:00 In this section, the instructor introduces the LabelImg package, which will be used to label the images. Users can go to the GitHub repository to check the package. After that, the instructor shows how to install the required dependencies, pyqt5 and lxml, and create a new folder for LabelImg inside the Tensorflow folder to save the package. Then, the instructor demonstrates how to clone the GitHub repository and install the LabelImg package by running specific commands. Finally, the instructor opens up the LabelImg application and prepares to label the images.

  • 01:10:00 In this section, the instructor demonstrates the process of labeling images for object detection using the LabelImg software. The process includes opening the specific directory in the software and selecting the images to be labeled. The instructor advises being as precise as possible with the detection labels and ensuring that the label names match those in the annotations and in the training step. In addition, the instructor encourages viewers to think of different applications of object detection and try them out using the same process of defining labels, collecting images, and labeling them. The importance of ensuring that the corresponding annotations exist for each image is also emphasized.

  • 01:15:00 In this section, the instructor explains how to label an image by creating an annotation file. The annotation file is an xml file that describes the object's location in the image. The instructor shows how to label an image with multiple objects and explains how to save the annotations properly. The process of labeling images can be scaled up by outsourcing or by using newer tools that automate the labeling process using lightweight object detection models.
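LabelImg saves each annotation as a Pascal VOC-style XML file alongside the image. As a rough illustration of what those files contain (the file name here is hypothetical), the boxes can be read back with the standard library:

```python
import xml.etree.ElementTree as ET

# Illustrative path; LabelImg writes one XML file per labeled image.
tree = ET.parse('thumbsup.0001.xml')
root = tree.getroot()

filename = root.find('filename').text
for obj in root.findall('object'):
    name = obj.find('name').text            # the class label typed into LabelImg
    box = obj.find('bndbox')
    xmin = int(box.find('xmin').text)
    ymin = int(box.find('ymin').text)
    xmax = int(box.find('xmax').text)
    ymax = int(box.find('ymax').text)
    print(filename, name, (xmin, ymin, xmax, ymax))
```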

  • 01:20:00 In this section, the video instructor shows how to label images using LabelImg and how to prototype to ensure you are using the right technique before building a huge object detection model with thousands of images. He recommends starting with labeling 10-25 images to keep it super fast and using the right sensors to get the right levels of data. The instructor also explains the importance of splitting the images into training and testing partitions to train the model on one segment of images and test it on a set of images it has never seen before. He creates two new folders called train and test inside the images folder to store the training and testing images and annotations.

  • 01:25:00 In this section, we learn how to prepare the image data we collected and labeled for training our object detection model. We create two folders, training and testing, to store our images and annotations, making sure to grab both image and annotation files for each class. We then explain how to convert these into TensorFlow records to be used for training. We also cover some best practices for naming and organizing our data. Finally, we introduce the dependencies needed for training a TensorFlow object detection model and how to evaluate our model using evaluation metrics.

  • 01:30:00 In this section, the speaker discusses the evaluation metrics that you get after training your object detection model. These metrics include precision and mean average precision (mAP), which measure the proportion of correct detections. They are calculated from the numbers of true positives, false positives, true negatives, and false negatives. The speaker explains that precision and recall are very important when evaluating the model's performance. Additionally, the speaker mentions that the choice of model from the TensorFlow model zoo affects the accuracy and speed of the model. Different models have different architectures that result in different levels of accuracy and speed, and practitioners need to choose the appropriate model for their use case.
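As a small worked example of those two quantities (the counts are made up purely for illustration):

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: share of detections that were correct.
    Recall: share of ground-truth objects that were actually found."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# e.g. 8 correct detections, 2 spurious detections, 1 missed object
print(precision_recall(8, 2, 1))   # (0.8, 0.888...)
```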

  • 01:35:00 In this section, the transcript discusses the trade-offs between model speed and accuracy when choosing a model architecture for TensorFlow Object Detection. The video presents different model options and their corresponding frame speeds and mean average precision scores. The presenter notes that certain architectures can be fast and highly accurate, while others may be slower but more precise. The benefits of using the TensorFlow Object Detection model API are also mentioned, such as the use of pre-processing and post-processing techniques and image augmentation. The section concludes with a discussion on how to train a model using a cloud instance if desired.

  • 01:40:00 In this section, the instructor explains the setup for the TensorFlow Object Detection model, which involves setting up the pre-trained model name and URL, as well as variables such as the generate tf record script and label map name. The folder structure for the model is also shown, with the workspace containing folders for annotations, images, models, and checkpoints. The next step is to download the TensorFlow models from the TensorFlow model zoo and install the TensorFlow Object Detection package, which the instructor walks through step by step to ensure successful installation. The process involves cloning the TensorFlow Object Detection repository and installing the necessary dependencies.

  • 01:45:00 In this section, the presenter explains how to clone and install the TensorFlow Object Detection model API. The model API can be downloaded from the TensorFlow model garden, which has a lot of documentation and open-source tutorials. The presenter then demonstrates how to navigate to the research folder, which contains all the required files to use the TensorFlow Object Detection Model API. After cloning the research folder, the presenter proceeds to explain how to install protoc and other necessary components required to successfully install the TensorFlow Object Detection model API. Finally, the presenter shows how to verify the installation by running a verification script to check if the TensorFlow Object Detection API has been successfully installed.

  • 01:50:00 In this section, the instructor goes through the process of verifying and installing Tensorflow for object detection. He explains that a verification script is used to ensure Tensorflow is installed correctly and that it can be run to check for any errors. The script is a standard Tensorflow object detection script that indicates whether or not Tensorflow is installed when run. If Tensorflow is not installed correctly, the instructor provides guidance on how to configure it, including installing additional packages such as matplotlib and yaml. The instructor assures the viewer that if they encounter errors, they can google them, as they are commonly encountered and easy to solve. Finally, he states that successfully reaching the "okay" sign at the bottom of the verification script means that Tensorflow was installed correctly.

  • 01:55:00 In this section, the video covers the installation process of TensorFlow and the necessary packages for object detection. The instructor shows how to install TensorFlow, Matplotlib, Protobuf, Pillow, and PyYAML. The instructor recommends checking the installation process, as it may be different depending on your environment. The video also goes through the steps to install CUDA and cuDNN, which are required if you want to use the GPU for training. The instructor explains how to find the correct version of CUDA and cuDNN and goes through the installation process. Finally, the instructor demonstrates how to import the Object Detection API successfully into the notebook by restarting the kernel.

Part 3

  • 02:00:00 In this section, the instructor goes through the process of downloading and installing Nvidia cuDNN, which is required if you are training on a GPU. First, you need to sign up for a free Nvidia Developer account, then select download cuDNN and log in. There are two versions available, and you need to download the version compatible with your version of TensorFlow. The instructor explains how to extract the zipped file and copy the cuDNN files into their respective folders inside your CUDA installation location. After copying the files, the last step is to make sure that the CUDA folder is added to your path on a Windows machine.

  • 02:05:00 In this section of the video, the instructor explains the installation process for TensorFlow and how to test whether or not it is functioning properly. The video then moves on to the process of downloading a pre-trained object detection model from the TensorFlow model zoo, which will be used for transfer learning to detect custom objects. The instructor demonstrates how to download the model and explains the different components of the downloaded files. Finally, a label map is created to represent the different labels for the custom object detection.

  • 02:10:00 In this section, the instructor explains the creation of a label map and generating tf records. He underscores the importance of proper labeling as the label map file will be used subsequently throughout the training process. The video guides through generating a train and a test record and how to update path components for the model, specifically the fine-tune checkpoint, label map path, and input path. Finally, he highlights the relevance of the pipeline config file, which determines the architecture of the model and should be customized for specific models.
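The label map itself is just a small protobuf text file listing each class name with a numeric id. A hedged sketch of generating one from Python, with illustrative label names and an illustrative path:

```python
# Illustrative labels; ids must start at 1 and stay consistent with the annotations.
labels = [{'name': 'ThumbsUp', 'id': 1}, {'name': 'ThumbsDown', 'id': 2}]

label_map_path = 'Tensorflow/workspace/annotations/label_map.pbtxt'   # illustrative path
with open(label_map_path, 'w') as f:
    for label in labels:
        f.write('item {\n')
        f.write(f"\tname: '{label['name']}'\n")
        f.write(f"\tid: {label['id']}\n")
        f.write('}\n')
```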

  • 02:15:00 In this section of the video, the instructor explains how to set up the configuration file for training the model. The pipeline file is copied from the pre-trained models folder and then the configuration is done through code by setting up all the different file paths along with the number of classes and setting up fine-tune checkpoint, label map path, input path, etc. Once the setup is done, the model can be trained by running the model training script with necessary arguments like pipeline.config file path, pipeline config, and the number of training steps. It is suggested to run the command in a separate command prompt outside the notebook to see the progress of the model training.
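A sketch of what that in-code configuration typically looks like using the Object Detection API's pipeline proto; all paths, the model folder name, and the class count here are illustrative and must match your own workspace:

```python
import tensorflow as tf
from object_detection.protos import pipeline_pb2
from google.protobuf import text_format

# Illustrative paths; match them to your own workspace layout.
PIPELINE_CONFIG = 'Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config'
LABEL_MAP = 'Tensorflow/workspace/annotations/label_map.pbtxt'
CHECKPOINT = 'Tensorflow/workspace/pre-trained-models/ssd_mobilenet/checkpoint/ckpt-0'

pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with tf.io.gfile.GFile(PIPELINE_CONFIG, 'r') as f:
    text_format.Merge(f.read(), pipeline_config)

pipeline_config.model.ssd.num_classes = 2                       # number of labels in the label map
pipeline_config.train_config.fine_tune_checkpoint = CHECKPOINT
pipeline_config.train_config.fine_tune_checkpoint_type = 'detection'
pipeline_config.train_input_reader.label_map_path = LABEL_MAP
pipeline_config.train_input_reader.tf_record_input_reader.file_path[:] = [
    'Tensorflow/workspace/annotations/train.record']
pipeline_config.eval_input_reader[0].label_map_path = LABEL_MAP
pipeline_config.eval_input_reader[0].tf_record_input_reader.file_path[:] = [
    'Tensorflow/workspace/annotations/test.record']

with tf.io.gfile.GFile(PIPELINE_CONFIG, 'wb') as f:
    f.write(text_format.MessageToString(pipeline_config))

# Training is then launched from a separate terminal, for example:
# python Tensorflow/models/research/object_detection/model_main_tf2.py \
#   --model_dir=Tensorflow/workspace/models/my_ssd_mobnet \
#   --pipeline_config_path=Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config \
#   --num_train_steps=2000
```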

  • 02:20:00 In this section, the instructor troubleshoots errors that occur during the object detection training process. He explains that most errors are solved by installing the right library or searching for solutions online. He also notes that users should ensure they have installed the correct version of TensorFlow GPU if they want to leverage the GPU. The instructor then guides viewers through the process of uninstalling and reinstalling libraries to solve specific errors, emphasizing that this process may take some time but is necessary for a successful object detection training run. Ultimately, he shows viewers how to check that they have installed the correct libraries and how to monitor the progress of the training run.

  • 02:25:00 In this section, the instructor discusses the process of training the model and evaluating its performance metrics. Once the training is initiated, the loss metrics appear after every 100 steps, indicating that the model is getting trained correctly. The output of the training model for 2000 steps is generated, indicating that the model has been successfully trained with a final loss of 0.188. The instructor then explains how to evaluate the model using the TensorFlow object detection library and run the evaluation script, which generates performance metrics for the trained model. The average precision and recall metrics can then be viewed using TensorBoard, an interactive monitoring tool for TensorFlow.

  • 02:30:00 In this section, the instructor guides the viewers through using Tensorboard to view and track metrics during training and evaluation. By navigating to the train folder, they can view the loss metrics over time, and by going to the eval folder, they can see the evaluation metrics such as mean average precision, precision, and recall. The viewers can also see the real-time performance of the model by viewing the images and whether it correctly categorizes them, and they learn how to export the graph after freezing it.

  • 02:35:00 In this section, the trained model is loaded from the checkpoint and object detection is performed on an image. The latest checkpoint can be found in the custom model folder, which contains the pipeline and all checkpoints. Once the latest checkpoint is imported, an image can be selected and object detection can be performed using the detection function. The detection function is also used to make real-time object detections from a webcam. Despite training for only 2000 steps with not many images, the model performs well in detecting different hand gestures.
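A hedged sketch of restoring the latest checkpoint and building a detection function, following the common Object Detection API pattern; the paths and the checkpoint index are illustrative:

```python
import os
import tensorflow as tf
from object_detection.utils import config_util
from object_detection.builders import model_builder

# Illustrative paths; point them at your trained custom model folder.
CONFIG_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config'
CHECKPOINT_DIR = 'Tensorflow/workspace/models/my_ssd_mobnet'

configs = config_util.get_configs_from_pipeline_file(CONFIG_PATH)
detection_model = model_builder.build(model_config=configs['model'], is_training=False)

# Restore the latest checkpoint (e.g. ckpt-3 after a 2000-step run).
ckpt = tf.compat.v2.train.Checkpoint(model=detection_model)
ckpt.restore(os.path.join(CHECKPOINT_DIR, 'ckpt-3')).expect_partial()

@tf.function
def detect_fn(image):
    """Run preprocessing, prediction and post-processing on a batched image tensor."""
    image, shapes = detection_model.preprocess(image)
    prediction_dict = detection_model.predict(image, shapes)
    return detection_model.postprocess(prediction_dict, shapes)
```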

  • 02:40:00 In this section, the instructor demonstrates the performance of the object detection model on different hand gestures, such as thumbs up, thumbs down, live long, and dual hands. The model's performance varies depending on the hand gesture and distance from the camera, but it can be fine-tuned by adding more images to the training set. The section then moves on to freezing and converting the model by creating a frozen graph, and exporting it to TensorFlow.js and TFLite. The converted model can be used for a web app or for a Raspberry Pi project. The final output of the custom model is similar to the pre-trained models, with checkpoint, saved model assets, variables, and pipeline config.

  • 02:45:00 In this section, the speaker discusses the last conversion that needs to be done, which is the conversion to TFLite. This is done in two parts: first, the graph is exported to a TFLite-compatible graph, and then it is run through the TFLite converter to output the exact format needed to leverage it. The speaker notes that this is the end-to-end model, and that they have frozen the graph and exported it to TensorFlow.js and TFLite, thus completing the full walkthrough of working with the TensorFlow Object Detection API. The speaker also goes on to discuss performance tuning and notes three key things that can be done to improve the model's performance: adding more images of a low performing class to the training data set, training the model for longer, and changing the model architecture.
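For the second half of that conversion, a minimal sketch using the standard TFLite converter, assuming the TFLite-compatible SavedModel was already produced by the object_detection export_tflite_graph_tf2.py script (the paths are illustrative):

```python
import tensorflow as tf

# Illustrative path to the TFLite-compatible SavedModel exported in the first step.
SAVED_MODEL_DIR = 'Tensorflow/workspace/models/my_ssd_mobnet/tfliteexport/saved_model'

converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
tflite_model = converter.convert()

with open('detect.tflite', 'wb') as f:      # the file later copied onto the Raspberry Pi
    f.write(tflite_model)
```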

  • 02:50:00 In this section of the video, the presenter demonstrates how to improve the performance of the low performing classes in the object detection model by adding more relevant training images. Rather than collecting images of every single class, the presenter collects more images of thumbs up and thumbs down, which did not perform well with different hands. The presenter imports OpenCV and updates the label map before collecting images of thumbs up and thumbs down only. The collected images are stored inside the Tensorflow workspace, where the presenter evaluates them and removes the poor quality ones. Finally, the presenter adds a few more images of thumbs down with a different hand to improve the model's performance.

  • 02:55:00 In this section of the video, the instructor runs code to capture more images for the thumb up and thumb down hand gestures. After capturing the images, the instructor checks the quality of the images and decides to keep all the thumb up images and removes some of the thumb down images that are off the screen. The instructor then opens the LabelImg tool and labels the new images again, paying attention to the case sensitivity of the label names. The instructor mentions that this step of labeling images is trial and error, and one can evaluate and see what makes the model perform better. The instructor suggests looking into auto-labeling tools to improve efficiency but advises paying attention to their performance to avoid any need for manual tweaking.

Part 4

  • 03:00:00 In this section, the instructor demonstrates how to quickly label images in order to improve the training of the TensorFlow Object Detection model. He explains that it's important to copy the labeled images into both the training and test folders in order to evaluate the accuracy of the model. Next, he shows how to train a "tuned" model by changing the custom model name in the training and detection notebook to create a new folder. He then discusses how to copy the pre-trained pipeline config into the new tuned model and make updates in the pipeline config. Finally, he configures all of the necessary paths using the code provided to create the new tf record files needed for training.

  • 03:05:00 In this section, the instructor updates the pipeline configuration which defines all the label map paths for training using the standard configuration. Updating the pipeline is an essential step as it paves the way for running the training script. The number of training steps is updated from the previous configuration to 3000, and the command is run to create the trained model. After the completion of training steps, this model is evaluated and compared to the previous model by running the code under step 7. The evaluation metrics show that the average precision increased from 0.7 to 0.834, and the detection is faster and more accurate. By adding more images, additional labels, and different positions, the instructor demonstrates improving the model and producing a better object detection model. Finally, the new model is loaded, and the code is run for real-time detection, which detects objects very quickly and accurately.

  • 03:10:00 In this section, the instructor discusses ways to improve the performance of the Tensorflow Object Detection model by changing the architecture of the model. To do this, the user needs to update the link of the pre-trained model they want to use in the detection notebook. By copying and pasting the new link and name, the user can then download the new pre-trained model into their pre-trained models folder. However, the instructor emphasizes that better performance is achieved by improving image quality, such as using different styles of images, images with different angles and training for longer periods. The instructor also shows how to train a model in Google Colab by compressing the images and archiving them before importing them into the training and detection notebook on Colab.

  • 03:15:00 In this section, the instructor guides viewers through the process of uploading images to be utilized in the TensorFlow object detection program, as well as cloning the TensorFlow models repo and installing the TensorFlow object detection software. The pre-trained model is brought in and the label map is created. The instructor then establishes a custom configuration inside the models/my_ssd_mobnet folder before training the model. The viewer is warned that real-time object detection cannot be done with a webcam using Colab but can be done with an image. As the viewer goes step-by-step through the process, they can evaluate their model to see the mean average precision and average recall.

  • 03:20:00 In this section of the video, the instructor demonstrates how to detect objects in an image using the trained model. The latest checkpoint is loaded, and the image name is specified to run the detection script. The instructor highlights that the image must be valid and located in the images folder. After running the cell, the trained object detection model identifies the object in the image. The instructor further explains how to zip up and export the files and emphasizes the need to install the object detection API when running the code locally. The section concludes by previewing the three projects that will be covered next, including using a microscope for object detection, building a TensorFlow.js web app, and performing Raspberry Pi sentiment detection.

  • 03:25:00 In this section, the instructor begins by explaining the first project, which involves using a USB microscope to detect defects in small LEDs. The microscope is treated like any other camera, and the video feed is obtained using the OpenCV library. The instructor demonstrates how to position and zoom the microscope using Python code and displays the video feed. The video feed is then used to collect training images for object detection. The goal of this project is to train a model to detect defects in the LEDs and classify them as either bent or not bent.

  • 03:30:00 In this section of the TensorFlow Object Detection full course with three projects, the instructor shows how to use OpenCV to capture images from a webcam or microscope feed. He goes over best practices for releasing the capture device and destroying any OpenCV windows, including running "cap.release" after closing any cv2 web capture. He then demonstrates how to adjust the position and focus of the microscope to capture images of LED pins, and how to create two new folders for storing images of defective and non-defective pins. Finally, he tests the webcam feed for image capture to make sure everything is working as expected.
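A short sketch of that feed-and-cleanup pattern, assuming the microscope shows up as capture device 0:

```python
import cv2

cap = cv2.VideoCapture(0)        # the USB microscope enumerates like a normal camera; try 0, 1, 2...

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Microscope feed', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):   # press q to close the window
        break

# Release the device and destroy the windows; otherwise the camera
# can stay locked until the kernel or process is restarted.
cap.release()
cv2.destroyAllWindows()
```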

  • 03:35:00 In this section of the video, the instructor demonstrates how to collect images of a defective pin and a non-defective pin for object detection. They position the pins and adjust the sleep time between image captures, then use the sleep function to pause between captures. The instructor also explains how to label the images using LabelImg and the appropriate folder locations. They encourage viewers to ask for clarification or provide feedback in the comments section or Discord server.

  • 03:40:00 In this section of the video, the instructor goes through the process of labeling images to train a custom object detection model using TensorFlow. Using a microscope, the instructor captures images of defective and non-defective pins and labels them accordingly. They then copy the images into training and testing folders before setting up their training pipeline. The instructor explains how they will create a new folder for their custom object detection model and update the label map to include the new labels. The code base used in this section is quite general-purpose, allowing for a lot of customization, and the instructor provides tips on performance tuning.

  • 03:45:00 In this section of the tutorial, the focus is on creating the TensorFlow records that allow the model to be trained. The cells for downloading pre-trained models and creating label maps are touched on, but the emphasis stays on generating the TensorFlow records. Once the records are created, the model is updated with the new classes, and the updated configuration is saved. The model is trained for 2000 steps, and the new model is tested to detect defective and non-defective pins. The notebook is outlined, and the user is directed towards the specific cells that allow for this model's development.

  • 03:50:00 In this section, the instructor updates the image to detect a defective pin and a non-defective pin. The system successfully detects both pins with high confidence. However, the instructor notices that the system does not perform well with the defective pin in a specific position, highlighting a tuning trade-off. The instructor shows that the detection frame has parameters that can be modified to determine the level of detection accuracy. By reducing the minimum score threshold, the system can detect the defective pin reasonably well, but at the cost of detecting many other things as defective. The instructor tests the system with different color pins, including one that was not trained on, and at an angle, showing the system's robustness.
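That threshold corresponds to the min_score_thresh argument in the visualization call; a hedged sketch of the usual Object Detection API pattern, assuming detections, image_np_with_detections, and category_index already exist from the earlier detection steps:

```python
from object_detection.utils import visualization_utils as viz_utils

# Assumed to exist from the detection step: image_np_with_detections (numpy image),
# detections (dict of numpy arrays) and category_index (built from the label map).
viz_utils.visualize_boxes_and_labels_on_image_array(
    image_np_with_detections,
    detections['detection_boxes'],
    detections['detection_classes'] + 1,   # label map ids are 1-based
    detections['detection_scores'],
    category_index,
    use_normalized_coordinates=True,
    max_boxes_to_draw=5,
    min_score_thresh=0.3,                  # lower this to surface weaker detections
    agnostic_mode=False)
```

Lowering min_score_thresh is exactly the trade-off described above: more defective pins get flagged, but more false positives appear as well.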

  • 03:55:00 In this section, the instructor demonstrates the results of the first project while using a microscope as an object detection method. He points out the detected pins and how they are classified as defective or non-defective by changing the angles and positions of the pins. He summarizes the process of the first project, which involves capturing and labeling images, setting up a microscope folder, and training the model. The next project focuses on building a web app for detecting the direction of an object. This involves combining code from the TensorFlow Object Detection Course with the TensorFlow Object Detection App Github Repository. The objective is to detect up-down-left-right movement that can be deployed as a website in the future.

Part 5

  • 04:00:00 In this section of the video, the instructor explains how to set and reset the labels for Tensorflow object detection. Instead of defining labels such as thumbs up or down, or defective and non-defective pins, they demonstrate how to define directional movements such as left, right, up, and down instead. They then show how to create folders for the different directions and collect images to use for training. The instructor also explains how to label the images using LabelImg, paying attention to image cases, and suggests performance tuning in case of lackluster results. Finally, they move on to demonstrate how to label images for the right direction.

  • 04:05:00 In this section of the video, the instructor demonstrates how to label images for use in the object detection model. Only five images are used per object/class, but the user can adjust the number as desired. The instructor advises that this is an iterative process, and it's okay to experiment with different numbers of images. Once images are labeled, they are sorted into training and testing folders, with an 80/20 split for each image/class. When the folders are set up, the instructor changes the name of the folder and creates a new file path called "directions." Finally, the instructor notes that the virtual environment is already set up and there is no need to perform any extra steps.

  • 04:10:00 In this section, the label map is updated with new labels for "left," "right," "up," and "down," and unique identifiers are assigned to each one. The annotations are then run to update the label map and generate the required train and test tfrecord files. The model is then trained using a command to generate the training script, and the loss is calculated to be 0.299. Checkpoint 3 is restored to import the dependencies for real-time detections, and the minimum score threshold is set to 90. Finally, the webcam is activated to generate real-time detections for the object detection model.

  • 04:15:00 In this section, the instructor shows how to evaluate the model and improve its accuracy by adding more images of the particular classes that are not performing well. He demonstrates this by pointing in different directions with both hands and seeing how the model performs. He then moves on to freeze the graph, export it, and convert it to TensorFlow.js. He also explains how to upload the model to IBM Cloud Object Storage and walks through the steps available in the GitHub repository. The instructor then clones the repository, opens a new command prompt, and shows how to navigate to the proper folders.

  • 04:20:00 In this section of the video, the instructor is giving a step-by-step walkthrough of how to prepare a development environment for using TensorFlow object detection for image recognition. The instructor first shows how to navigate to the necessary folders and clone the TensorFlow object detection repository. Then, the instructor shows how to install Node.js and the dependencies for the repository. Finally, the instructor demonstrates how to create a new cloud object storage bucket on IBM Cloud and how to name it. This process is required for using the web-based user interface in later sections of the video.

  • 04:25:00 In this section, the instructor continues with step five in the tutorial, which is to create a bucket in the IBM Cloud Object Storage and upload the model.json and .bin files into it. Next, he enables a public access policy, allowing the model to be accessed from any location, and shows how to grab the URL to the model.json file for later use in the app. He then walks through the process of enabling Cross-Origin Resource Sharing (CORS) by running CLI commands after installing the IBM Cloud Object Storage CLI.

  • 04:30:00 In this section, the video demonstrates how to install the IBM Cloud CLI and IBM Cloud Object Storage CLI via a command line interface, enabling cross-origin resource sharing (CORS), which allows a web app to access a URL from somewhere else. The command also installs the IBM Cloud Storage Client and configures it with a configuration file. The video shows how to substitute the name of a user's bucket into a command that puts the CORS policy into the bucket's root folder. Finally, the video explains that the CORS policy is contained within the cloud config file and is part of the cloned repository, meaning no file creation is necessary.

  • 04:35:00 In this section, the instructor explains how to update the necessary files to run the TensorFlow.js web app. The steps include setting the Google Cloud Storage credentials, updating the model URL in app.js, updating the label map in utilities.js, and starting the app using npm start. The instructor also explains how to debug the app if it's not working, including changing the confidence metric and checking the order of objects in the detection result array. Overall, these steps allow users to customize the web app based on their specific use case and ensure it's running properly.

  • 04:40:00 In this section, the video shows how to work out where objects are and how to manipulate the result array. By inspecting the code, the user can identify that the result array contains five values, and each array holds different classes that are available for a particular app. The video further guides the user on how to correctly use each object array, such as the third array, which holds the non-post-processed box classes and is represented by very low numbers. The video then shows how to change the confidence metric and score values and adjust the box value, which the user can do quickly by changing the values of the boxes array.

  • 04:45:00 In this section, the video wraps up the second project and moves on to the third project which involves running object detection models on a Raspberry Pi. The speaker notes that running object detection models on the Raspberry Pi is different from the previous two projects due to the lack of a GPU, and therefore they will specifically convert the model to a Tensorflow lite format. They will collect images of happy and sad faces and train their model using the same process as before. However, they will be using the Raspberry Pi for this project and will remote desktop into it for the detections. The video comes to an end by mentioning the possibility of running multiple different models on different devices and encourages viewers to share their ideas on this topic.

  • 04:50:00 In this section of the video, the instructor demonstrates how to gather and label images for a sentiment detector that recognizes happy and sad faces. The process involves deleting existing images, creating paths for new images, collecting five images of each emotion, labeling the images, and saving each object. The instructor reminds viewers that case sensitivity is crucial in labeling and that green screens are not necessary for this exercise. Additionally, the instructor notes that users can always return to the image directory to update an image label.

  • 04:55:00 In this section, the instructor shows how to organize and move the collected images and annotations into the training and testing folders. They demonstrate how to create a new custom model and label the collected images as happy and sad by updating the label map. They then create tf records and copy the pre-trained model config to the custom model folder. After updating the config, they train the model using the generated command, which runs in the virtual environment, and wait for the process to finish.

Part 6

  • 05:00:00 In this section, the instructor talks about a common error that might occur during the training run of the sentiment model when working on a GPU machine. The error message indicates that the GPU is completely consumed, causing the system to have a bad time. To solve this issue, users need to stop anything that might use the GPU at any point in time and then kick off the training run again. Once the training is complete, users can import the os module and the relevant paths, and check for the latest checkpoint. The instructor also demonstrates the use of the sentiment model in real time, which is a quick and easy process once the system is set up.

  • 05:05:00 In this section, the instructor explains the steps needed to build a Raspberry Pi detection model using TensorFlow object detection. The first step is to create the TFLite files by going through the TensorFlow object detection course. Next, users should clone the Raspberry Pi detection repository and download it to their PC, or clone it from the Raspberry Pi. Once this is done, users should install the required dependencies, copy their converted model over, and run it using the existing detect.tflite model. The instructor also demonstrates how to remote desktop into the Raspberry Pi using xrdp, making the process easier.

  • 05:10:00 In this section, the instructor walks through the process of installing all the necessary dependencies for working with TensorFlow's Object Detection API on a Raspberry Pi. The instructor demonstrates how to install OpenCV Python, as well as several other dependencies required for OpenCV to work on the Raspberry Pi. Following that, the instructor installs the TensorFlow Lite runtime, which is necessary for running TensorFlow Lite models on the Raspberry Pi. Once all the dependencies are installed, they run a sample object detection model on the Raspberry Pi to show how it works. Finally, the instructor explains how to add a custom model and labels to the system.
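A minimal sketch of loading a converted model with the TensorFlow Lite runtime on the Pi; the model path is illustrative, and the order of the output tensors (boxes, classes, scores) varies by model, so inspect output_details before relying on the indices:

```python
import numpy as np
from tflite_runtime.interpreter import Interpreter   # installed via the tflite-runtime wheel

interpreter = Interpreter(model_path='detect.tflite')  # the converted custom model copied onto the Pi
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height, width = input_details[0]['shape'][1:3]

# A blank frame stands in here for a camera frame resized to the model's input shape.
frame = np.zeros((1, height, width, 3), dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], frame)
interpreter.invoke()

# Check output_details to see which index holds boxes, classes and scores for your model.
first_output = interpreter.get_tensor(output_details[0]['index'])
print(first_output.shape)
```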

  • 05:15:00 In this section, the instructor discusses custom object detection and how to adjust the detection threshold to obtain better results. The custom object detection includes two labels: happy and sad. The instructor adjusts the order of the labels so that happy is the first object and sad is the second. When the custom model is first run, it doesn't detect any objects, so the instructor lowers the detection threshold to 20%. After rerunning the command, the model detects objects: one large sad detection and one happy detection. The instructor highlights the need to train the model with more images to obtain better performance. He then performs additional performance tuning by adding more images of the poorly performing classes and training the model for longer.

  • 05:20:00 In this section, the speaker explains how he improved the performance of his sentiment analysis model. Firstly, he added additional images of himself at different angles, which increases the chances of the model generalizing well. Secondly, he trained his model for 5600 more steps, giving him a brand new model with a higher-index checkpoint. He then converted that new model into a frozen TFLite graph and brought it onto the Raspberry Pi, which he used to detect whether or not he was happy or sad. The speaker demonstrates that having the green screen up or down did not impact the performance of the model, and it was able to correctly detect when he was happy or sad. The speaker emphasizes the power of performance tuning and how it can be used to improve the accuracy of the model.

  • 05:25:00 This final excerpt concludes the Tensorflow Object Detection in 5 Hours with Python course, with the instructor reminding viewers to reach out in the comments or through the Discord channel if they need further assistance. The course is beneficial to viewers who want to learn about object detection and want an in-depth knowledge of Tensorflow, as it covers three projects within the course itself.
Tensorflow Object Detection in 5 Hours with Python | Full Course with 3 Projects
  • 2021.04.09
  • www.youtube.com
Want to get up to speed on AI powered Object Detection but not sure where to start?Want to start building your own deep learning Object Detection models?Need...
 

Automatic Number Plate Recognition using Tensorflow and EasyOCR Full Course in 2 Hours | Python

Final Notebook: https://github.com/nicknochnack/RealTimeAutomaticNumberPlateRecognition

Baseline Code: https://github.com/nicknochnack/TFODCourse



Automatic Number Plate Recognition using Tensorflow and EasyOCR Full Course in 2 Hours | Python

The YouTube video titled "Automatic Number Plate Recognition using Tensorflow and EasyOCR Full Course in 2 Hours | Python" provides a comprehensive guide to building an accurate and efficient automatic number plate recognition (ANPR) system using Tensorflow and EasyOCR. The ANPR system utilizes a two-part system, with TensorFlow being used to detect the license plate region of interest, while EasyOCR extracts the text from the detected region. The video covers a range of topics, from setting up virtual environments and installing necessary dependencies, to preparing data for training and troubleshooting errors that may occur during training. Overall, viewers can gain a detailed understanding of how to build a production-grade ANPR system with real-time or standard image detection capabilities.

The YouTube tutorial on Automatic Number Plate Recognition (ANPR) with TensorFlow and EasyOCR covers how to detect license plates accurately using object detection and how to extract the plate numbers using OCR. The instructor discusses how to filter the text based on size and coordinates to extract only the relevant information from the plate. They demonstrate how to allocate GPU resources and limit memory consumption for TensorFlow, extract image scores, classes, and boxes that pass the detection threshold, and apply OCR using EasyOCR. Additionally, the video covers how to save the output results to a CSV file and folder path for both real-time and image results. The speaker emphasizes that the code is available for the viewers and encourages them to ask for help and share their feedback.

  • 00:00:00 In this section, the instructor introduces the course and explains that the second component of the ANPR system is OCR, or optical character recognition. This step involves using a library called EasyOCR to extract the text from the detected license plate feed, allowing us to read and analyze the text from license plates in real time or on standard images. Additionally, the course also covers building a more advanced system that mimics production grade pre-built ANPR systems by saving and logging previously detected plates and their regions of interest. Overall, the course offers a comprehensive guide to building an accurate and efficient automatic number plate recognition system using Tensorflow and EasyOCR.

  • 00:05:00 In this section, the presenter explains the two-part system used for automatic license plate recognition. The first part uses the TensorFlow object detection model to detect the region of interest, which is the license plate. The second part uses EasyOCR to extract the text from the detected region. The presenter demonstrates how to install the necessary components, including TensorFlow and EasyOCR, and provides a link to the GitHub repository containing the required files. The presenter explains how to clone the repository and shows the contents of the folder that is created as a result. The folder contains a notebook file that is used for training and detection of the license plate.

  • 00:10:00 In this section of the video, the presenter goes through the steps to set up a virtual environment and install the necessary dependencies to run Automatic Number Plate Recognition (ANPR) using TensorFlow and EasyOCR. The presenter first clones down the ANPR repository and creates a virtual environment with Python. He then activates the virtual environment, installs important dependencies such as ipykernel, and upgrades pip. Through these steps, he demonstrates how to associate the virtual environment with a Jupyter notebook to leverage the full power of ANPR.

  • 00:15:00 In this section, the instructor walks the viewer through the process of associating their virtual environment with their Jupyter Notebook, using the "anprsys" environment as an example. The instructor notes that this is important because people working with virtual environments often install everything into their environment, but when they go inside their Jupyter Notebook, it still says the packages are not there. Once the virtual environment is associated with the Jupyter Notebook, the instructor then guides the viewer through the process of installing the TensorFlow Object Detection API and setting up the folder structure to be able to perform object detection using TensorFlow.

  • 00:20:00 In this section, the instructor begins by explaining the concept of transfer learning, whereby an already-existing state-of-the-art model can be fine-tuned for a specific use case, such as detecting license plates. The instructor then proceeds to download the pre-trained models from the TensorFlow Model Zoo and install the TensorFlow Object Detection API by running several cells of code. The API and its components, such as the object detection folder, are cloned into the TensorFlow model garden repository, and the installation of the TensorFlow Object Detection model is initiated. The installation involves several dependencies and protoc, the protocol buffers system, and may take some time to complete.

  • 00:25:00 In this section, the instructor walks through installing TensorFlow and its GPU version. They install TensorFlow 2.4.1 and TensorFlow GPU 2.4.1, which is only compatible with Nvidia GPUs. They show how to verify the installation is successful by running a verification script, which should output "Ok" at the end. The instructor also demonstrates how to solve any "module not found" errors that may arise during the installation process, such as installing Matplotlib and Pillow libraries using pip. The verification script ensures that all necessary dependencies are installed before moving on to the next section.

  • 00:30:00 In this section of the video, the presenter walks through the installation of various modules necessary for building an object detection model, specifically for license plate recognition. They encounter several errors during the installation process, but they walk through how to resolve them. Once all the necessary modules, including TensorFlow Object Detection and EasyOCR, are installed, the presenter moves on to the next step, working with data, and introduces a Kaggle dataset that contains car images and annotations for license plates. They demonstrate how to download and prepare the dataset for use in the model.

  • 00:35:00 In this section, the instructor discusses how to download the necessary data for Automatic Number Plate Recognition (ANPR) and organize it into training and testing sets for object detection. The data, which contains annotations and corresponding images, is downloaded from Kaggle and unzipped into the ANPR folder. Two new folders are created within the images folder, one for training and one for testing, and the downloaded images are copied into the train folder. The annotations are also split up into training and testing sets, with the training annotations being placed in the train folder alongside their corresponding images. The purpose of these separate sets is to train the object detection model on one set of data and test it on an independent set to evaluate its performance.

  • 00:40:00 In this section, the video discusses the process of preparing the data for training a license plate detection model. The presenter explains that they downloaded license plate images from Kaggle and copied the corresponding annotations into two new folders, "train" and "test," within their TensorFlow workspace. The next step is to train the object detection model, which involves updating the labels, creating tf records, preparing the configuration, and finally, training the model. The presenter uses a "label map" to identify the one label they will use, "license," and then creates a tf record, which is the format required by the object detection model. The process is described as being straightforward, and a script to generate the tf record is provided.

  • 00:45:00 In this section, the video covers how to prepare the data for training with TensorFlow and EasyOCR. The script discussed here converts the raw data, including images and annotations, into a tf record format, but it needs to be updated due to the slightly different format of the annotations in the Kaggle data set. The video shows how to fix the errors that arise due to this discrepancy, including installing missing modules, changing the index to pick up the correct bounding box metrics, and copying the pre-trained model configuration to the training folder.

  • 00:50:00 In this section, the speaker goes over the details of the pre-trained model pipeline, which contains different lines of configurations that need to be updated before training the model. The speaker demonstrates how to copy the updated config file that ensures the parameters are updated and running the training command to train the model. They also mention the importance of running the command externally to see the progress properly and how to activate a virtual environment to run the command. Finally, they explain how to solve a standard error that may occur while installing the model.

  • 00:55:00 In this section of the course, the instructor troubleshoots errors that can occur during training. He demonstrates how to solve specific problems such as a ValueError related to a numpy.ndarray size change, a module not found error for cv2, a module not found error for TensorFlow Addons, and a module not found error for gin. The instructor explains that these errors are not uncommon, and it is essential to install the required libraries to solve them. Once the model is trained, the instructor shows how to read the loss metrics and how to find the most recent trained model checkpoint.

  • 01:00:00 In this section of the video tutorial on Automatic Number Plate Recognition using Tensorflow and EasyOCR, the instructor covers how to detect license plates from images or in real-time using the same model. By changing the image file name, the model can accurately detect and classify different types of plates. This method is found to perform much better through object detection than leveraging traditional computer vision techniques as the plates' formats can vary. The model can also detect plates from a live video stream or by holding up a phone in front of the car's plate. The instructor emphasizes that the method is efficient and is a good start for building a license plate detector, and the next step would be to apply OCR components to extract meaningful data from the image.

  • 01:05:00 In this section, the video tutorial discusses how to use the open source library EasyOCR to detect and extract text from an image in order to identify a license plate number. The tutorial notes that while there are more accurate OCR models available, EasyOCR is used as it runs on PyTorch, requiring a bit of room to be left for the GPU to be able to run it along with TensorFlow without consuming all of the GPU memory. The tutorial shows how to allocate GPU resources and limit memory consumption for TensorFlow in order to allow EasyOCR to run smoothly.
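A sketch of that GPU memory cap using TensorFlow's virtual-device configuration; the 5120 MB limit is purely illustrative and should be sized to your card:

```python
import tensorflow as tf

# Cap how much GPU memory TensorFlow grabs so EasyOCR/PyTorch can share the card.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=5120)])
```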

  • 01:10:00 In this section, the speaker explains how to load the necessary file paths and labels for Automatic Number Plate Recognition (ANPR) using TensorFlow and EasyOCR, and how to install EasyOCR and PyTorch with CUDA acceleration. They demonstrate that after loading an example image and performing ANPR, license plate numbers and detection scores are stored in a variable called "detections", which are then used to apply OCR using EasyOCR. A detection threshold is set at 0.7, allowing only detections with scores above 0.7 to be processed further.

  • 01:15:00 In this section, the instructor explains how to extract image scores, classes, and boxes that pass the detection threshold using a few lines of code. They demonstrate how to loop through each value in the detection score array and only return scores that surpass the threshold. They also apply filters to the boxes and classes to ensure they are the correct size and format for the webcam. Finally, they grab the width and height of the image to recalculate the box coordinates and apply filtering on the image to determine the region of interest.

  • 01:20:00 In this section, the video demonstrates how to extract the region of interest from the image and apply Optical Character Recognition (OCR) to extract the plate results. First, the width and height parameters of the image are obtained by using image.shape. To extract the ROI, loop through each box in the boxes variable, which represents the coordinates of the detection, and filter out the region of interest. Finally, OCR is applied to the ROI using EasyOCR, and the results are printed out to the console. The video walks the viewer through the process of setting up the EasyOCR reader, passing through the language parameters, and parsing the image to extract the plate results.
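Putting those two steps together, a hedged sketch of scaling the normalized boxes to pixels, cropping the region of interest, and running EasyOCR over it; image_np and detections are assumed to come from the earlier detection code:

```python
import easyocr

# Assumed to exist from the detection step:
#   image_np   - H x W x 3 numpy array of the input frame
#   detections - dict of numpy arrays with normalized 'detection_boxes'
#                ([ymin, xmin, ymax, xmax]) and 'detection_scores'
detection_threshold = 0.7
reader = easyocr.Reader(['en'])          # downloads the English model on first use

scores = detections['detection_scores']
boxes = detections['detection_boxes'][scores > detection_threshold]

height, width = image_np.shape[:2]
for box in boxes:
    ymin, xmin, ymax, xmax = box
    # Scale the normalized coordinates to pixels and crop the plate region.
    region = image_np[int(ymin * height):int(ymax * height),
                      int(xmin * width):int(xmax * width)]
    ocr_result = reader.readtext(region)  # list of (box, text, confidence) tuples
    print(ocr_result)
```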

  • 01:25:00 In this section, the presenter discusses how to effectively extract the printed plate or text from the plate using OCR (optical character recognition). They show an example image where there is extra text in addition to the plate number, which can confuse the OCR reader. To solve this problem, the presenter suggests filtering the results based on size and coordinates. They demonstrate how to write a function to do this filtering and apply it to the image to get the appropriate text. The results show that the OCR reader can accurately extract the plate number, although there is still room for improvement with the OCR model.

  • 01:30:00 In this section, the speaker introduces a new function called "filter text". This function takes three arguments: region, the OCR result, and the region threshold. The region represents the image, OCR result stands for the extracted text from EasyOCR, and the region threshold is the size threshold for detecting the license plate regions. The function loops through the OCR results, extracts the length and width of each, multiplies them, and then compares them to the region threshold. If it passes this filter, it is considered text, and its region is stored in a variable called "plate". The speaker also explains some code portions that calculate the size of the region and illustrate how the filter works.
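A sketch consistent with that description, comparing each OCR text box's area against the area of the whole plate crop (the default threshold value here is illustrative):

```python
import numpy as np

def filter_text(region, ocr_result, region_threshold=0.6):
    """Keep only OCR text whose bounding box covers a large enough share of the plate crop."""
    rectangle_size = region.shape[0] * region.shape[1]       # area of the whole plate region
    plate = []
    for box, text, _confidence in ocr_result:                 # EasyOCR returns (box, text, confidence)
        length = np.sum(np.subtract(box[1], box[0]))           # ~width of the text box
        height = np.sum(np.subtract(box[2], box[1]))           # ~height of the text box
        if (length * height) / rectangle_size > region_threshold:
            plate.append(text)
    return plate
```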

  • 01:35:00 In this section, the author shows how to filter out the text in the images effectively using the OCR engine. The filter_text function lets the user specify a threshold for the region, keeping only the text blocks whose size meets or exceeds that threshold. They use this function to extract only the components they need rather than retrieving everything from the plate. They have also created an ocr_it function that combines the OCR filter and OCR components, and it can be applied to any image by passing the image and detections with the corresponding thresholds.

  • 01:40:00 In this section, the instructor tests out their optical character recognition (OCR) method by passing an image through their "ocr_it" function with specified detection and region thresholds. The method is able to accurately extract the license plate and its region in the image. They then demonstrate applying the OCR method in real-time to detect license plates using a try-except block and new code that calls the OCR function for every valid detection. The instructor tests the real-time detection on their phone and shows that it accurately detects and extracts the license plate text, though the results are not perfect and could use some tweaking.

  • 01:45:00 In this section of the video, the instructor explains how to save the output results of the OCR system that they have developed. They create a new function called "save_results" which takes the text, region, file name, and folder as input parameters. They then use the uuid library to generate a unique file name, and the csv library to save the output results to a csv file. They also use the cv2 library to write the image to the folder selected. Finally, they wrap up the code and test it out.
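A minimal sketch of such a save_results helper, assuming the plate text and cropped region come from the OCR step; the CSV and folder names are illustrative:

```python
import csv
import os
import uuid
import cv2

def save_results(text, region, csv_filename='detection_results.csv',
                 folder_path='Detection_Images'):
    """Write the cropped plate image to disk and append (image name, plate text) to a CSV log."""
    os.makedirs(folder_path, exist_ok=True)
    img_name = '{}.jpg'.format(uuid.uuid1())                  # unique name per detection
    cv2.imwrite(os.path.join(folder_path, img_name), region)

    with open(csv_filename, mode='a', newline='') as f:
        csv_writer = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        csv_writer.writerow([img_name, text])
```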

  • 01:50:00 In this section of the video, the instructor demonstrates how to set up a CSV file and folder path to write out all the image results or regions. They create a new folder called detection_images, where all the image results will be written. The instructor shows how to pass through the CSV file name and folder path, and then test the code to save the results. They set up a save results method to write out the results to the created folder, which includes the name of the image as well as the number plate. The instructor then demonstrates how to run the code in real-time to save the results while the detections are happening. Finally, they test out the code, which results in accurately outputting the images while saving the results in real-time.

  • 01:55:00 In this section, the speaker informs the viewers that the OCR function built and filtering algorithm applied in real-time is available to them. He encourages viewers to ask for help if they face any difficulty and mentions that the code will be available on GitHub for them to use. Finally, he thanks the viewers for tuning in, asks them to like, subscribe and comment and signs off.
Automatic Number Plate Recognition using Tensorflow and EasyOCR Full Course in 2 Hours | Python
  • 2021.05.03
  • www.youtube.com
Want to be able to perform number plate recognition in real time?Well in this course you'll learn how to do exactly that!In this video, you'll learn how to l...