Srijana's experience of getting her hands dirty with ML and Drones

Open Knowledge Nepal


Fri Jul 12 2019

This blog post is written by Srijana Raut, an Open Data Women Fellow 2019

Most students, including myself, are eager to find a platform that matches their interests and experience. Since my interest is in Data Science, I decided to apply for the Open Data Women Fellowship 2019. I came to know about the fellowship through widely circulated social media posts and through Women Leaders in Technology (WLiT), one of the knowledge partners of the program. I saw this as an opportunity to figure out how data science is being used in Nepal. The fellowship program was structured into training and placements. The training at 10 different host organizations helped me broaden my knowledge, and the placement gave me the opportunity to explore the working culture of an organization.

I was placed at NAXA for an internship. NAXA is a Geo-IT company based in Kathmandu, Nepal. The company holds specific expertise in digital mapping, geodata management and development of map-centric applications. The primary domain of the company is in collecting and analyzing geo-tagged data to turn them into information that can facilitate better decision making. At NAXA, I got a chance to be involved in detecting and tracking vehicles in a video taken from a drone. My task was to find an appropriate object tracking algorithm.

For tracking objects, we were given video footage captured by drones. Our job was to track how a vehicle is moving, where it is going, and its speed. As the aim was to track multiple objects (vehicles) in the input videos, we decided to use the built-in trackers in OpenCV, a popular computer vision library. A multiple object tracker is simply a collection of single object trackers. We started by defining a function that takes a tracker object. Some of the types of OpenCV tracker that we implemented are:

  2. MIL (Multiple Instance Learning)
  3. KCF (Kernelized Correlation Filter)
  4. TLD (Tracking, Learning and Detection)
  6. MOSSE (Minimum Output Sum of Squared Error)
  7. CSRT
  8. Optical Flow
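
The idea that a multiple object tracker is just a collection of single object trackers can be sketched as follows. This is only an illustration, not the code used at NAXA: the class names `MultiTracker` and `StubTracker` are hypothetical, and the sketch assumes each single tracker exposes `init(frame, bbox)` and `update(frame)` returning `(ok, bbox)`, which is the interface OpenCV trackers follow. `StubTracker` is a stand-in so the example runs without OpenCV.

```python
class StubTracker:
    """Hypothetical stand-in for a single object tracker.

    It mimics the init/update interface of OpenCV trackers but simply
    returns the box it was initialised with.
    """

    def init(self, frame, bbox):
        self.bbox = bbox

    def update(self, frame):
        return True, self.bbox


class MultiTracker:
    """Hypothetical wrapper: one single object tracker per vehicle."""

    def __init__(self, make_tracker):
        # make_tracker is any zero-argument callable that builds a tracker.
        self.make_tracker = make_tracker
        self.trackers = []

    def add(self, frame, bbox):
        tracker = self.make_tracker()
        tracker.init(frame, bbox)
        self.trackers.append(tracker)

    def update(self, frame):
        # Returns one box per tracker, or None where tracking failed.
        boxes = []
        for tracker in self.trackers:
            ok, bbox = tracker.update(frame)
            boxes.append(bbox if ok else None)
        return boxes


multi = MultiTracker(StubTracker)
multi.add(None, (10, 20, 30, 40))   # (x, y, width, height)
multi.add(None, (50, 60, 30, 40))
print(multi.update(None))
```

With OpenCV installed, a constructor such as `cv2.TrackerKCF_create` could be passed in place of `StubTracker`; the exact constructor names depend on the OpenCV version and build, so treat that as an assumption to check.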

Although OpenCV's built-in trackers are among the best available, in our scenario they did not perform well: the video was high resolution, multiple vehicles were moving at once, parts of the scene were in shadow, and so on. We then implemented optical flow computation using the Lucas-Kanade method for tracking. Optical flow is a vector field of the apparent motion of pixels between frames. The Lucas-Kanade method works in a sparse fashion, tracking only certain characteristic feature points rather than every pixel. Furthermore, it is well known that gradient-based methods, such as Lucas-Kanade, are fairly accurate in estimating the direction of motion. The method is also fast and robust across a variety of viewing angles. This showed improved results.

Fig: Optical flow implementation on footage
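
The least-squares step at the heart of the Lucas-Kanade method can be sketched in plain Python. This is not the OpenCV implementation used in the project, only a minimal illustration for a single window under the usual assumptions (small motion, constant brightness). The function name `lucas_kanade_window` and its parameters are hypothetical, and images are plain nested lists of grayscale values.

```python
def lucas_kanade_window(prev, curr, cx, cy, half=2):
    """Estimate the motion (u, v) of the window centred on (cx, cy).

    prev and curr are 2D lists indexed as image[y][x]. The window must
    stay at least one pixel away from the image border.
    Returns None when the 2x2 system cannot be solved reliably.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            # Spatial gradients (central differences) and temporal change.
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            it = curr[y][x] - prev[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Solve [[sxx, sxy], [sxy, syy]] [u, v]^T = [-sxt, -syt]^T.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None  # flat or edge-only window: motion is ambiguous
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Synthetic check: a smooth blob shifted by (0.5, 0.25) pixels.
size = 11
prev = [[(x - 5) ** 2 + (y - 5) ** 2 for x in range(size)] for y in range(size)]
curr = [[(x - 5.5) ** 2 + (y - 5.25) ** 2 for x in range(size)] for y in range(size)]
print(lucas_kanade_window(prev, curr, 5, 5))
```

The `None` case is why feature points are usually chosen at corners: a window with no texture, or with edges in only one direction, does not determine the motion.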


Furthermore, we determined the direction in which each vehicle was moving. We also counted the number of moving vehicles. To count a vehicle, it must be tracked across a line. This means the object must already be tracked before it crosses the line, and the line must be placed where objects will actually cross it.

Fig: Showing the direction of tracked object


Fig: Counting objects that pass through a line
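
A minimal sketch of the direction and line-counting logic, assuming each tracked vehicle is reduced to a list of centre points in frame order. All function names here are hypothetical. Note that this version tests against the infinite line through the two given points rather than a finite segment, and that in image coordinates the y axis points downward.

```python
import math


def side_of_line(point, a, b):
    """Positive on one side of the line a-b, negative on the other, 0 on it."""
    return (b[0] - a[0]) * (point[1] - a[1]) - (b[1] - a[1]) * (point[0] - a[0])


def crossed_line(prev_pt, curr_pt, a, b):
    """True when two consecutive positions lie on opposite sides of the line."""
    return side_of_line(prev_pt, a, b) * side_of_line(curr_pt, a, b) < 0


def heading_degrees(prev_pt, curr_pt):
    """Direction of motion between two positions, in degrees."""
    return math.degrees(math.atan2(curr_pt[1] - prev_pt[1],
                                   curr_pt[0] - prev_pt[0]))


def count_crossings(tracks, a, b):
    """Count tracked objects that cross the line at least once.

    tracks maps an object ID to its list of positions in frame order.
    """
    counted = set()
    for obj_id, points in tracks.items():
        for p, q in zip(points, points[1:]):
            if crossed_line(p, q, a, b):
                counted.add(obj_id)
                break
    return len(counted)


tracks = {
    1: [(10, 0), (10, 40), (10, 60)],   # passes the line y = 50
    2: [(50, 0), (50, 20)],             # never reaches it
}
print(count_crossings(tracks, (0, 50), (100, 50)))
```

Because a crossing is detected from two consecutive positions, the object has to be tracked on both sides of the line, which is why it must be picked up before it reaches the line.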


Also, assigning a unique ID to each tracked object is necessary for counting objects and monitoring their performance. The input is a set of bounding boxes, each given as the x, y, width and height of an object. The coordinates from the previous frame are saved and compared with those of the current frame. This is done by calculating the Euclidean distance between each object's position in the current frame and in the previous frame.


Fig: Assigning Unique ID
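
The distance-based ID assignment can be sketched like this. It is an illustration under stated assumptions rather than the project's code: positions are compared by box centre, matching is greedy nearest-neighbour, and the `max_dist` threshold and all function names are hypothetical.

```python
import math
from itertools import count


def centroid(box):
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def assign_ids(previous, boxes, next_id, max_dist=50.0):
    """Give each box the ID of the nearest box from the previous frame.

    previous maps ID -> (x, y, width, height) from the last frame.
    Boxes with no previous box within max_dist receive a fresh ID.
    """
    current = {}
    unused = dict(previous)
    for box in boxes:
        cx, cy = centroid(box)
        best_id, best_dist = None, max_dist
        for obj_id, old_box in unused.items():
            ox, oy = centroid(old_box)
            dist = math.hypot(cx - ox, cy - oy)  # Euclidean distance
            if dist < best_dist:
                best_id, best_dist = obj_id, dist
        if best_id is None:
            best_id = next(next_id)
        else:
            del unused[best_id]   # each old ID is reused at most once
        current[best_id] = box
    return current


ids = count(1)
frame1 = assign_ids({}, [(0, 0, 10, 10), (100, 100, 10, 10)], ids)
frame2 = assign_ids(frame1, [(102, 101, 10, 10), (3, 1, 10, 10)], ids)
print(frame2)
```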


This, however, results in IDs being swapped between two or more objects if they come close to each other (overlapping). To overcome this problem, we used Intersection over Union (IoU), a metric for the degree of overlap between two shapes. Using IoU, we calculated the overlap ratio between an object's bounding box and all the other existing bounding boxes. If the IoU was greater than 95%, we assumed that it was the same object. This made it possible to assign a unique ID to detected objects across multiple frames.


Fig: Multiple tracked objects with bounding boxes
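
For reference, IoU for two axis-aligned boxes can be computed as in the sketch below. Boxes are assumed to be (x, y, width, height) tuples as described earlier; the function names are hypothetical, and the 0.95 default mirrors the 95% threshold mentioned above.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (negative if disjoint).
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


def same_object(box_a, box_b, threshold=0.95):
    """Treat two boxes as the same object when their IoU exceeds the threshold."""
    return iou(box_a, box_b) > threshold


print(iou((0, 0, 10, 10), (5, 0, 10, 10)))
print(same_object((0, 0, 100, 100), (1, 0, 100, 100)))
```

A threshold this high only matches boxes that barely move between frames, so it suits large boxes and high frame rates; a lower value may be needed for small or fast vehicles.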


One last thing: I got a chance to get involved in creating labeled vehicle data from the provided drone footage to train machine learning models and further improve the precision and accuracy of object detection.

Fig: Using ViTBAT to generate labeled data from video


I used a free annotation tool named Video Tracking and Behavior Annotation Tool (ViTBAT) to create labeled video data. After drawing bounding boxes and labeling them, ViTBAT could be used to export them to CSV and MAT files. The resulting dataset helped to further train the machine learning model. In the coming months, the vehicle detection and tracking method used here will be expanded to swiftly generate the number of vehicles at an intersection in a given time interval, including their trajectories.

I am delighted to have had NAXA as my host organization. The working culture and the teaching and learning experience exceeded my expectations. Special thanks to Open Knowledge Nepal and NAXA Pvt. Ltd. for their constant mentoring and guidance. This fellowship has defined my career path more clearly. My career goal is to work in the field of AI and Data Science. The one-month period felt like only a few days. After this wonderful experience, I am glad to continue my journey as an intern at NAXA.