Sri Datta Budaraju

Masters Student

A Little About Me

Hi, my name is Sri Datta and I am a Masters student in Computer Science at KTH Royal Institute of Technology. I consider myself a curious person and have experimented with different areas of Computer Science, from Android Development to Deep Learning.

Like many other students, I wanted to see what this “AI” was all about and started exploring it in the sophomore year of my bachelor's. Thanks to Andrew Ng’s Machine Learning course and the specialization that had just been released, I was fascinated by the scope of its applications, so I packed my bags and moved to Stockholm to pursue my master's and get really involved. I have spent most of my time taking interesting courses at KTH and working on an autonomous F1 car at KTH Formula Student. I am currently doing my master's thesis on unsupervised 3D human pose estimation.

I am now actively looking for full-time positions in deep learning. Please let me know if you have any leads.


KTH Royal Institute of Technology

Master's degree, Computer Science Major - Machine Learning

2018 – 2020

With origins going back to 1827, KTH Royal Institute of Technology is Sweden’s largest technical research and learning institution and one of Europe’s leading technical and engineering universities.

Eager to get involved in Deep Learning and Computer Vision, I moved far away from home to study at KTH in Sweden. Having planned my two years around many interesting courses and projects, I aim to build the skills and expertise needed to earn opportunities to work on interesting problems in industry.

Amrita Vishwa Vidyapeetham

Bachelor of Technology, Computer Science

2015 – 2018

Established in 2003, Amrita has emerged as one of the fastest-growing institutions of higher learning in India by continuously collaborating with top European universities like KTH and top US universities, including Ivy League universities.

I started my bachelor's with the goal of scoring a high GPA and was placed in the top 3 of the department by the end of the first semester. I then realised that this was not really what I wanted. Thanks to all the faculty who motivated me to go beyond academics, I began exploring and never stopped. I invested a huge amount of time experimenting with literally everything I could get my hands on. Have a look at the projects I talk about in the following sections. Besides that, I also took an active part in the Maths Club and Toastmasters.


Mercedes Benz

Thesis Intern

June 2020 - Dec 2020

Research in unsupervised learning approaches for 3D human pose estimation using VAE-GAN hybrid generative models.
First work to introduce VAE-GAN hybrid networks to the task of 3D pose estimation and one of the few unsupervised methods.
First 2D-to-3D pose lifting approach that addresses the major challenges in scaling to the real-world. The method also outperforms the previous SOTA in the most widely used dataset in similar settings.
Thesis Publication Link

KTH Formula Student - Driverless Formula Racing

Lead Perception Engineer

October 2018 - Present

The lead of Perception – At KTH Formula Student, we practice horizontal leadership. As the lead, I am mainly the point of contact and am responsible for the perception pipeline of the driverless system, which handles object detection using the stereo camera and the lidar. The following are my contributions to the team and to the perception module specifically.
Data Collection – Set up tracks for the F1 car using color-coded traffic cones and used a Velodyne VLP-16 and a ZED stereo camera to collect lidar point clouds and stereo RGB images.
Object detection – Trained versions of YOLO for detection on camera images and am working on improving inference speed on Xavier with TensorRT. Applied angle-based ground removal and clustering to the point clouds.
Calibration – Tried and tested image-based, intensity-based and topography-based approaches to calibrate the lidar and the stereo camera.
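As a rough illustration of the angle-based ground removal mentioned above, here is a minimal sketch in plain Python. It is not the team's implementation: the function name, the sector count, the slope threshold and the sensor height are all assumptions chosen for the example. Points are binned by azimuth, sorted by range, and a point counts as ground when the slope from the last ground point to it stays below the threshold.

```python
import math


def remove_ground(points, sensor_height=0.5, max_slope_deg=10.0, n_sectors=36):
    """Return the points that are NOT labelled as ground.

    points: iterable of (x, y, z) in the sensor frame, with z pointing up.
    All parameter values are illustrative assumptions.
    """
    max_slope = math.radians(max_slope_deg)

    # Bin points into azimuth sectors around the sensor.
    sectors = [[] for _ in range(n_sectors)]
    for p in points:
        azimuth = math.atan2(p[1], p[0]) % (2 * math.pi)
        idx = min(int(azimuth / (2 * math.pi) * n_sectors), n_sectors - 1)
        sectors[idx].append(p)

    obstacles = []
    for sector in sectors:
        # Walk outwards from the sensor along each sector.
        sector.sort(key=lambda p: math.hypot(p[0], p[1]))
        # Assume the ground directly below the sensor as the starting point.
        prev_r, prev_z = 0.0, -sensor_height
        for x, y, z in sector:
            r = math.hypot(x, y)
            slope = math.atan2(z - prev_z, r - prev_r)
            if abs(slope) <= max_slope:
                prev_r, prev_z = r, z  # ground: becomes the new reference
            else:
                obstacles.append((x, y, z))  # too steep: keep as obstacle
    return obstacles


if __name__ == "__main__":
    cloud = [
        (1.0, 0.0, -0.5), (2.0, 0.0, -0.5),    # flat ground
        (3.0, 0.0, -0.2), (3.05, 0.0, -0.1),   # something cone-like
        (4.0, 0.0, -0.48),                     # ground again
    ]
    print(remove_ground(cloud))
```

The remaining points would then go to a clustering step to group them into cone candidates.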

RPL - Robotics, Perception and Learning Lab at KTH

Research Engineer

Sep 2019 - Oct 2019

Research engineer at RPL in Prof. Hedvig Kjellström’s group under Taras Kucherenko. My work was to help Taras prepare the dataset used to train models that learn semantic gesture generation from speech input. This included annotating and correcting audio transcripts generated using BERT, as well as prototyping tools to help with data annotation.

AI without Borders


May 2019 - August 2019

Panoptic Segmentation: Studying the literature on panoptic segmentation and exploring ways to use it on medical images, especially for cancer cell detection, along with methods for creating datasets suitable for the task.


Proposal Reviewer

January 2019 - February 2019

Scrutiny: Assessed the feasibility and scalability of AI proposals by companies competing in the Google AI Impact Challenge for social good, and evaluated the candidates against various scoring metrics.

Amrita Multidimensional Data Analytics Lab

Research Assistant

March 2017 – March 2018

Had the pleasure of collaborating with Dr. Vidhya Balasubramanian, Ph.D., UCI. Worked on indoor localization using dual-band WiFi routers and Bluetooth Low Energy beacons. Studied WiFi patterns in complex indoor environments and analyzed the trends in the 2.4 GHz and 5 GHz bands. Designed algorithms to localize users without explicit fingerprinting and developed a mobile application to provide real-time indoor mapping services.


Campus Ambassador

August 2017 - August 2018

Established a partnership between Amrita and GeeksForGeeks. Organized hands-on workshops and trained around 200 fellow students to get started in Android development. Collaborated with the best of them on an official application for the university. Created a platform for GeeksForGeeks to take part in tech talks at Amrita as part of the national tech fest, Anokha. Also handled a few other management and marketing activities.


Binarized Neural Networks

Oct 2019 - Dec 2019

Project for NeurIPS19 Reproducibility challenge

Reimplemented and reproduced the paper "Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization" from scratch. The pipeline uses PyTorch and PyTorch Lightning to ensure reproducibility. Reimplemented the binary layers and the novel binary optimizer, BOP. Ran experiments comparing the performance of BNNs with latent weights against BNNs without latent weights trained using BOP, and experiments studying the effects of BOP's threshold and adaptivity rate hyperparameters.
PyTorch PyTorch Lightning
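For readers unfamiliar with BOP, the following is a minimal plain-Python sketch of a single update step as I understand it from the paper: an exponential moving average of the gradient is kept per weight, and a binary weight is flipped when that average is large enough and has the same sign as the weight. It is not the project's PyTorch code; the function name `bop_step` and the default values of `gamma` (adaptivity rate) and `tau` (threshold) are illustrative assumptions.

```python
def bop_step(weights, grads, momentum, gamma=1e-3, tau=1e-6):
    """Apply one BOP-style update in place and return (weights, momentum).

    weights:  list of binary weights, each +1.0 or -1.0
    grads:    list of gradients, one per weight
    momentum: list holding the moving average of past gradients
    gamma:    adaptivity rate of the moving average (illustrative value)
    tau:      flip threshold (illustrative value)
    """
    for i, g in enumerate(grads):
        # Exponential moving average of the gradient.
        momentum[i] = (1 - gamma) * momentum[i] + gamma * g
        m = momentum[i]
        # Flip only when the signal is strong enough and points the same
        # way as the weight, i.e. the loss would decrease after the flip.
        if abs(m) >= tau and (m > 0) == (weights[i] > 0):
            weights[i] = -weights[i]
    return weights, momentum


if __name__ == "__main__":
    w, m = bop_step([1.0, -1.0, 1.0], [1.0, 1.0, -1.0], [0.0, 0.0, 0.0],
                    gamma=0.5, tau=0.1)
    print(w, m)
```

A larger `tau` makes flips rarer, while a larger `gamma` makes the moving average react faster to recent gradients, which is the trade-off the hyperparameter experiments above explore.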


Sep 2019 - Dec 2019

Part of the work at the RPL lab at KTH

A dataset for classifying and localizing clothing in deformed states produced by robotic manipulation. It is a novel dataset that differs from existing fashion datasets, is well suited to robotic applications, and includes additional depth information from stereo cameras. Prepared a diverse collection of clothes across various categories. Used a Baxter robot to automate the manipulation of the clothes into various states. Synchronized RealSense stereo cameras, one on the body of the Baxter and one facing the robot. Manually picked frames with clothes in key states and annotated them with bounding boxes. Performed various experiments to evaluate the dataset, including exploring the effects of pre-trained models, incorporating DeepFashion2, and the effects of complex models with an FPN backbone, using PyTorch.
PyTorch PyTorch Lightning

YT Trends

Sep 2019 - Oct 2019

Scalable pipeline for live YouTube trend visualization

Live data is fetched using the YouTube Data API. A Kafka producer streams the query results to a topic. PySpark consumes the stream topic and performs the desired operations. The results of the PySpark operations are stored in spark-warehouse, ready for visualization.
PySpark Kafka

Deep Image Colorization

April 2019 - May 2019

Colorization of B&W images using Deep Learning

Uses an autoencoder network to predict color given a grayscale image, i.e. it predicts the a and b channels given the L channel in the Lab color space. Leverages the power of pre-trained networks to effectively extract important low-level features from the input grayscale image. These low-level features are then fused with the low-level features from the encoder and provided as the input to the decoder.
Keras ImageNet GCP

Classical Piano Composer Recognition

March 2019 - June 2019

Piano music composer recognition from audio clips

Exploring the relations between the composers and compositions using Mel-scaled spectrogram and leveraging the power of convolutions to learn the low-level features and the temporal information using memory units like LSTMs and GRUs. This enables our models to not only understand the notes but also to learn the melody.
Keras TensorFlow Magenta

Swedish Traffic Sign Detection for Autonomous Drone

February 2019 - May 2019

Swedish Traffic Sign Detection in ROS

Collected training data of 15 Swedish traffic signs from drone footage and integrated the trained YOLO model in ROS for detections on the drone’s live feed. Extracted the edges of signs and used Perspective-n-Point to estimate their 3D position.
Darknet Python ROS Google Colabs

Wav2Letter++ on Swedish Speech

February 2019 - March 2019

Facebook Wav2Letter++ ASR on Swedish Speech

Pre-processed NST’s acoustic database for Swedish to suit the Wav2Letter architecture, generated the language model and tokens, and tuned hyperparameters to study how W2L works on the Swedish language.

W2L Python NST

Accident Anticipation

April 2018 - July 2018

Real-time accident detection in video feeds using Hierarchical Recurrent Neural Networks with LSTM cells. Trained on hand-sampled clips from YouTube. Part of my bachelor thesis.

Built a deep learning model to detect car accidents in video clips:

  • Implemented and tweaked YOLOv2 to detect cars in highway cams
  • Wrote scripts to scrape video clips from YouTube
  • Collected a dataset of 200+ accident/non-accident videos
  • Preprocessed the dataset to lessen the burden on the machine
  • Explored pre-trained models from the internet with the gathered videos
  • Experimented with a few DL models for classification
  • Implemented HRNN with LSTM to classify the video clips

Python Keras Google Colabs

November 2017 - March 2018

Projects as a part of the Deep Learning Specialization

Major projects:

  • Logistic Regression for image classification
  • Deep Neural Network for Cat or Not a Cat classification
  • Convolutional Neural Network using TensorFlow for Hand Sign Recognition
  • Happy House using Keras for Facial Expression Classification
  • Residual Networks for Hand Sign Recognition
  • Autonomous Driving Application, car detection using YOLO
  • Face Recognition for the Happy House using the FaceNet architecture and OpenFace model
  • Art Generation with Neural Style Transfer using ImageNet VGG 16 Very Deep ConvNet
  • Dinosaurus Name Generation using a character-level language model
  • Shakespearian poem generator using RNN LSTM in Keras
  • Music Generation, Jazz Solo with an LSTM Network

Python Keras TensorFlow

Safe Rider - Road Assistant

October 2017 - January 2018

Road safety application that warns users of road hazards in real time. Detects hazards using on-board IMU sensors and crowd-sources the data to alert other users. Built the whole working application during a 24-hour hackathon.

Developed a road safety application that warns users of road hazards in real time:

  • Implemented algorithms to detect road hazards using IMU sensors
  • Analyzed the IMU sensor trends from a bike driven on a bad road
  • Crowd-sourced the detections to Firebase Cloud Database
  • Integrated voice assistance to warn the user when approaching a detection
  • Built the whole working app during a 24-hour hackathon conducted by the Internet and Mobile Association of India, in which we won 1st place
  • Extended the project with further experiments after the hackathon

Java Android Studio Google Maps Navigations APIs Firebase


September 2017 - October 2017

Sentiment analysis tool to analyze real-time trends of keywords on Twitter. Classified live tweets from the Twitter API using ensemble modelling and implemented live graphical visualizations.

A sentiment analysis tool to analyze real-time trends of keywords on Twitter:

  • Fetched positive and negative example sentences from online tutorials
  • Used the Twitter API to fetch live tweets based on the keyword passed
  • Wrote functions to pass only the adjectives from the tweets into the classifiers
  • Classified each tweet based on the combined confidence of the classifiers
  • Built an interface window for the tool using TKinter
  • Integrated live graphical visualizations of the sentiment for the keyword

Python NLTK Sci-Kit Tweepy TKinter

Indoor Localization and Navigation

March 2017 - March 2018

Indoor positioning system that localizes users using dual-band WiFi routers and Bluetooth Low Energy beacons without any manual fingerprinting of the environment.

Python Java Android Studio


May 2017 - May 2017

One of the simplest yet most elegant projects I have done so far. Wish I had some demos.

Built an object detection and avoidance robot using Lego Mindstorms:

  • Integrated the LEGO brick with IR, ultrasonic, touch and sound sensors to achieve the required functions
  • Course correction, plus reacting to and searching for sound sources like claps

Lego Mindstorms Lego NXT


January 2017 - April 2017

An emergency train stopping cum surveillance system designed for Indian Railways. Selected among the top 30 ideas by Amrita TBI, India's best Technological Business Incubator.

Built an emergency train stopping cum surveillance system:

  • Devised an emergency button and camera module using an RPi
  • The RPi module alerts the onboard authority when the button is triggered
  • The data is then broadcast to the authority’s device over a local network
  • Created a hotspot to deal with network issues in remote areas
  • The data is then uploaded from the device to the cloud for official purposes
  • The project was placed in the top 30 ideas by Amrita TBI, India’s best Technological Business Incubator

Python Java Android Studio Raspberry Pi Arduino


November 2017 - March 2018

An Android app that I built along with two of my juniors, whom I trained as part of a workshop that I conducted.

Full-fledged portal for the faculty of Amrita that helps them with their everyday tasks:

  • Interactive timetable management
  • Profile management
  • Private and group chat
  • Completely cloud based
  • Zero need for on-device storage
  • Multiple authentication options
  • Networking for task management between users, and many more interesting features
  • Material Design UI

Google Cloud Firebase Firestore Database Material Design Java Android Studio SQL

Book Crawler

November 2016 - January 2017

My very first project, and one I was very passionate about. It is one of the reasons I got into indoor localisation and worked at AMUDA labs.

An Android application for visualizing the location of a book in the library. It uses the information in the university database to map the location.

Java Android Studio SQL

A Little More About Me

Alongside my interests in Deep Learning and Autonomous Systems, some of my other interests and hobbies are:

  • Watching Movies (a lot!)
  • Listening to Music
  • Admiring the beauty of nature