A Little About Me
Hi, my name is Sri Datta and I am a Master's student in Computer Science at KTH Royal Institute of Technology. I consider myself a curious person and have experimented with different areas of Computer Science, from Android development to deep learning.
Like many other students, I wanted to see what this “AI” was all about, so I started exploring it in the sophomore year of my bachelor's. Thanks to Andrew Ng’s Machine Learning course and the then newly released DeepLearning.ai specialization, I became fascinated by the scope of its applications and packed my bags for Stockholm to pursue my master's and get really involved. I have spent the majority of my time taking interesting courses at KTH and working on an autonomous F1 car at KTH Formula Student. I am currently doing my master's thesis on unsupervised 3D human pose estimation.
I am now actively looking for full-time positions in deep learning. Please let me know if you have any leads.
KTH Royal Institute of Technology
Master's degree, Computer Science Major - Machine Learning
2018 – 2020
With origins going back to 1827, KTH Royal Institute of Technology is Sweden’s largest technical research and learning institution and one of Europe’s leading technical and engineering universities.
To get involved in Deep Learning and Computer Vision, I moved far away from home to Sweden to study at KTH. Having planned my two years around many interesting courses and projects, I hope to equip myself with ample skills and expertise to earn opportunities to work on interesting problems in industry.
Amrita Vishwa Vidyapeetham
Bachelor of Technology, Computer Science
2015 – 2018
Established in 2003, Amrita has emerged as one of the fastest-growing institutions of higher learning in India by continuously collaborating with top European universities such as KTH and top US universities, including Ivy League universities.
I started my bachelor's with the goal of scoring a high GPA and was placed in the top 3 of the department by the end of the first semester. I then realised that this was not really what I wanted. Thanks to all the faculty who motivated me to go beyond academics, I began exploring and never stopped. I invested a huge amount of time experimenting with literally everything I could get my hands on. Have a look at the projects I talk about in the following sections. Besides that, I also took an active part in the Maths Club and Toastmasters.
June 2020 - Dec 2020
Research in unsupervised learning approaches for 3D human pose estimation using VAE-GAN hybrid generative models.
First work to introduce VAE-GAN hybrid networks to the task of 3D pose estimation and one of the few unsupervised methods.
First 2D-to-3D pose lifting approach that addresses the major challenges in scaling to the real world. The method also outperforms the previous SOTA on the most widely used dataset in similar settings.
Thesis Publication Link
Lead Perception Engineer
October 2018 - Present
The lead of Perception – At KTH Formula Student, we practice horizontal leadership. As the lead, I’m mainly the point of contact and responsible for the perception pipeline of the driverless system, which handles object detection using the stereo camera and the lidar. The following are my contributions to the team and to the perception module specifically.
Data Collection – Set up tracks for the F1 car using color-coded traffic cones and used a Velodyne VLP-16 and a ZED stereo camera to collect lidar point clouds and stereo RGB images.
Object detection – Trained versions of YOLO for detection on camera images and am working to improve inference speed on Xavier with TensorRT. Implemented angle-based ground removal and clustering on the point clouds.
Calibration – Tried and tested image-based, intensity-based and topography-based approaches to calibrate the lidar and the stereo camera.
Sep 2019 - Oct 2019
Research engineer at RPL in Prof. Hedvig Kjellström’s group under Taras Kucherenko. My work was to help Taras with the dataset preparation process for training models to learn semantic gesture generation from speech inputs. This included annotating and rectifying audio transcripts generated using BERT, as well as prototyping tools to help with data annotation.
May 2019 - August 2019
Panoptic Segmentation: Studying the literature on panoptic segmentation and exploring ways to use it on medical images, especially for cancer cell detection, as well as methods for creating datasets suitable for the task.
January 2019 - February 2019
Scrutiny: Assessing the feasibility and scalability of AI proposals by companies competing in the Google AI Impact Challenge for Social Good and evaluating the candidates against various scoring metrics.
March 2017 – March 2018
Had the pleasure of collaborating with Dr. Vidhya Balasubramanian, Ph.D., UCI. Worked on indoor localization using dual-band WiFi routers and Bluetooth Low Energy beacons. Studied the WiFi patterns in complex indoor environments and analyzed the trends in the 2.4 GHz and 5 GHz bands. Designed algorithms to localize users without explicit fingerprinting and developed a mobile application to provide real-time indoor mapping services.
August 2017 - August 2018
Established a partnership between Amrita and GeeksForGeeks. Organized hands-on workshops and trained around 200 fellow students to get started in Android development. Collaborated with the best of them on an official application for the university. Created a platform for GeeksForGeeks to take part in tech talks at Amrita as part of the national tech fest, Anokha. Also handled a few other management and marketing activities.
Project for NeurIPS19 Reproducibility challenge
Reimplementing and reproducing the paper Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization from scratch. The pipeline uses PyTorch and PyTorch Lightning to ensure reproducibility. Reimplemented the binary layers and the novel binary optimizer, BOP. Ran experiments comparing the performance of BNNs with latent weights against BNNs without latent weights using BOP, and experiments studying the effects of BOP's thresholding and adaptivity rate hyperparameters.
PyTorch PyTorch Lightning
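As a rough illustration of the optimizer mentioned above (not the project's actual PyTorch code), the BOP update rule from the paper can be sketched in plain Python. The optimizer keeps an exponential moving average of the gradient and flips a binary weight when that average exceeds the threshold and has the same sign as the weight. The function name `bop_step` and the flat-list representation are assumptions made for this sketch.

```python
def bop_step(weights, grads, momentum, threshold, adaptivity_rate):
    """One BOP-style update over flat lists of binary weights (+1 or -1).

    Illustrative sketch only: `bop_step` and its arguments are hypothetical
    names, not taken from the project repository.
    """
    new_weights, new_momentum = [], []
    for w, g, m in zip(weights, grads, momentum):
        # Exponential moving average of the gradient (adaptivity rate = gamma).
        m = (1.0 - adaptivity_rate) * m + adaptivity_rate * g
        # Flip the weight only if the averaged gradient is strong enough
        # (above the threshold tau) and points in the same direction as w.
        if abs(m) > threshold and (m > 0) == (w > 0):
            w = -w
        new_weights.append(w)
        new_momentum.append(m)
    return new_weights, new_momentum


# Toy example with three binary weights and made-up gradients.
example_weights, example_momentum = bop_step(
    weights=[1, -1, 1],
    grads=[1.0, 1.0, 0.0],
    momentum=[0.0, 0.0, 0.0],
    threshold=0.05,
    adaptivity_rate=0.1,
)
```

A larger threshold makes flips rarer, while a larger adaptivity rate makes the moving average react faster to recent gradients. These are the two hyperparameters the experiments above study.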
Project carried out as part of the work at the RPL lab at KTH
A dataset for classifying and localizing clothing in deformed states produced by robotic manipulation. This novel dataset differs from existing fashion datasets, could be well suited for robotic applications, and includes additional depth information from stereo cameras. Prepared a diverse collection of clothes across various categories. Used a Baxter robot to automate the manipulation of the clothes into various states. Synchronized RealSense stereo cameras, one on the body of the Baxter and one facing the robot. Manually picked frames with clothes in key states and annotated them with bounding boxes. Performed various experiments to evaluate the dataset, including exploring the effects of pre-trained models, incorporating DeepFashion2, and the effects of complex models with an FPN backbone using PyTorch.
PyTorch PyTorch Lightning
Scalable pipeline for live YouTube trend visualization
Live data is fetched using the YouTube Data API. A Kafka broker acts as the producer that streams the queries. PySpark consumes the stream topic and performs the desired operations. The results of the PySpark operations are stored in spark-warehouse, ready for visualization.
PySpark Kafka
Colorization of B&W images using Deep Learning
Uses an autoencoder network to predict color given a grayscale image, i.e. it predicts the a and b channels given the L channel in the Lab color space. Leverages the power of pre-trained networks to effectively extract important low-level features from the input grayscale image. These features are then fused with the low-level features from the encoder and provided as the input to the decoder.
Keras ImageNet GCP
Piano music composer recognition from audio clips
Exploring the relations between the composers and compositions using Mel-scaled spectrogram and leveraging the power of convolutions to learn the low-level features and the temporal information using memory units like LSTMs and GRUs. This enables our models to not only understand the notes but also to learn the melody.
Keras TensorFlow Magenta
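For readers unfamiliar with the Mel scale used in the spectrograms above, here is a minimal sketch of the frequency mapping. It assumes the common HTK-style formula, one of several conventions in use; the project itself may have relied on a library default. The function names are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands, f_min, f_max):
    """Edges (in Hz) of n_bands filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Band edges for a toy 4-band filter bank between 0 Hz and 8 kHz.
edges = mel_band_edges(4, 0.0, 8000.0)
```

Because the bands are evenly spaced in mels rather than in Hz, they are narrow at low frequencies and wide at high frequencies, which roughly mirrors how pitch is perceived.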
Swedish Traffic Sign Detection in ROS
Collected training data of 15 Swedish traffic signs from drone footage and integrated the trained YOLO model in ROS for detections on the drone’s live feed. Extracted the edges of the signs and used Perspective-n-Point to estimate their 3D positions.
Darknet Python ROS Google Colab
Facebook Wav2Letter++ ASR on Swedish Speech
Pre-processed NST’s acoustic database for Swedish to suit the Wav2Letter architecture, generated the language model and tokens, and tuned hyperparameters to study how W2L works on the Swedish language.
W2L Python NST
Real-time accident detection in video feeds using Hierarchical Recurrent Neural Networks with LSTM cells. Trained on hand-sampled clips from YouTube. Part of my bachelor thesis.
Built a deep learning model to detect car accidents in video clips:
- Implemented and tweaked YOLOv2 to detect cars in highway cams
- Wrote scripts to scrape video clips from YouTube
- Collected a dataset of 200+ accident/non-accident videos
- Preprocessed the dataset to lessen the burden on the machine
- Explored pre-trained models from the internet with the gathered videos
- Experimented with a few DL models for classification
- Implemented HRNN with LSTM to classify the video clips
Python Keras Google Colab
Projects as a part of the Deep Learning Specialization
Major projects are:
- Logistic Regression for Image classification
- Deep Neural Network for Cat or Not a Cat classification
- Convolutional Neural Network using TensorFlow for Hand Sign Recognition
- Happy House using Keras for Facial Expression Classification
- Residual Networks for Hand Sign Recognition
- Autonomous Driving Application, Car detection using YOLO on Drive.ai dataset
- Face Recognition for the Happy House using FaceNet Architecture and OpenFace Model
- Art Generation with Neural Style Transfer using ImageNet VGG-16 Very Deep ConvNet
- Dinosaurus Name Generation using Character level language model
- Shakespearian poem generator using RNN LSTM in Keras
- Music Generation, Jazz Solo with an LSTM Network
Python Keras TensorFlow
Road safety application that warns users of road hazards in real time. Detects hazards using on-board IMU sensors and crowd-sources the data to alert other users. Built the whole working application during a 24-hour hackathon.
Developed a road safety application that warns users of road hazards in real time:
- Implemented algorithms to detect road hazards using IMU sensors
- Analyzed the IMU sensor trends from a bike driven on a bad road
- Crowd-sourced the detections to Firebase Cloud Database
- Integrated voice assistance to warn the user when approaching a detection
- Built the whole working app during a 24-hour hackathon conducted by the Internet and Mobile Association of India, in which we won 1st place
- Extended the project with further experiments after the hackathon
Java Android Studio Google Maps Navigations APIs Firebase
Sentiment analysis tool to analyze real-time trends of keywords on Twitter. Classified live tweets from the Twitter API using ensemble modelling and implemented live graphical visualizations.
A sentiment analysis tool to analyze real-time trends of keywords on Twitter:
- Fetched positive and negative example sentences from online tutorials
- Used the Twitter API to fetch live tweets based on the keyword passed
- Wrote functions to pass only the adjectives from the tweets into the classifiers
- Classified each tweet based on the combined confidence of the classifiers
- Built an interface window for the tool using Tkinter
- Integrated live graphical visualizations of the sentiments of the keyword
Python NLTK scikit-learn Tweepy Tkinter
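The ensemble step above can be illustrated with a small majority-vote sketch. This is an assumption about how the classifiers might be combined, not the tool's actual code: each classifier is modelled as a function returning "pos" or "neg", and the confidence is taken to be the fraction of classifiers that agree with the winning label. All names here are hypothetical.

```python
from collections import Counter


def ensemble_vote(classifiers, features):
    """Return (label, confidence) from a majority vote over classifiers.

    `classifiers` is a list of callables mapping features to a label.
    Confidence is the share of classifiers that voted for the winner.
    """
    votes = [clf(features) for clf in classifiers]
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


# Three toy "classifiers" standing in for trained models (hypothetical).
def likes_good(words):
    return "pos" if "good" in words else "neg"


def dislikes_bad(words):
    return "neg" if "bad" in words else "pos"


def always_negative(words):
    return "neg"


label, confidence = ensemble_vote(
    [likes_good, dislikes_bad, always_negative],
    features={"good", "great"},
)
```

Using an odd number of classifiers avoids ties, and a tweet could be skipped when the confidence falls below a chosen cut-off so that only clear-cut sentiments reach the live graph.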
Indoor positioning system that localizes users using dual-band WiFi routers and Bluetooth Low Energy beacons without any manual fingerprinting of the environment.
Python Java Android Studio
May 2017 - May 2017
One of the simplest yet most elegant projects I have done so far. I wish I had some demos.
Built an object detection and avoidance robot using Lego Mindstorms:
- Integrated the LEGO brick with IR, ultrasonic, touch and sound sensors to achieve the required functions
- Course correction, plus reacting to and searching for sound sources like claps
Lego Mindstorms Lego NXT
An emergency train stopping cum surveillance system designed for Indian Railways. Ranked among the top 30 ideas by Amrita TBI, India's best Technological Business Incubator.
Built an emergency train stopping cum surveillance system:
- Devised an emergency button-cam module using an RPi
- The RPi module alerts the onboard authority when the button is triggered
- The data is then broadcast to the authority’s device over a local network
- Created a hotspot to deal with network issues in remote areas
- The data is then uploaded from the device to the cloud for official purposes
- The project was placed in the top 30 ideas by Amrita TBI, India’s best Technological Business Incubator
Python Java Android Studio Raspberry Pi Arduino
An Android app that I built along with 2 of my juniors, whom I trained as part of a workshop that I conducted.
Full-fledged portal for the faculty of Amrita, helping them with their everyday tasks:
- Interactive timetable management
- Profile management
- Private and group chat
- Completely cloud based
- Zero need for on-device storage
- Multiple authentication options
- Networking for task management between users, and many more interesting features
- Material design UI
Google Cloud Firebase Firestore Database Material Design Java Android Studio SQL
My very first project, and one that I was so passionate about. This is one of the reasons I got into indoor localisation and worked in AMUDA labs.
An Android application for visualizing the location of a book in the library:
- Uses the information in the university database and maps the location
Java Android Studio SQL
A Little More About Me
Alongside my interests in Deep Learning and Autonomous Systems, some of my other interests and hobbies are:
- Watching Movies (a lot!)
- Listening to Music
- Admiring the beauty of nature