|
Reasoning over Incomplete Knowledge Graph via Graph Structure Learning
Guide: Prof. Yizhou Sun, UCLA
Code
Report
Formulated the idea of graph structure learning and proposed an algorithm to solve the problem of Knowledge Graph Completion.
Used K-hop positive and negative edge sampling to learn the graph structure effectively. Used multi-head attention to learn better node representations with the CompGCN model in a relation-graph setup, and showed its efficacy empirically on the FB15K-237 and WN18RR datasets.
|
|
Deep Weakly-Supervised High Speed High Dynamic Range Video Generation
Guide: Prof. Shanmuganathan Raman, IIT Gandhinagar
Abstract
Video
Devised a weakly supervised deep learning framework to generate high-frame-rate High Dynamic Range (HDR) video
from a sequence of low-frame-rate, alternating-exposure Low Dynamic Range (LDR) frames.
|
|
Few Shot Class Incremental Learning
Guide: Prof. Subhasis Chaudhuri, Prof. Biplab Banerjee, IIT Bombay
Code
Report
A novel GAN-based architecture that generates pseudo-prototypes for each class to avoid catastrophic forgetting in the incremental setup.
|
|
Non Stationary Bandits with Periodic Variation
Guide: Prof. D. Manjunath, IIT Bombay
Under Progress
Introduced a new setting in non-stationary bandits in which the means of the arms vary periodically. Proposed two new algorithms for the perfectly periodic setting, D-PUCB and SW-PUCB, relying on discounted and
sliding-window approaches respectively, and showed logarithmic regret, validated by their performance on synthetic data.
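The sliding-window idea mentioned above can be illustrated with a minimal sketch of a generic sliding-window UCB rule. This is not SW-PUCB itself: the function names, the exploration constant `c`, the window length and the periodic Bernoulli environment are all assumptions made for illustration.

```python
import math
import random
from collections import deque


def sw_ucb(reward_fn, n_arms, horizon, window, c=2.0, seed=0):
    """Generic sliding-window UCB: statistics use only the last `window` pulls."""
    rng = random.Random(seed)
    history = deque()          # (arm, reward) pairs currently inside the window
    counts = [0] * n_arms      # pulls per arm inside the window
    sums = [0.0] * n_arms      # reward sum per arm inside the window
    choices = []
    for t in range(horizon):
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            # An arm with no sample in the window is (re)explored first.
            arm = untried[0]
        else:
            span = min(t, window)  # number of samples currently in the window
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(c * math.log(span) / counts[a]),
            )
        reward = reward_fn(arm, t, rng)
        history.append((arm, reward))
        counts[arm] += 1
        sums[arm] += reward
        if len(history) > window:
            # Forget the oldest observation once the window is full.
            old_arm, old_reward = history.popleft()
            counts[old_arm] -= 1
            sums[old_arm] -= old_reward
        choices.append(arm)
    return choices


def periodic_bernoulli(arm, t, rng, period=200):
    """Hypothetical environment: the two arms swap means every half period."""
    first_half = (t % period) < period // 2
    mean = 0.8 if (arm == 0) == first_half else 0.2
    return 1.0 if rng.random() < mean else 0.0


choices = sw_ucb(periodic_bernoulli, n_arms=2, horizon=2000, window=50)
```

Because old observations leave the window, the index of each arm tracks its recent mean rather than its lifetime average, which is what lets the rule follow changing means.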
|
|
Offline Voice Commanding in Microsoft Word App
Guide: Abhishek Agarwal, Microsoft R&D India
Paper
Code
Report
Developed and integrated a size-optimized, dynamically downloadable Entity Recognizer and Intent Classifier model
to enable offline voice commanding in the Microsoft Word app. Work accepted at MLADS Synapse 2020.
|
|
Data Generation for Person Intrusion Detection Using Human Pose Transfer
Guide: Prof. Shanmuganathan Raman, IIT Gandhinagar
Architecture
2-way GAN for Human Pose Transfer, conditioned on an input image and a target pose, to generate a large number of fake human images in different poses and varied backgrounds.
|
|
Game Theoretic Approach to Optimal Network Allocation
Guide: Prof. Prasanna Chaporkar, IIT Bombay
Code
Report
Modelled the NP-hard Optimal Network Allocation problem as a game and proved that it is an exact potential game. Compared graphically the convergence of the potential functions of three algorithms: Best Response Dynamics (BRD), Spatial Adaptive Play (SAP) and Concurrent-SAP, on simulated randomized inputs that emulate a real-world scenario.
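As a rough illustration of best response dynamics in an exact potential game, the sketch below uses a toy congestion game. This is an assumption, not the network allocation model of the project: each player picks one resource and pays the number of players sharing it. The potential used is Rosenthal's potential for congestion games.

```python
import random


def potential(loads):
    """Rosenthal potential: sum over resources of 1 + 2 + ... + load."""
    return sum(n * (n + 1) // 2 for n in loads)


def best_response_dynamics(n_players, n_resources, seed=0, max_rounds=1000):
    rng = random.Random(seed)
    choice = [rng.randrange(n_resources) for _ in range(n_players)]
    loads = [0] * n_resources
    for r in choice:
        loads[r] += 1
    trace = [potential(loads)]
    for _ in range(max_rounds):
        improved = False
        for p in range(n_players):
            cur = choice[p]
            # Cost of staying is loads[cur]; cost of moving to r is loads[r] + 1.
            best = min(range(n_resources), key=lambda r: loads[r] + (r != cur))
            if loads[best] + (best != cur) < loads[cur]:
                loads[cur] -= 1
                loads[best] += 1
                choice[p] = best
                trace.append(potential(loads))
                improved = True
        if not improved:
            break  # no player can improve: a pure Nash equilibrium
    return choice, loads, trace


choice, loads, trace = best_response_dynamics(n_players=20, n_resources=4)
```

Every improving move lowers the potential by exactly the amount the moving player saves, so `trace` is strictly decreasing and the loop must terminate.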
|
|
OSR - Open Set Recognition using Side Information
Guide: Prof. Biplab Banerjee, IIT Bombay
Code
Learnt a discriminative dictionary for sparse coding via Label Consistent K-SVD (LC-KSVD), followed by a nonlinear feature extraction method, the Kernel Null Foley-Sammon Transform (KNFST), for classifying open-set samples on MNIST.
|
Projects
|
|
A Sparse Hard-label Black-box Attack using Accelerated Proximal Gradient Methods
Course: CS 269: Adversarial Learning
Code
Report
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels (regularized by the l0 norm). The resulting sparse and imperceptible attacks are practically relevant and indicate an even higher vulnerability of DNNs than we usually imagined, as they pose a practical threat to real-world systems. However, such attacks are more challenging to generate because coupling the l0 regularizer and box constraints with a non-convex objective makes the optimization difficult. Moreover, such an attack leads to an NP-hard optimization problem. To make it more challenging, we restrict ourselves to l-infinity imperceptibility on the perturbation magnitudes while solving this optimization problem. We develop a novel proximal-gradient-based algorithm, SparseAPG, based on the homotopy algorithm of Schott et al. [2019], which works in the hard-label, black-box, untargeted setup to solve the above optimization problem and generate human-imperceptible sparse adversarial images.
|
|
An Analysis of Compression Methods for Deep learning networks
Course: CS 259: Learning Machines
Code
Report
We aim to quantify the performance of various network compression algorithms by applying them to deep learning models from the literature, such as ResNet, on the basis of metrics like accuracy, inference time and model size. We then analyze the combined performance of the various models and the effect on ensembling. Finally, we summarize the results as takeaway messages and explore the reason behind the inference boost through core kernel analysis on the Titan V GPU. Standalone compression methods produced up to a 40x reduction in model size with under 5% accuracy loss. Combining the compression methods appropriately gave a 215x decrease in model size with just an additional 1% loss in accuracy.
|
|
MARS-GM: Multi Headed Recommendation System using Graphical Modeling
Course: CS 249: Special Topics - Advanced Data Mining
Code
Report
Slides
GraphRec was introduced to model interactions between users and items in a systematic way. Building on this network, we incorporate the state-of-the-art attention mechanism, inspired by the Transformer architecture, to learn attention weights between the elements of this graph so that embeddings are learnt coherently. We further introduce adversarial learning by injecting noise into the graph to model real-world datasets and avoid overfitting. We add item-item interaction to improve the performance of the model on a new dataset, "FilmTrust".
|
|
Emotional-Talking-Face-Generation-Using-Deformable-Convolutional-Networks
Course: CS269 - Advanced Topics in Artificial Intelligence: Deformable Models in Computer Vision
Code
Report
Talking face generation is the process of generating a video of a person speaking the input audio, with lip and facial features synced appropriately at each time step. In this work, we propose a hybrid approach by adding deformable layers and attention modules. We further compare our approach with purely deep learning based approaches and contrast them using the structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) metrics.
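For reference, PSNR between a ground-truth frame and a generated frame can be computed as in the sketch below. It assumes frames are given as flat sequences of 8-bit pixel values (hence `max_value=255.0`). The function name and input format are illustrative and are not taken from the project code.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


frame_a = [0, 64, 128, 255]
frame_b = [0, 60, 130, 250]
score = psnr(frame_a, frame_b)
```

Higher values mean the generated frame is closer to the reference. Identical frames give an infinite PSNR.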
|
|
Genomic_Imputation_using_Deep_Learning
Course: M226 - Machine Learning for Bioinformatics
Code
Report
Video
Genome imputation refers to the statistical inference of unobserved genotypes. In this work, we frame genome imputation as a language generation task and train existing deep learning approaches in an autoregressive fashion, achieving superior results compared to previous approaches.
|
|
Semantic Image Inpainting using DCGAN
Course: CS736 - Medical Image Computing
Code
Slides
Implementation of the paper "Semantic Image Inpainting". Performed image inpainting by finding an optimal latent vector that lies on the latent image manifold and is closest to the given
corrupted image, using context and prior losses, followed by Poisson blending.
|
|
3D Object Detection and Semantic Map Generation
Robotic Vision Scene Understanding Challenge 2021
Guide: Prof. Sharat Chandran, IIT Bombay, Course CS763: Computer Vision
Code
Slides
Using RGB and depth images from the bot's traversal, performed 3D object detection leveraging object detection networks.
Created a 3D semantic map of the environment with bounding boxes around each object using a 3D NMS algorithm.
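A minimal sketch of greedy 3D non-maximum suppression is given below. It assumes axis-aligned boxes written as `(x1, y1, z1, x2, y2, z2)` with one confidence score each. The box format, threshold and function names are assumptions for illustration and may differ from the project code.

```python
def volume(box):
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])


def iou_3d(a, b):
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for i in range(3):
        overlap = min(a[i + 3], b[i + 3]) - max(a[i], b[i])
        if overlap <= 0:
            return 0.0  # boxes are disjoint along this axis
        inter *= overlap
    return inter / (volume(a) + volume(b) - inter)


def nms_3d(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any box that overlaps a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou_3d(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [
    (0.0, 0.0, 0.0, 2.0, 2.0, 2.0),
    (0.1, 0.0, 0.0, 2.1, 2.0, 2.0),  # near-duplicate of the first box
    (5.0, 5.0, 5.0, 6.0, 6.0, 6.0),
]
scores = [0.9, 0.8, 0.7]
kept = nms_3d(boxes, scores)
```

Here the second box overlaps the first almost entirely and is suppressed, while the distant third box is kept.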
|
|
Image Toonification
Guide: Prof. Biplab Banerjee, Research Project
Code
Slides
Translated real-life images into the anime-style image domain using the CartoonGAN network.
Initialised the generator with an image abstraction technique employing DoG and bilateral filters to get better results.
|
|
Controlling Epidemics and Economic Activity in Interacting Communities
Guide: Prof. D. Manjunath, Supervised Research Exposition
Code
Summary
We consider an SAIR model between two communities and try to maximise their economic activity. The problem is practically relevant given the COVID-19 pandemic,
where the government and each individual face a tradeoff: an individual wants to maximise economic gain without being exposed so much that they get infected. For a simpler case, we assume two interacting communities: a sparse, rich community C1 and a densely populated, poor community C2.
|
|
Efficient Neural Machine Translation
Course: CS626 - Speech, Natural Language Processing and the Web
Code
Report
NMT model based on the RNNsearch model, with minimal parameters and training time on the Multi30K dataset, achieving a decent BLEU score compared to a standard Transformer.
|
|
Image Inpainting using the Deep Image Prior
Course: GNR638 - Machine Learning for Remote Sensing-II
Code
Implementation of the paper "Deep Image Prior". Exploited the inherent reluctance of a CNN to fit a noisy image when started from uniform noise, which removes the need for an explicit prior term,
and reconstructed the original image in a zero-shot fashion. Produced excellent results even when 80% of the pixels were removed from the original image.
|
|
Adversarial Reprogramming of Neural Networks
Course: CS663 - Digital Image Processing
Code
Report
Implementation of the paper "Adversarial Reprogramming of Neural Networks". Computed a single adversarial perturbation, added to all test inputs, to reprogram an ImageNet classification model for CIFAR-10.
|
|
Pipeline Processor IITB RISC
Course: EE309 - Microprocessors
Code
Report
16-bit, 6-stage pipelined processor based on the Little Computer Architecture. Executes 15 instructions with single- and double-wide fetch execution.
|
|
Maze Solver
Course: CS747 - Foundations of Intelligent and Learning Agents
Code
Report
Finds the shortest path from a given start point to multiple end points in a maze using the Value Iteration algorithm.
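A minimal sketch of value iteration on a grid maze is given below. It assumes deterministic moves, a reward of -1 per step and a discount factor below 1. The maze encoding (`#` wall, `S` start, `G` end point) and all function names are illustrative and are not taken from the course code.

```python
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def value_iteration(grid, gamma=0.95, tol=1e-9):
    """Return the optimal value of every free cell; end points are terminal with value 0."""
    values = {
        (r, c): 0.0
        for r, row in enumerate(grid)
        for c, ch in enumerate(row)
        if ch != "#"
    }
    goals = {s for s in values if grid[s[0]][s[1]] == "G"}
    while True:
        delta = 0.0
        for s in values:
            if s in goals:
                continue
            nxt = [values[(s[0] + dr, s[1] + dc)]
                   for dr, dc in MOVES if (s[0] + dr, s[1] + dc) in values]
            if not nxt:
                continue  # isolated cell with no legal move
            new = -1.0 + gamma * max(nxt)  # Bellman optimality backup
            delta = max(delta, abs(new - values[s]))
            values[s] = new
        if delta < tol:
            return values


def greedy_path(grid, values, start):
    """Follow the highest-valued neighbour from `start` until an end point is reached."""
    path = [start]
    while grid[path[-1][0]][path[-1][1]] != "G":
        if len(path) > len(values):
            raise ValueError("no end point reachable from start")
        r, c = path[-1]
        options = [(r + dr, c + dc) for dr, dc in MOVES if (r + dr, c + dc) in values]
        path.append(max(options, key=values.get))
    return path


maze = [
    "S.#G",
    ".##.",
    "...G",
]
values = value_iteration(maze)
path = greedy_path(maze, values, start=(0, 0))
```

With several end points, each cell's value reflects its nearest end point, so the greedy path leads to the closest one without running a separate search per end point.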
|
|
Parse Trees Converter
Course: CS626 - Speech, Natural Language Processing and the Web
Code
Report
A tool to convert a constituency parse tree to a dependency parse tree and vice versa.
|
|
Emotion TV
Institute Technical Summer Project
Code
Report
Predicts the mood of a person with additional features of turning on appropriate music and creating a caricature of the person
|