I graduated from the Indian Institute of Technology (IIT) Delhi with a degree in Mathematics and Computing, and I am currently working as a Principal Data Scientist at VMock India Pvt. Ltd. With an adaptive mind and a broad perspective, I love facing challenging problems and looking for creative solutions. I am currently pursuing my passion for technology and the arts. My dream is to be a part of a technological breakthrough that has the potential to impact billions of lives.
Kshitiz Sharma
kshitiz [dot] sharma [at] gmail [dot] com
Career Analytics Platform powered by AI, revolutionizing the candidate-employer-university dynamics. Delivering a comprehensive suite of services including career coaching, instant feedback, intelligent hiring tools, smart editors, and career analytics. Empowering clients with a one-stop platform for unparalleled career success.
Managed cross-functional Data Science teams (15 Data Scientists & Data Analysts) across business verticals, providing technical and stakeholder management and contributing to research and development of data-driven solutions. Mentored teams working on diverse data science problems, including Ranking, Search, Retrieval, Analytics, Named Entity Recognition (NER), Large Language Models (LLMs), Natural Language Generation (NLG), Grammatical Error Correction (GEC), Computer Vision, Recommendation Systems and Ontologies.
Restructured Data Science teams and streamlined the machine learning development cycle using MLflow, AWS SageMaker, and EKS, resulting in 20-40% improvements in lead and cycle times.
Implemented containerization of machine learning applications, CI/CD pipelines, Application Performance Monitoring (APM) and code inspection tools, load metrics, and QA automation for fault-tolerant and scalable systems, reducing infrastructure costs by over 50% for data science teams.
Mentored the team in developing an in-house speech transcription pipeline with VAD (Voice Activity Detection), ASR (Automatic Speech Recognition), and Forced Alignment for word-level timestamps, saving a few thousand USD per month on third-party APIs and achieving a 29% reduction in word error rate.
Strategized new initiatives such as BestMatch for intelligent hiring with explainable FitScores, and SMART Interviews for in-person interview practice featuring both verbal and non-verbal feedback.
Developed a chatbot for VMock using OpenSearch as a vector database and the GPT-3.5 API, with a conversational flow similar to LangChain's.
Supervised a team of 3 people, overseeing all research, development, reviews and deployment for the Resume Parser, which powers the flagship Resume Product used in more than 130 countries and at leading educational institutions such as CMU, Stanford, Chicago Booth, University of Oxford, Columbia, Kellogg, and INSEAD.
Launched multilingual parsing capability, solving the Reverse Translation and Token Alignment problem using unsupervised IBM Model 1 and HMM models.
Created an in-house Joint Machine Translation and Token Alignment system with a Transformer-based encoder-decoder model, achieving a BLEU score of 44 and an Alignment Error Rate (AER) of 7.
Engineered a highly efficient event-driven task-queue architecture for the Resume Parser, leveraging tools like Celery, Redis, and SNS, significantly reducing infrastructure costs and job fail rates.
Implemented a high-speed N-gram super-section classifier achieving 81% top-1 accuracy and 95% top-3 accuracy.
Fine-tuned encoder LMs (BERT, RoBERTa) using PyTorch and Hugging Face for classification and entity recognition, achieving a 5-12% improvement over BiLSTM-CRF on various datasets.
Improved and maintained a robust resume parser that converts unstructured resume data into structured data and extracts useful information using NLP techniques.
Trained in-house BiLSTM-CRF based NER models with Character Embeddings and an Attention Mechanism, delivering high recall and precision (>92% F1 score) for recognizing entities such as positions, organizations, locations, universities, degrees, and section titles.
Executed the migration of a complex codebase of ~50,000 lines from Python 2 to Python 3, ensuring compatibility and eliminating deprecated features, resulting in enhanced performance and maintainability.
Implemented a machine learning model using a VGG16 ConvNet to extract image features and an LSTM-based recurrent layer for label predictions, achieving a 5-6% improvement over the company's best model.
Designed a text and vector similarity algorithm to filter Instagram images based on user text queries and hashtags, incorporating a third-party image tagging API and user-created hashtags for enhanced accuracy.
Developed an advanced autonomous agent using computer vision and deep reinforcement learning techniques, including Q-Learning algorithms and experience replay for real-world object tracking.
Utilized a drone and realistic physics simulations to track a moving car, leveraging a customized version of Microsoft AirSim Simulator for training and testing.
Evaluated object detection and tracking models, including region-proposal-based R-CNN classification models (Fast R-CNN, Faster R-CNN) as well as regression-based models such as YOLO for detection.
Utilized detection models and LSTMs to extract features, generate heatmaps, and handle occlusions for robust tracking in sequences/videos.
TensorFlow & Keras • PyTorch
ONNX • DeepSpeed • Hugging Face
NVIDIA Triton • OpenCV
Python • C/C++ • Java • OCaml
CUDA • OpenMP/MPI • LaTeX • ONNX.js
Golang • Android Application Development
MLflow • DVC • Kubernetes
Celery • Redis • AWS SQS • Git
AWS ECS • AWS Fargate • Docker
AWS SageMaker • AWS OpenSearch • New Relic
SQLAlchemy • MongoDB
HTML • CSS • PHP • JS • jQuery
Laravel • Bootstrap • Web2py
Jekyll • GitHub Pages
Adobe Photoshop • After Effects • Macromedia Flash
Autodesk Maya • Inventor • AutoCAD
B.Tech. Project, Autonomous Object Tracking with UAVs
Summer Research Student, Object Detection and Tracking for Driverless Cars
Web Management Coordinator, Alumni Affairs & International Programmes (AAIP)
Fine Arts and Crafts Club (FAC) Representative, Zanskar Hostel
Hockey and Athletics