• Hi! I'm Kshitiz Sharma

    Data Scientist and Innovator.

  • I am a working professional pursuing my passion for Technology and Arts.

    With an adaptive mind and a broad perspective, I love facing challenging problems and looking for elegant solutions.

  • Take a look at my CAREER PATH and some of MY CREATIVE WORK or GET IN TOUCH.

    My work ranges from technical engineering projects to mind-refreshing artworks.

ABOUT ME


I graduated from the Indian Institute of Technology (IIT), Delhi with a degree in Mathematics and Computing, and I currently work as a Principal Data Scientist at VMock India Pvt. Ltd. With an adaptive mind and a broad perspective, I love facing challenging problems and looking for creative solutions. I am currently pursuing my passion for Technology and Arts. My dream is to be part of a technological breakthrough that has the potential to impact billions of lives.

CONTACT DETAILS

Kshitiz Sharma
kshitiz [dot] sharma [at] gmail [dot] com

Experience

Relevant Work and Research Experiences

  • July 2018 - Present

    VMock India Pvt. Ltd., Gurgaon, IN

    Career Analytics Platform powered by AI, revolutionizing the candidate-employer-university dynamics. Delivering a comprehensive suite of services including career coaching, instant feedback, intelligent hiring tools, smart editors, and career analytics. Empowering clients with a one-stop platform for unparalleled career success.

    Principal Data Scientist (September 2021 - Present)

    Managed cross-functional Data Science teams (15 Data Scientists and Data Analysts) across business verticals, providing technical and stakeholder management and contributing to the research and development of data-driven solutions. Mentored teams working on diverse data science problems, including Ranking, Search, Retrieval, Analytics, Named Entity Recognition (NER), Large Language Models (LLMs), Natural Language Generation (NLG), Grammatical Error Correction (GEC), Computer Vision, Recommendation Systems and Ontologies.

    • Restructured Data Science teams and streamlined the machine learning development cycle using MLflow, AWS SageMaker and EKS, resulting in 20-40% improvements in lead and cycle times.

    • Implemented containerization of machine learning applications, CI/CD pipelines, Application Performance Monitoring (APM) and code inspection tools, load metrics, and QA automation for fault-tolerant and scalable systems, reducing infrastructure costs by over 50% for data science teams.

    • Mentored the team in developing an in-house speech transcription pipeline with Voice Activity Detection (VAD), Automatic Speech Recognition (ASR), and Forced Alignment for word-level timestamps, saving a few thousand USD per month on third-party APIs and achieving a 29% reduction in word error rate.

    • Strategized new initiatives such as BestMatch, for intelligent hiring with explainable FitScores, and SMART Interviews, for in-person interview practice featuring both verbal and non-verbal feedback.

    • Developed a chatbot for VMock using OpenSearch as a vector database and the GPT-3.5 API, implementing a conversational process similar to LangChain's.

    Senior Data Scientist (December 2019 - September 2021)

    Supervised a team of 3 people, overseeing all research, development, reviews and deployment for the Resume Parser, which powers the flagship Resume Product used in more than 130 countries and at leading educational institutes such as CMU, Stanford, Chicago Booth, University of Oxford, Columbia, Kellogg and INSEAD.

    • Launched multilingual parsing capability, solving the Reverse Translation and Token Alignment problem using unsupervised IBM Model 1 and HMM models.

    • Created an in-house joint Machine Translation and Token Alignment system with a Transformer-based encoder-decoder model, achieving a BLEU score of 44 and an Alignment Error Rate (AER) of 7.

    • Engineered an efficient event-driven task-queue architecture for the Resume Parser using Celery, Redis, and SNS, significantly reducing infrastructure costs and job failure rates.

    • Implemented a high-speed N-gram super-section classifier achieving 81% top-1 accuracy and 95% top-3 accuracy.

    • Fine-tuned encoder LMs (BERT, RoBERTa) using PyTorch and Hugging Face for classification and entity recognition, achieving a 5-12% improvement over BiLSTM-CRF on various datasets.

    Data Scientist (July 2018 - December 2019)

    Improved and maintained a robust resume parser that converts unstructured data into structured data and extracts useful information using NLP techniques.

    • Trained in-house BiLSTM-CRF based NER models with character embeddings and an attention mechanism, delivering high recall and precision (>92% F1 score) in recognizing entities such as positions, organizations, locations, universities, degrees and section titles.

    • Executed the migration of a complex codebase of ~50,000 lines from Python 2 to Python 3, ensuring compatibility and eliminating deprecated features, resulting in enhanced performance and maintainability.

  • May 2017 - July 2017

    Artifacia, Bangalore, IN

    Research Intern (ML)

    • Implemented a machine learning algorithm using a VGG16 ConvNet to extract image features and an LSTM-based recurrent layer for label prediction, achieving a 5-6% improvement over the company's best model.

    • Designed a text and vector similarity algorithm to filter Instagram images based on a user's text query and hashtags, incorporating a third-party image tagging API and user-created hashtags for enhanced accuracy.

  • May 2016 - December 2018

    Indian Institute of Technology (IIT), Delhi, IN

    B.Tech. Project: Autonomous Object Tracking with UAVs

    • Developed an advanced autonomous agent using computer vision and deep reinforcement learning techniques, including Q-Learning algorithms and experience replay for real-world object tracking.

    • Utilized a drone and realistic physics simulations to track a moving car, leveraging a customized version of Microsoft AirSim Simulator for training and testing.

    Summer Research Intern: Object Detection and Tracking for Driverless Cars

    • Evaluated object detection and tracking models, including region-proposal-based R-CNN classification models (Fast R-CNN, Faster R-CNN) and regression-based models such as YOLO for detection.

    • Utilized detection models and LSTMs to extract features, generate heatmaps, and handle occlusions for robust tracking in sequences/videos.

TECHNICAL SKILLS

MACHINE LEARNING FRAMEWORKS

TensorFlow & Keras • PyTorch
ONNX • DeepSpeed • Hugging Face
NVIDIA Triton • OpenCV

PROGRAMMING LANGUAGES

Python • C/C++ • Java • OCaml
CUDA • OpenMP/MPI • LaTeX • ONNX.js
Golang • Android Application Development

PLATFORM, TOOLS AND LIBRARIES

MLflow • DVC • Kubernetes
Celery • Redis • AWS SQS • Git
AWS ECS • AWS Fargate • Docker
AWS SageMaker • AWS OpenSearch • New Relic
SQLAlchemy • MongoDB

WEB, VFX, 3D, DESIGN

HTML • CSS • PHP • JS • jQuery
Laravel • Bootstrap • Web2py
Jekyll • GitHub Pages
Adobe Photoshop • After Effects • Macromedia Flash
Autodesk Maya • Inventor • AutoCAD

Creative Portfolio

Highlights

Sketching

Free-hand Sketching

3D Modelling

Autodesk Maya

House Day Decoration

Themed Event

Animation and Video Editing

Adobe Flash and Video Editors

Web Design and Development

Jekyll, HTML, CSS, JavaScript

Education

Relevant Educational Experiences

  • 2014 - 2018

    Indian Institute of Technology (IIT), Delhi, IN

    B.Tech. in Mathematics and Computing

    • B.Tech. Project, Autonomous Object Tracking with UAVs

    • Summer Research Student, Object Detection and Tracking for Driverless Cars

    • Web Management Coordinator, Alumni Affairs & International Programmes (AAIP)

    • Fine Arts and Crafts Club (FAC) Representative, Zanskar Hostel

    • Hockey and Athletics

Contact Me

Social Links