About Me

An aspiring data scientist actively looking for full-time opportunities in data science and analytics. At present, I'm pursuing a master's degree in Information Systems at the University of Texas at Arlington.

You can download a printable version of my resume here.

Experience

Graduate Research and Teaching Data Intern

2016
University of Texas, Arlington

• Mentoring students on analytics coursework.

• Experiment design, data collection and processing for research studies.

• Exploratory data analysis to extract preliminary insights with the Big Data framework Apache Spark.

• Statistical analysis to establish relationships among factors, and predictive modeling of student performance.

• Report and interactive dashboard development with Tableau.

User Assistance Specialist

2012 - 2015
SAP

• Documented and maintained user assistance (UA) deliverables for SAP SaaS solutions to help consultants and end users implement and work with the solutions.

• Developed product scope documents for solutions based on customer requirements and ensured feedback was appropriately addressed and included in the product documentation backlog.

• Facilitated the development and release of SAP Business ByDesign, SAP Cloud for Travel and Employee Central SaaS solutions.

• Worked closely with Software Architects, User Interface Designers, Developers and QA to translate high-level requirements into User Stories and review their implementation.

• Defined product scope for solutions based on business requirements, and ensured feedback was appropriately addressed.

• Designed wireframes and architected a complete system for the Production Module, refining the business workflows and improving the user interface in an Agile software development environment.

• Performed system, acceptance and usability testing of the solutions to ensure that the deliverables were consistent with requirements.

• Demonstrated the inner workings of key engineering modules and their interplay to partners and customers, in effect pitching the capabilities of our design.

• Supported SAP consultants, customers and business partners with product-specific clarifications before and after Go-Live.

• Performed ad hoc analysis and reporting to support management's BI needs.

• Recommended and implemented the use of reusable text blocks, which reduced translation costs by 15%.

Research

A Time Series Analysis of Bitcoin and Crude Oil Price - The analysis involved univariate and bivariate analysis to study the relationship between crude oil and Bitcoin prices. The univariate analysis studied the effect of Bitcoin's past prices on its future values using ARDL models. The multivariate analysis estimated causality among the variables and modeled the relationship accordingly. A breakpoint model was incorporated to capture the high volatility in the price of Bitcoin over the years.
NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and Cassandra - The research consolidated the interpretation of NoSQL systems on the basis of performance, scalability and data aggregation, and compared the types of NoSQL databases based on their implementation and maintenance.
Enterprise Process Integration - The research subjects were selected from a large pool, considering their job roles at their respective companies (mostly MNCs) and their experience with enterprise systems. They answered a questionnaire to help us understand the business process reengineering and maintenance methodologies adopted by their firms, and how prepared their management and other influencers were for a process change in their organization. The findings were then compared to understand how firms and their management handled the changes without causing discomfort to their employees.
Career Opportunities in Business Analytics - What does it take to get there? - The research identified the job prospects in business and data analytics in the years to come. Of the various analytics-based job streams, the analysis focused on Data Science, Portfolio Analytics, Security Analytics, CRM Analytics and Online Marketing Analytics. The research sheds light on the options available to the general public for securing an opportunity in these fields. The details include the best institutions that offer relevant courses, the mean salary range over the years, the influencers in these areas whom an aspirant can follow, and some information on the nature of the jobs.

Projects

Over the past few years, I have worked on the following academic and pet projects. They are predominantly in the fields of data science, machine learning and big data analytics. For work samples, click the respective titles.

Revenue and Cost Optimization Regression Modeling for a Burger Joint - Price and cost optimization statistical models were developed from the sales and operations data. The models estimated and forecast revenue and supported sensitivity analysis to estimate margin and price elasticity across products and services. [Tools: Excel, R, SAS]
Business Intelligence Report Generation for a Bike Manufacturer - Master and transaction data were loaded into an InfoCube structure (SAP). SAP BusinessObjects and Tableau were used to track business metrics and generate dashboards for management to visualize and evaluate performance. [Tools and Technologies: SAP, SAP NetWeaver, Tableau, Plot.ly, BEx Analyzer, Excel, SAP Lumira]
Music Artist Recommendation on Yahoo! Music Data - PySpark application that recommends music artists users might like to listen to. [Tools and Technologies: Apache Spark, Databricks CE, MLlib, SparkSQL]
Iris Flower Type Prediction - PySpark binary classification model based on the logistic regression algorithm to predict the type of the Iris flower. [Tools and Technologies: Apache Spark, Databricks CE, MLlib]
Analyze On-time Performance of US Domestic Flights - Hadoop MapReduce application to report the maximum departure delay for each originating airport, the average arrival delay by flight number and the minimum arrival delay for all origin-destination airport combinations. [Tools and Technologies: MapReduce, Cloudera Hadoop Distribution, HDFS, VMware, Excel]
Yahoo! Answers Text Validation - A Python data application to validate the accuracy of the best answer selected by Yahoo! Answers. [Tools and Technologies: Python, R, NLTK, Gensim, LDA, scikit-learn]
Twitter Text Analysis on US Presidential Candidates - A Python-based natural language processing application to perform sentiment analysis on US presidential candidates. [Tools and Technologies: Python, Tweepy API, matplotlib, NLTK]
Marketing Strategy Analysis for Apple's iPhone 7 - Based on our research, we proposed the following recommendations to Apple Inc.: market the iPhone 7 to India's IT sector, expand to other industries in India, and expand to other markets and the US. Market research tools such as ladder analysis, SWOT analysis, shareholder models, position maps, brand association maps, value leadership analysis and interviews were used to devise the strategies.

Side Projects

Real Estate Price Prediction Flask Web Application - Flask web application to predict the house price based on a machine learning model. The decision tree model was trained to predict the price, with its hyperparameters fine-tuned. The model was finally deployed on the Heroku cloud platform. [Tools and Technologies: Python, Pandas, NumPy, Pickle, Heroku, HTML, CSS, Git, Flask, scikit-learn, matplotlib]
Analyzing the Stock Market Data - Analysis of Apple, Amazon and Google stock market data from Yahoo! Finance using the quantmod R package.
Twitter Bot - A Twitter bot application that auto-replies to and auto-posts tweets.

Skills & Proficiency

Python & PySpark

SQL & SparkSQL

R & Microsoft Excel

Stata, SAS & EViews

Tableau & SAP Lumira

Scikit-learn, Pandas, Gensim, TextBlob & NLTK

D3.js, Matplotlib & ggplot

IPython Notebook & PyCharm

Courses

  • Advanced Statistical Methods
  • Data Warehousing and Business Intelligence
  • Principles of Data Mining
  • A Programming Approach To Data Science
  • Applied Business and Economics Data Analysis
  • Enterprise Resource Planning
  • Big Data Analytics
  • Applied Time Series Analysis
  • Introduction to Operations Research/Management Science
  • Information Systems Project Management
  • Management of Information Technologies
  • Marketing Management