Posted on: 2018-07-15
Summary
Data Analyst with 4 years of experience and data science expertise in machine learning, data analytics, Python, data visualization tools, and SQL programming. Strong SQL programming skills, with experience creating databases and converting and loading raw data into structured form. Strong experience creating data analysis stories and visualization reports using Python and pandas. Growing knowledge of, and current work in, time series data analysis.
Technical Skills
Machine Learning: Linear Regression, Logistic Regression, Classifiers (Random Forest, Decision Tree, AdaBoost, Multinomial Naïve Bayes, SVM), Time Series Prediction, ARIMA
Imbalanced-learn: SMOTE
Statistical Methods: Hypothesis testing (Chi-square contingency test)
Programming Languages: Python, SQL; Python packages such as pandas, scikit-learn (libraries for classifiers, model evaluation, metrics, cross-validation, feature importance), NumPy, SciPy, Matplotlib, Seaborn
Data: Data cleaning, Data wrangling, Data visualization (using Tableau), SAS
Version Control: GitHub, Microsoft Visual SourceSafe (VSS)
Development Tools: Microsoft Visual Studio 2010, SQL Server Management Studio 2008/2012, Fiddler, MiniProfiler
Operating Systems: Windows, UNIX
Microsoft Technologies: C#, ASP.NET Framework, ADO.NET
Databases: Microsoft SQL Server 2008/2012, Oracle 11g
*Microsoft Certified Technical Associate and Oracle Certified Associate
Projects
AUTO-TICKET RESOLUTION SYSTEM May 2018 – June 2018
This project involves automating some of the processes of the organization’s service ticketing system where root cause can be automatically inferred from the ticket data.
Created a text analysis model using Multinomial Naïve Bayes, Random Forest, and Decision Trees. Performed feature selection using SVM (Support Vector Machine) and feature engineering using Bag of Words and TF-IDF (Term Frequency – Inverse Document Frequency).
USED CAR DATABASE Feb 2018 – May 2018
Worked on a public dataset from eBay Germany containing over 370,000 used car records.
Created statistical machine learning models (Linear Regression, Lasso, Ridge, and Random Forest) to predict pricing from used car features. Built a Random Forest model with an R-squared value of 0.83. Also developed a feature importance table listing the weight of each major feature on pricing.
GitHub Link: https://github.com/Sneharani143/Capstone-Project-2---Used-Car-Database
IBM HR ANALYTICS: EMPLOYEE ATTRITION Dec 2017 - Feb 2018
Worked on a public dataset from Kaggle created by IBM scientists. The dataset size was 48 KB.
Produced statistical results to uncover the factors that lead to employee attrition. Used Python and machine learning models (Logistic Regression, Random Forest, Decision Tree, and AdaBoost) and addressed the imbalanced-class problem with SMOTE, which synthesizes new minority-class samples by interpolating between existing minority instances and their nearest neighbors. Built a logistic regression model with a recall score of 0.72 to help management make better decisions regarding employee attrition.
GitHub Link: https://github.com/Sneharani143/CapstoneProject_HR-attrition/blob/master/Capstone_Project_1.ipynb
WEALTH-OF-NATIONS-VERSUS-LABOR-PARTICIPATION Oct 2017 - Nov 2017
Worked on World Bank data covering countries around the world. This project combines two datasets: World Development Report 2013 and Wealth of Nations. The combined dataset size was 300 KB. Performed EDA on features grouped by Income Group, Net Wealth, and Women Participation. Developed statistical inferences on how the wealth of nations and the labor force participation of women vary.
GitHub Link: https://github.com/Sneharani143/Wealth-Of-Nations-versus-Labor-Participation/blob/master/Capstone_Project_Data Story.ipynb
Experience
Independent Contractor
Data Analyst / Data Administrator June 2016 – May 2017
Thomson Reuters
Data Analyst / Programmer July 2015 - May 2016
Aroha Technologies
Software Engineer March 2015 - July 2015
Inube
Programmer Feb 2015 - March 2015
PALLE Technologies
Intern July 2014 - Dec 2014
Education
Bachelor of Engineering in Computer Science
Visvesvaraya Technological University – JSSATE (Bangalore) 2010-2014
Relevant Courses: Programming, Mathematics, Database
Springboard 2018
Data Science Career Track Program
Relevant Courses: Python, SQL, R, Data Visualization, Machine Learning