Data Analyst with years of experience and data science expertise in machine learning, data analytics, Python, data visualization tools, and SQL programming. Strong SQL skills, with experience creating databases and converting and loading raw data into structured form. Strong experience building data analysis stories and visualization reports using Python and pandas. Growing knowledge of time series data analysis.
Machine Learning: Linear Regression, Logistic Regression, Classifiers (Random Forest, Decision Tree, AdaBoost, Multinomial Naïve Bayes, SVM), Time Series Prediction, ARIMA
Statistical Methods: Hypothesis testing (Chi-square contingency test)
Programming Languages: Python, SQL; Python packages: pandas, scikit-learn (classifiers, model evaluation, metrics, cross-validation, feature importance), NumPy, SciPy, Matplotlib, Seaborn
Data: Data cleaning, Data wrangling, Data visualization (using Tableau), SAS
Version Control: GitHub, Microsoft Visual SourceSafe (VSS)
Development Tools: Microsoft Visual Studio, SQL Server Management Studio, Fiddler, MiniProfiler
Operating Systems: Windows, UNIX
Microsoft Technologies: C#, ASP.NET Framework, ADO.NET
Databases: Microsoft SQL Server /, Oracle g
*Microsoft Certified Technical Associate and Oracle Certified Associate
AUTO-TICKET RESOLUTION SYSTEM May – June
This project involves automating some of the processes of the organization’s service ticketing system where root cause can be automatically inferred from the ticket data.
Created a text analysis model using Multinomial Naïve Bayes, Random Forest, and Decision Trees. Performed feature selection using an SVM (Support Vector Machine) and feature engineering using Bag of Words and TF-IDF (Term Frequency – Inverse Document Frequency).
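The TF-IDF and Multinomial Naïve Bayes combination described above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the project's actual code: the toy tickets, the root-cause labels, and the function names `train_ticket_classifier` and `predict_root_cause` are all hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy tickets and root-cause labels (not data from the real project).
TICKETS = [
    "database connection timeout error",
    "sql query slow database lock",
    "cannot login password expired",
    "account locked reset password",
    "printer not printing paper jam",
    "laptop screen broken hardware fault",
]
LABELS = ["database", "database", "access", "access", "hardware", "hardware"]


def train_ticket_classifier(texts, labels):
    """Fit a TF-IDF vectorizer followed by a Multinomial Naive Bayes classifier."""
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(texts, labels)
    return model


def predict_root_cause(model, text):
    """Return the predicted root-cause label for a single ticket text."""
    return model.predict([text])[0]
```

Wrapping both steps in one pipeline keeps the vocabulary learned at training time tied to the classifier, so new tickets are vectorized the same way at prediction time.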
USED CAR DATABASE Feb – May
Worked on a public dataset from eBay Germany which has over , used car records.
Built statistical machine learning models (Linear Regression, Lasso, Ridge, and Random Forest) to predict price from used-car features. The Random Forest model achieved a good R-squared value of .. Also developed a feature importance table that lists the weight of each major feature on pricing.
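A feature importance ranking of the kind described above might be produced along these lines. This is a sketch on synthetic data, assuming scikit-learn's `RandomForestRegressor`; the feature names, the price formula, and the helper names `make_synthetic_cars` and `fit_and_rank` are invented for illustration and do not come from the eBay dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the real dataset's columns are not shown in the source.
FEATURES = ["age_years", "mileage_km", "noise"]


def make_synthetic_cars(n=300, seed=0):
    """Generate made-up car data where price depends mostly on age."""
    rng = np.random.default_rng(seed)
    age = rng.uniform(0, 15, n)
    mileage = rng.uniform(0, 200_000, n)
    noise = rng.normal(0, 1, n)  # irrelevant column, should rank last or near last
    price = 20_000 - 1_500 * age - 0.02 * mileage + rng.normal(0, 500, n)
    return np.column_stack([age, mileage, noise]), price


def fit_and_rank(X, y, seed=0):
    """Fit a random forest; return held-out R-squared and features sorted by importance."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    r2 = model.score(X_test, y_test)  # score() is R-squared for regressors
    ranking = sorted(
        zip(FEATURES, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return r2, ranking
```

Calling `fit_and_rank(*make_synthetic_cars())` returns the R-squared on the held-out split together with a list of `(feature, importance)` pairs whose importances sum to 1.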
Github Link: https://github.com/Sneharani/Capstone-Project----Used-Car-Database
IBM HR ANALYTICS: EMPLOYEE ATTRITION Dec - Feb
Worked on a public dataset from Kaggle which was created by IBM scientists. Size of the dataset was KB.
Produced statistical results to uncover the factors that lead to employee attrition. Used Python and machine learning models (Logistic Regression, Random Forest, Decision Tree, and AdaBoost) and addressed the class imbalance with SMOTE, which synthesizes new minority-class samples by interpolating between a minority instance and its nearest minority-class neighbors. Built a logistic regression model with a good recall score of . to help management make better-informed decisions regarding employee attrition.
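The interpolation idea behind SMOTE can be illustrated with a simplified NumPy-only sketch. This is not the imbalanced-learn implementation and not the project's code; the function name `smote_like_oversample` and its parameters are hypothetical, and it assumes the minority class has more than `k` samples.

```python
import numpy as np


def smote_like_oversample(X_minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating toward near neighbors.

    Simplified SMOTE-style sketch: assumes len(X_minority) > k.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)

    # Pairwise Euclidean distances between minority samples.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is never its own neighbor

    # Indices of the k nearest minority neighbors of each sample.
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # Pick a random base sample and one of its neighbors for each new point.
    base = rng.integers(0, len(X), size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Place the new point at a random position on the segment between them.
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[chosen] - X[base])
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies. In a real workflow, oversampling would be applied to the training split only, so that the evaluation data contains no synthetic rows.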
WEALTH-OF-NATIONS-VERSUS-LABOR-PARTICIPATION Oct - Nov
Worked on a World Bank dataset with data from countries around the world. This project combines two different datasets: World Development Report and Wealth of Nations. Combined size of the dataset was KB. Performed EDA on features grouped by Income Group, Net Wealth, and Women Participation. Developed statistical inference on how the wealth of nations and the labor force participation of women vary.
Data Analyst / Data Administrator June – May
Data Analyst / Programmer July - May
Software Engineer March - July
Programmer Feb - March
Intern July - Dec
Bachelor of Engineering in Computer Science
Visvesvaraya Technological University – JSSATE (Bangalore) -
Relevant Courses: Programming, Mathematics, Database
Data Science Career Track Program
Relevant Courses: Python, SQL, R, Data Visualization, Machine Learning