Data Engineer

Posted on: 2024-01-18

 

MURIEL BANZE

1140 S Wabash Ave, Chicago, Illinois, 60605 | (585) 957-2936 | [email protected]

 

TECHNICAL SKILLS

Programming Languages: Python, SQL, SAS, R

Libraries: pandas, NumPy, scikit-learn, Matplotlib, Beautiful Soup, ggplot2, Shiny

Databases: MySQL, Oracle, Snowflake, NoSQL (MongoDB, Neo4j)

Software Applications: Amazon Web Services (AWS), Confluence, Kinesis, SqlDBM, MySQL Workbench, MuleSoft, Tableau, Pentaho, Datadog, Jira, Docker, Git, Power BI

 

WORK EXPERIENCE

Foundry, a Digital Currency Group Company, Rochester, NY                                             Aug. 2022 – Oct. 2023

Data Engineer I

  • Recognized by CNBC for delivering global geolocation-based data insights, showcasing expertise in data collection, analysis, and visualization in Tableau, in coordination with the marketing, compliance, and legal teams.
  • Spearheaded API-driven data acquisition and high-volume data ingestion on AWS, ingesting 5 TB of data per hour into JSON and Parquet formats and reducing ingestion time by 30%.
  • Applied dimensional modeling to design star and snowflake schemas for data warehousing using SqlDBM.
  • Created API solutions using MuleSoft to expose data to external clients, streamlining data integration processes.
  • Architected data infrastructure and optimized data retrieval by crafting SQL queries for tables, views, materialized views, and streams in Snowflake.
  • Built serverless architecture on AWS with Lambda functions, Kinesis Firehose streams, SQS, and SNS to process and transmit data efficiently for data products.
  • Orchestrated ETL processes using AWS Glue crawlers, jobs, and workflows, automating data transformations, managing the Athena data catalog, writing project documentation, and ensuring data quality.
  • Established robust data governance, QA, data integrity, monitoring, and alerting systems with Datadog, proactively identifying and addressing system performance issues.

 

Foundry, a Digital Currency Group Company, Rochester, NY                                             Jan. 2022 – Aug. 2022

Data Engineer Co-op

  • Designed ETL workflows to extract data from the Avalanche blockchain via REST APIs, transform it into key financial data essential for Foundry's staking operations, and store it in a MySQL database.
  • Created Tableau dashboards visualizing insights into blockchain transactions, addresses, and rewards.
  • Drove profitability analysis for Foundry's staking teams, using Excel, SAS, and SQL to analyze rewards and ROI.
  • Achieved 95% data accuracy and reliability through rigorous unit testing to validate data pipelines.
  • Configured GitLab CI/CD pipelines to enforce quality control and automate code integration and deployment.

 

Adcrack Media Private Limited, Bengaluru, India                                                                Sep. 2018 – Nov. 2018

Test Engineer Intern

  • Collaborated with product managers and application developers to plan portal functionality, gather client requirements, and design reports for advertising publication.
  • Conducted user acceptance testing, achieving a 90% improvement in account maintenance and client experience.
 

ACADEMIC PROJECTS

Dimensional Data Modeling for ‘The Product Company’                                                   Aug. 2021 – Dec. 2021

  • Designed relational star schema enterprise data models for efficient data warehousing, processing, and analysis.
  • Automated data extraction, transformation, and loading using the ETL toolsets Pentaho, SAS, and MySQL Workbench.
  • Enhanced data accuracy and historical reporting by implementing Slowly Changing Dimensions (SCD).

 

Machine Learning for Substance Use Disorder Treatment Success Prediction                  Aug. 2021 – Dec. 2021

  • Conducted data analysis and fine-tuned the model through hyperparameter tuning, achieving 90% accuracy.
  • Implemented the Random Forest algorithm on healthcare data with 80k+ records and 30 patient characteristics.
  • Created Tableau dashboards to visualize drug consumption methods and interpret recovery timelines, supporting data-driven decisions.

 

EDUCATION

Rochester Institute of Technology, Rochester, NY                                                                                Aug. 2022

Master of Science, Information Technology and Analytics                                                                         GPA: 3.73

KLS Gogte Institute of Technology, Belgaum, KA, India                                                                        June 2018

Bachelor of Engineering, Computer Science