Wanyi H
About
Wanyi H is from Los Angeles, California, United States. Wanyi works in the following industries: "Marketing & Advertising", "Higher Education", "Non-profit Organization Management", "Financial Services", "Commercial Real Estate", and "Chemicals". Wanyi is currently Analytics Manager at VaynerMedia in Los Angeles, California, United States. In Wanyi's previous role as a Senior Data Analyst at VaynerMedia, Wanyi worked in Los Angeles, California, United States until Jul 2021. Before that, Wanyi was a Data Analyst at VaynerMedia in Los Angeles, California. Prior to joining VaynerMedia, Wanyi was a Data Analyst at Dunn-Edwards Corporation, based in the Greater Los Angeles Area, from Aug 2018 to Jan 2020. Wanyi started working as a Data Science Mentor at PeopleSpaceOC in the Orange County, California Area in Jul 2018. From Mar 2018 to Jun 2018, Wanyi was a Data Analyst at GENRICH Family Office, based in Anaheim Hills, CA. Prior to that, Wanyi worked on a Fake News Classification Model project at University of California, Irvine - The Paul Merage School of Business, based in the United States, from Mar 2018 to Jun 2018. In Mar 2018, Wanyi also worked on a Twitter Spatial Analysis - Roseanne project at University of California, Irvine - The Paul Merage School of Business in the Orange County, California Area.
Wanyi H's current jobs
Analytics Manager at VaynerMedia (Los Angeles, California, United States)
Wanyi H's past jobs
Data Analyst at Dunn-Edwards Corporation (Greater Los Angeles Area, Aug 2018 to Jan 2020)
• Built statistical models (the Beta Geometric/Negative Binomial Distribution (BG/NBD) model and the Gamma-Gamma model) in Python and SQL to calculate Customer Lifetime Value and determine how to allocate resources to retain customers (see the sketch after this list).
• Extracted multi-dimensional datasets with over 60 million rows from multiple data sources (a MySQL database, an SAP database, and CSV files), transformed and loaded the data into Power BI, and designed a star-schema data model to build a mathematical model that calculates the optimal price for each paint product, recommending that price to the sales reps.
• Scraped project data via the SharePoint API and established efficient, automated Power BI reports with key metrics to support senior management.
• Conducted a cohort analysis in Python to find the relationship between different marketing campaigns and customer acquisition.
• Used RFM (Recency, Frequency, Monetary Value) analysis to segment customers based on their transaction data, optimizing the marketing team's email campaigns.
• Improved the accessibility and usability of customer data by deploying data visualization techniques in Power BI, including statistical graphs.
• Applied a statistical model (ARIMA) in R to forecast annual sales of the top 10 paint products.
• Analyzed transaction data across different classification groups in Python to recommend product bundles to customers, achieving more than 5% higher sales than historical performance.
• Used ARIMA, linear regression, and XGBoost models to predict sales of the most popular paint products.
• Compared customer purchasing behavior (frequency, recency, and product mix) before and after a complaint using SQL and Python.
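A minimal sketch of the CLV workflow described above, using the open-source lifetimes package, which implements both the BG/NBD and Gamma-Gamma models; the transaction file and column names here are assumptions for illustration, not the actual Dunn-Edwards schema:

```python
import pandas as pd
from lifetimes import BetaGeoFitter, GammaGammaFitter
from lifetimes.utils import summary_data_from_transaction_data

# Hypothetical transaction log: one row per purchase (customer_id, date, amount).
transactions = pd.read_csv("transactions.csv", parse_dates=["date"])

# Collapse raw transactions into the summary columns the models expect:
# frequency, recency, T (customer age), and monetary_value.
summary = summary_data_from_transaction_data(
    transactions, customer_id_col="customer_id",
    datetime_col="date", monetary_value_col="amount",
)
repeaters = summary[summary["frequency"] > 0]  # Gamma-Gamma needs repeat buyers

# BG/NBD models purchase frequency and dropout; Gamma-Gamma models spend per purchase.
bgf = BetaGeoFitter(penalizer_coef=0.001)
bgf.fit(repeaters["frequency"], repeaters["recency"], repeaters["T"])
ggf = GammaGammaFitter(penalizer_coef=0.001)
ggf.fit(repeaters["frequency"], repeaters["monetary_value"])

# 12-month CLV per customer; rank to decide where retention budget goes.
clv = ggf.customer_lifetime_value(
    bgf, repeaters["frequency"], repeaters["recency"], repeaters["T"],
    repeaters["monetary_value"], time=12, discount_rate=0.01,
)
print(clv.sort_values(ascending=False).head(10))
```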
Data Science Mentor at PeopleSpaceOC (Orange County, California Area, since Jul 2018)
• Instructed 10+ students in data science topics: SQL (MySQL), ETL, automated data reporting in Tableau, statistics, and machine learning.
• Guided students in cleaning data of different types and conducting statistical analysis in Python and Tableau for the final Sun Country Airlines (SCA) case project.
• Led students in validating and interpreting business insights and actionable recommendations to help SCA executives achieve their business objectives.
Data Analyst at GENRICH Family Office (Anaheim Hills, CA, Mar 2018 to Jun 2018)
• Conducted predictive analysis in Excel (pivot tables) to forecast cash flows and calculate the IRR (Internal Rate of Return) of different investment options, e.g. hard money loans and real estate investments (a worked sketch follows this list).
• Used Google Analytics to identify potential customers and improved user engagement (page views, session duration) through A/B testing.
• Extracted customer data and loaded it into Tableau to create dashboards with key metrics and KPIs to support senior leadership's decision making.
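The same cash-flow and IRR calculation sketched in Python with the numpy-financial package rather than Excel; the cash-flow figures below are made up purely for illustration:

```python
import numpy_financial as npf

# Hypothetical cash flows: initial outlay followed by annual returns.
hard_money_loan = [-100_000, 12_000, 12_000, 12_000, 112_000]  # coupons, principal back in year 4
real_estate = [-100_000, 6_000, 6_500, 7_000, 130_000]         # rent plus sale proceeds

# IRR is the discount rate that sets the net present value of a stream to zero.
for name, flows in [("hard money loan", hard_money_loan), ("real estate", real_estate)]:
    print(f"{name}: IRR = {npf.irr(flows):.2%}")
```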
Twitter Spatial Analysis - Roseanne (Project), University of California, Irvine - The Paul Merage School of Business (Mar 2018)
• Extracted 5.24 GB of Twitter data in JSON format with Python via the Twitter Streaming API to record netizens' real-time discussion of Roseanne's racist tweet, and uploaded the data to an Amazon S3 bucket.
• Loaded the data from Amazon S3 into an EMR cluster and used a Zeppelin notebook to transform the JSON files into DataFrames, then cleaned and joined the separate DataFrames with PySpark and SQL.
• Ran a MapReduce-style word count in a Zeppelin notebook with PySpark to compare frequent keywords across cities, then loaded the results into Tableau for geo-mapping analysis showing which areas were hot on the topic at different times (a PySpark sketch follows this list).
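A condensed PySpark sketch of the keyword-by-city count; the tweet layout is simplified to the text and place.full_name fields the Streaming API emits, and the S3 path is a placeholder:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("roseanne-keywords").getOrCreate()

# Each line of the streamed dump is one tweet as a JSON object.
tweets = spark.read.json("s3://example-bucket/roseanne/*.json")  # placeholder path

# Keep geotagged tweets and split their text into lowercase word tokens.
words = (
    tweets
    .where(F.col("place.full_name").isNotNull())
    .select(F.col("place.full_name").alias("city"),
            F.explode(F.split(F.lower("text"), r"\W+")).alias("word"))
    .where(F.length("word") > 3)  # crude noise/stop-word filter
)

# MapReduce-style aggregation: count each (city, word) pair, most frequent first.
counts = words.groupBy("city", "word").count().orderBy(F.desc("count"))
counts.show(20, truncate=False)
```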
• Performed sentiment analysis on 25,000 IMDB movie reviews in Python using NLTK, scikit-learn, and NumPy (a scikit-learn sketch follows this list).
• Compared a rule-based approach, supervised machine learning approaches (Logistic Regression and Random Forests), and an unsupervised machine learning approach (K-means clustering).
• Contrasted the machine learning models through error analysis of false positives and false negatives.
• Improved model accuracy from 50% to 89.73% and used ELI5 to examine the most significant features.
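A minimal scikit-learn sketch of the supervised approach named above (TF-IDF features feeding Logistic Regression); the review file and its columns are placeholders, not the actual project data:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical layout: one review per row with a 0/1 sentiment label.
reviews = pd.read_csv("imdb_reviews.csv")  # columns: text, label
X_train, X_test, y_train, y_test = train_test_split(
    reviews["text"], reviews["label"], test_size=0.2, random_state=42)

# TF-IDF turns raw text into sparse word-weight vectors for the classifier.
model = make_pipeline(
    TfidfVectorizer(stop_words="english", ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))
# False positives vs. false negatives feed the error-analysis step.
print(confusion_matrix(y_test, pred))
```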
• The existing model was suboptimal for pricing the risk of insurance policies; assessed it, determined that its feature space was restricted, and conducted feature engineering and feature selection exercises.
• Assessed the effects of geographical features: explored features using heat maps and probability density functions, and examined the policyholder demographic distribution and the mortality rate and its distribution by state of residence.
• Used Tableau to develop insights into the geographical influence.
• Conducted the analysis of a >1 GB dataset in R and SQL and generated a mortality summary of 10 years of data.
• Tuned new Generalized Linear Models (GLMs), improving the Actual/Expected (A/E) ratio from 96.47% to 98.64% (a GLM sketch follows this list).
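The project itself was done in R; purely as an illustration, here is an equivalent Python sketch of fitting a mortality GLM and computing an Actual/Expected ratio with statsmodels, over a made-up policyholder table:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical table: one row per policy-year with a 0/1 death indicator.
policies = pd.read_csv("policy_years.csv")  # columns: died, age, sex, state

# Logistic GLM of mortality on demographics plus the geographical feature (state).
glm = smf.glm(
    "died ~ age + C(sex) + C(state)",
    data=policies,
    family=sm.families.Binomial(),
).fit()
print(glm.summary())

# Actual/Expected ratio: observed deaths over deaths the model predicts.
expected = glm.predict(policies).sum()
actual = policies["died"].sum()
print(f"A/E ratio: {actual / expected:.2%}")
```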
• Set out to understand the relationship between student diet and academic performance, with the goal of suggesting food choices that improve academic outcomes.
• Used linear regression to infer relationships between behavioral and physical metrics and GPA (a sketch follows this list).
• Conducted a questionnaire-based assessment of food preferences.
• Applied Natural Language Processing (NLP) techniques with Python's NLTK, such as word-frequency analysis, and visualized food preferences with a word cloud.
• Recommended food bundles to food suppliers that would drive an increase in academic performance.
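A small statsmodels sketch of the GPA regression; the survey file and the predictor names are assumptions for illustration only:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical survey export: one row per student.
survey = pd.read_csv("student_survey.csv")
# assumed columns: gpa, fruit_servings, fast_food_per_week, breakfast_days, exercise_hours

# Ordinary least squares: which behavioral/physical metrics move GPA?
ols = smf.ols(
    "gpa ~ fruit_servings + fast_food_per_week + breakfast_days + exercise_hours",
    data=survey,
).fit()
print(ols.summary())  # coefficients, p-values, and R-squared for inference
```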
• Identified factors that contribute to electricity peak load and used cluster analysis to help utility companies reduce peak load.
• Visualized the data and explored the relationship between electricity use and weather in Tableau, and applied Weka to cluster households into three groups according to their electricity-consumption behavior (a clustering sketch follows this list).
• Analyzed the causes of electricity peak load and recommended four action plans to the utility company.
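The clustering was done in Weka; a Python equivalent using scikit-learn's KMeans is sketched below, assuming a table of average hourly consumption per household (file and column names are placeholders):

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical table: one row per household, columns hour_0 .. hour_23 holding
# the household's average consumption (kWh) in each hour of the day.
usage = pd.read_csv("household_hourly_usage.csv", index_col="household_id")

# Standardize so clusters reflect the shape of the load curve, not its scale.
profiles = StandardScaler().fit_transform(usage)

# Three behavioral groups, matching the Weka result described above.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(profiles)
usage["cluster"] = kmeans.labels_
print(usage.groupby("cluster").mean().round(2))  # average load curve per group
```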
• Calculated the economic impact of Science Park to help it develop a better business strategy.
• Developed analytical methods in Excel (VLOOKUP) to calculate Science Park's 2015 economic impact from annual lease records, helping Science Park improve its business strategy (a pandas equivalent of the lookup step is sketched below).
• Designed a questionnaire to obtain annual revenue data and presented the results and recommendations to Science Park's Board of Directors.
• Compiled a detailed report describing the services provided for IBM.
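The Excel VLOOKUP step maps each lease record to tenant attributes before aggregating; in pandas the same join-then-aggregate is a merge, sketched here over made-up files and columns:

```python
import pandas as pd

# Hypothetical inputs: a lease ledger plus a tenant lookup table (the VLOOKUP range).
leases = pd.read_csv("lease_records_2015.csv")  # columns: tenant_id, annual_rent
tenants = pd.read_csv("tenant_directory.csv")   # columns: tenant_id, industry, employees

# merge() plays the role of VLOOKUP: pull tenant attributes onto each lease row.
joined = leases.merge(tenants, on="tenant_id", how="left")

# Aggregate to an economic-impact style summary by industry.
impact = joined.groupby("industry").agg(
    total_rent=("annual_rent", "sum"),
    total_jobs=("employees", "sum"),
)
print(impact.sort_values("total_rent", ascending=False))
```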