Houda - Data Scientist SAS
Ref : 200811C003-
57070 METZ
-
Data Scientist, Data Analyst (46 years old)
-
Fully mobile
-
Employed through an umbrella company (portage salarial)
Work Experience
10 2021 – 07 2022: Machine Learning Engineer | OpenClassrooms:
- Design a food recommendation system using multivariate analysis techniques.
- Optimize classification models (XGBoost, AdaBoost, Random Forest, SVM, KNN) for building energy consumption prediction.
- Segment customers of an e-commerce site and monitor segment stability over time using RFM features and unsupervised algorithms (k-means, DBSCAN, GMM).
- Automatically categorize questions using supervised and unsupervised models (LDA, Gensim, OneVsRest classifier) and NLP (TF-IDF, Word2Vec, USE, BERT, Hugging Face).
- Deploy a tag prediction application via Flask and Heroku.
- Classify images using Deep Learning (CNN, VGG).
- Deploy an image prediction application with Streamlit and Gradio.
01 2021 – 06 2021: Statistical consultant | Elmhurst College USA
Project: Mathematical modeling of cancer using gene therapy
- Find the conditions for an effective therapy by determining the important parameters for the prediction of the model.
- Use the Monte Carlo method for sensitivity analysis (Python's SALib library).
07 2020 – 12 2020: Statistical consultant | ZIP Sénégal
Project: Development of machine learning algorithms for predicting water quality from data collected through IoT sensors.
- Deploy machine learning models (KNN, SVM, CNN) and evaluate their performance using R² and RMSE.
08 2017 – 06 2020: Head of Statistical Analysis | CDG 54/In-Pact GL
Project: Implementation of data collection tools and production of an analytical
plan for the processing and analysis of surveys for a sample of more than 500
communities
- Impute missing data and create scores.
- Perform multivariate analyses (PCA, multiple linear regression, ANOVA, nonparametric tests).
Project: Mentoring of a Master 2 intern in “Work Psychology”:
- Perform sentiment analysis on audio interviews using natural language processing (feature engineering, corpus).
- Use classification models (SVM, KNN).
Project: Development of three types of insurance tables (Kaplan-Meier estimator,
Whittaker-Henderson, competing risks, multi-state Markov model in survival analysis).
06 2016 – 10 2016: Statistician | LASER Analytica
- Development of statistical methods for the analysis of clinical trial data with new approaches.
- Data analysis using SAS and R.
- Quantitative analysis of the QLQ-C30 questionnaire.
- Imputation of missing data, PCA, GLM, non-parametric tests.
Role
Data scientist
Project
Water quality prediction using machine learning methods
Statistical tools
RStudio (tseries, forecast, tidyverse, ggplot2), Python (pandas, sklearn, svm, numpy, ARIMA, SARIMA, torch)
Mission
Real-time data points were collected by IoT devices and cleaned using box plot analysis for outlier detection.
To identify dependent variables, I performed a correlation analysis to extract possible relationships between the parameters and computed the Pearson correlation coefficient matrix.
I built a deep neural network architecture using Keras and TensorFlow for water quality forecasting (ARIMA, convolutional neural network (CNN), long short-term memory (LSTM), k-nearest neighbors (KNN), and support vector machines (SVM)).
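The cleaning and correlation steps of this mission could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sensor columns (ph, turbidity, temperature), the sample readings, and the helper names are all hypothetical, and the 1.5 × IQR rule is assumed as the usual box plot whisker criterion.

```python
import math
import statistics

def iqr_bounds(values, k=1.5):
    """Return the box plot whisker limits (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outliers(rows, column, k=1.5):
    """Keep only the rows whose value in `column` lies within the whiskers."""
    low, high = iqr_bounds([r[column] for r in rows], k)
    return [r for r in rows if low <= r[column] <= high]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(rows, columns):
    """Pairwise Pearson coefficients as a nested dict: matrix[a][b]."""
    series = {c: [r[c] for r in rows] for c in columns}
    return {a: {b: pearson(series[a], series[b]) for b in columns}
            for a in columns}

# Hypothetical sensor readings; the last pH value is an obvious outlier.
readings = [
    {"ph": 7.0, "turbidity": 1.0, "temperature": 28.0},
    {"ph": 7.1, "turbidity": 2.0, "temperature": 26.0},
    {"ph": 7.2, "turbidity": 3.0, "temperature": 24.0},
    {"ph": 7.3, "turbidity": 4.0, "temperature": 22.0},
    {"ph": 7.4, "turbidity": 5.0, "temperature": 20.0},
    {"ph": 14.0, "turbidity": 6.0, "temperature": 18.0},
]

cleaned = drop_outliers(readings, "ph")
matrix = correlation_matrix(cleaned, ["ph", "turbidity", "temperature"])
```

Here `cleaned` holds the readings that survive the box plot filter on pH, and `matrix["ph"]["turbidity"]` gives the Pearson coefficient between those two parameters. With pandas, the same matrix would typically come from `DataFrame.corr()`.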
Languages: French (fluent), English (professional), Italian (basic)
Training & Certifications
2019: Talend Big Data Advanced
2017: Cognos
2016: Certification on Information Systems Development and Organization Analysis at Lorraine University
2011: SAS V9
2004 – 2009: Ph.D. in Applied Mathematics at Rouen University
2003 – 2004: Master’s Degree in Stochastic Modeling and Analysis at Rouen University
Qualifications
Data Visualization and Communication
Data mining: neural networks, survival analysis, cluster analysis, classification methods, Bayesian networks
Project Management
Working with unstructured data
Technical Background
Operating Systems: Windows 95/XP, UNIX AIX V6
RDBMS: MySQL, Oracle Database, MongoDB
Reporting Tools: R Markdown, R Shiny, SAS Visual Analytics, Python, Cognos
Web Development: Java, PHP, HTML5, AngularJS
Project management
Financial management: Analysis of project needs, Establishment and proposal of possible scenarios
Project management: Strategic benchmarking, intellectual property, innovation strategy