
Houda Data Scientist SAS

CV No. 200811C003
  • Profile

    Data Scientist, Data Analyst (42 years old)

  • Location

    57070 METZ

  • Mobility: Fully mobile
  • Availability: Currently available
  • Status: Umbrella company employee (portage salarial)
  • Areas of expertise

    Health / Medical, Studies / Development, Energy

Technical skills
SAS
SQL
Education and training

Languages: French (fluent), English (professional), Italian (basic)

Training & Certifications
2019: TALEND BIG DATA Advanced
2017: COGNOS
2016: Certification in Information Systems Development and Organization Analysis at Lorraine University
2011: SAS V9
2004 – 2009: Ph.D. in Applied Mathematics at Rouen University
2003 – 2004: Master’s Degree in Stochastic Modeling and Analysis at Rouen University
Qualifications
Data Visualization and Communication
Data mining: neural networks, survival analysis, cluster analysis, classification methods, Bayesian networks
Project Management
Working with unstructured data
Technical Background
Operating Systems: Windows 95/XP, UNIX AIX V6
RDBMS: MySQL, Oracle Database, MongoDB
Reporting Tools: R Markdown, R Shiny, SAS Visual Analytics, Python, Cognos
Web Development: Java, PHP, HTML5, AngularJS
Project management
Financial management: Analysis of project needs, Establishment and proposal of possible scenarios
Project management: Strategic benchmarking, intellectual property, innovation strategy

Professional experience

Since March 2020: Scientific collaboration
Role
Data scientist
Project
Water quality prediction using machine learning methods
Statistical tools
R Studio (tseries, forecast, tidyverse, ggplot2), Python (pandas, sklearn, svm, numpy, ARIMA, SARIMA, torch)

Mission
Real-time data points were collected by IoT devices and cleaned using box plot analysis for outlier detection.
To identify the dependent variables, I performed a correlation analysis to extract possible relationships between the parameters and computed the Pearson correlation coefficient matrix.
I built deep neural network architectures using Keras and TensorFlow for water quality forecasting. The methods used included ARIMA, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), K-Nearest Neighbors (KNN) and Support Vector Machines (SVM).
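
For illustration only, the box plot cleaning and Pearson correlation steps mentioned in this mission could be sketched in Python as below. This is a minimal sketch under stated assumptions, not the project's actual code: the 1.5 × IQR whisker rule is the usual box plot convention and is assumed here, and the column names (`ph`, `temperature`), the helper names, and all values are hypothetical.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) box plot whisker limits for a sample."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outlier_rows(rows, columns):
    """Keep only rows whose value lies inside the whiskers for every column."""
    bounds = {c: iqr_bounds([r[c] for r in rows]) for c in columns}
    return [
        r for r in rows
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(rows, columns):
    """Pairwise Pearson coefficients as a nested dict: matrix[a][b]."""
    series = {c: [r[c] for r in rows] for c in columns}
    return {a: {b: pearson(series[a], series[b]) for b in columns} for a in columns}


# Hypothetical sensor readings (illustrative values, not project data).
readings = [
    {"ph": 7.0, "temperature": 14.0},
    {"ph": 7.1, "temperature": 14.5},
    {"ph": 7.2, "temperature": 15.0},
    {"ph": 7.3, "temperature": 15.5},
    {"ph": 7.4, "temperature": 16.0},
    {"ph": 12.9, "temperature": 15.2},  # simulated sensor glitch
]
columns = ["ph", "temperature"]

clean = drop_outlier_rows(readings, columns)    # the glitch row is dropped
matrix = correlation_matrix(clean, columns)     # pairwise Pearson coefficients
```

In practice the same two steps are typically done with pandas (quantile-based filtering, then `DataFrame.corr()`); the standard-library version above only makes the arithmetic explicit.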

Since August 2017: Lead Data Scientist at Centre de Gestion de Meurthe-et-Moselle

Role
Lead data scientist
Statistical tools
RStudio, R Shiny, R Markdown, IRaMuTeQ (R Interface for Multidimensional Analysis of Texts and Questionnaires), cartography (GIS), Talend for Business Intelligence, Cognos, SQL

Mission

Collaborating with multiple departments to uncover information hidden in various content and data sources, helping them make better-informed decisions and deliver better results.
Explaining my findings to non-technical audiences and coaching them to address their concerns.
Supervising a Master 2 internship in organizational psychology at Lorraine University. The internship explored perceptions of the work environment among teaching assistants by transforming natural language data into useful features with NLP techniques to feed classification algorithms (tokenization, term-document matrix, cleanNLP, multiple correspondence analysis (MCA, French: AFCM)).
Developing the methodology and collection tools, and producing an analysis plan for processing and analyzing surveys: sample size, correction for non-response.
Building experience tables from the relevant portfolio in order to assess the impact of their possible uses on pricing and reserving. The objective of this project is to build three types of tables: a temporary disability entry table, a temporary disability recovery table, and a transition table to permanent disability.
Analyses were performed in R (Kaplan-Meier estimators, Whittaker-Henderson smoothing, recurrent event analysis).
Managing and processing data for statistical studies on absenteeism and changes in the health status of employees.
Data were extracted from employee management software and the medicine database using SQL and Cognos. Analysis was performed in R (missing data imputation, parametric and nonparametric tests, ANOVA, PCA).
Using data mining to create alert and monitoring indicators: classification, clustering, and association rules.
Predictive analytics on the evolution of departmental territorial public employment (SMART mobility).

November 2016 – July 2017: Scientific collaboration (Lorraine University)

Role
Data Scientist
Project
Management of aquatic resources: The effect of pesticides on water quality
Statistical tools
R

Mission
Evaluation of the effect of a mixture of pesticides on the growth and reproduction of daphnia.
Data analyses were conducted in the R statistical environment (survival analysis, parametric and nonparametric tests, classification trees).

June 2016 – October 2016: LASER ANALYTICA
Role
Statistical analyst
Project
Improved quality of life in metastatic pancreatic cancer patients receiving liposomal irinotecan + 5-FU/LV: post-hoc analysis of phase 3 trial data
Statistical tools
SAS, R, SQL

Mission
Quantitative analysis of the EORTC QLQ-C30 questionnaire.
Management of missing data (multiple imputation, regression, nearest neighbor method, Bayesian inference).
Principal component analysis (PCA) with varimax rotation, and survival analysis.
GLM, mixed models, parametric and non-parametric tests.
Programming of statistical tables and graphs, and validation of results.
Writing of statistical reports.
Bibliographic research and benchmarking.
Writing of research papers and posters.

January 2016 – June 2016: Scientific collaboration
Role
Data Scientist
Project
Effects of cover biomass and soil fertility management practices on weed flora
Statistical tools
R

Mission
Principal component analysis (PCA).
Agglomerative hierarchical clustering (AHC).
Analysis of variance (ANOVA).
GLM, mixed models, supervised and unsupervised learning models.
Writing of a research paper.

September 2012 – October 2014: Elmhurst College, USA
Role
Statistical analyst

Project
Analyzed preclinical cancer data on the combination of oncolytic viruses with radiotherapy and chemotherapy in a phase II
Statistical tools
R, SAS V9

Mission
Organization and direction of the problem-solving strategy
Analysis of life sciences data
Cox model, survival analysis
Estimation of statistical models (linear models, ANOVA, parametric and non-parametric estimation)
Drafting of the detailed technical specifications document (STD)
Development of SAS programs according to the specifications

October 2010 – June 2012: Department of Applied Mathematics, UCO University
Role
Assistant professor
Statistical tools
R, SAS

Mission
Participation in the development of the course model.
Planning of team tasks.
Progress monitoring and estimation of the remaining work.
Supervising Bachelor's and Master's students to prepare graduates for challenging careers in financial services and banking.
Teaching courses and seminars, plus some administrative duties.
Contributing to the identification of project grants and funding opportunities.