Professional Experience
Big Data/Cloud Solution Architect
Société Générale – Hybrid platforms engineering team
10/2019 – Present
Build of a cloud-native data platform (Azure, AWS, on-prem) on top
of Kubernetes (K8s/AKS/EKS, Argo/Helm, Spark, Airflow, Superset,
Zeppelin…)
Build of the on-premise and private cloud big data offer and full
automation of all product implementations (HDP, Cloudera CDP,
Terraform, Ansible, Kubernetes)
Design and implementation of data security and resiliency (data
encryption, high availability, and disaster recovery solutions)
Big Data Solution architect
AXA - Data Innovation Competency Center
03/2017 – 09/2019
Design and implementation of Big Data infrastructures (cluster
design, HA, encryption at rest, encryption in transit, authentication)
Integration of analytics and data science features (Spotfire through
Hive and Impala, JupyterHub, RStudio, Trifacta)
Data flow design & implementation (Flume, Sqoop, Kafka)
BI/Big Data Senior Consultant
La Banque Postale
03/2012 – 02/2018
As big data technical leader, I was in charge of conceiving and
implementing a big data platform offer to move from SAS/DataStage
to Hadoop (Cloudera CDH, Spark, HDFS, Hive, Impala).
I was also in charge of BI architecture issues and level 3
support to dev teams and business users on the reporting and
analytics platform (SAP BO, QlikView, DataStage).
BI / ETL Solution Architect
LMG insurance
04/2010 – 03/2012
Defining the BI roadmap and architecture of the platform
(Informatica 8.1, SAP BO XI, QlikView 9)
Support to business teams and animation of local expert
workshops; audit and performance tuning (Informatica, Oracle,
AIX)
Senior BI Consultant
Capgemini Technology Services
12/2004 – 03/2010
Design and implementation of multiple BI projects for our
customers (EDF, BNP Paribas, BMS, Caisses d'Épargne, Mairie
de Paris) on multiple technology platforms (DataStage, SAP
BO, Oracle, Teradata…)
Level 3 support, audit, and performance tuning
Société Générale
10/2019 – Present
Data Platform Engineer
Hybrid platforms engineering team
As data platform engineer, I was in charge of conceiving and automating a cloud-native data platform
deployed on top of Kubernetes (AWS, Azure, K8s, Spark, Airflow, Trino, Superset, Zeppelin).
RESPONSIBILITIES:
Design, automation, and implementation of the platform
- Design of the platform based on the following principles: cloud native, cloud agnostic, data mesh, open
source, based on cloud managed services for storage and containerization
- Build of the platform based on the following components:
  - Service layer: Spark, Spark Operator, Apache Zeppelin, Apache Superset, Airflow
  - Compute layer: K8s, AKS, EKS
  - Data layer: S3, ADLS, HDFS
  - Observability: Prometheus, Grafana, Loki, Thanos
- Implementation of a CI/CD solution based on Helm charts, Argo CD, Argo Workflows, and Jenkins
- Enhancement of the solution based on project feedback:
  - Cost optimization: autoscaling, spot resources, configuration tuning
  - Scheduling: YuniKorn
  - Security: managed identities, managed certificates, Ingress, private DNS
TECHNOLOGIES:
- Azure (AKS, ADLS, SEP, managed identities, Vault, private DNS, managed certificates)
- AWS (EKS, S3, VPC, service endpoints, EMR containers)
- Kubernetes (Helm charts, workflow templates, Deployments, Ingress, Services, pod binding…)
- Open-source solutions: Spark, Spark Operator, Apache Zeppelin, Apache Superset, Airflow
Big Data/Cloud Senior DevOps
Site Reliability Engineering team – Big Data
As a DevOps and SRE team member, I was in charge of conceiving and automating data infrastructures and
products to move from legacy on-premise HDP clusters to a hybrid big data offer (on-premise and private
cloud).
RESPONSIBILITIES:
Design, automation, and implementation of data infrastructures and products to move from legacy
on-premise HDP clusters to a hybrid data offer (on-premise and private cloud):
- Automation of cluster deployment on the internal private cloud
- Automation of all load-balancing resources using the Private Cloud API (load balancers, VIPs, listeners,
pools, health checks + DNS aliases)
- Automation of SSL/JKS certificate creation & deployment
- Automation of CDP Data Center & Cloudera Management Services installation & deployment (fully
Kerberized and with TLS/SSL enabled)
- Automation of CDP Virtual Private Cluster – Kafka (compute cluster)
- Integration of the new solution with existing cloud platform tools (authentication, monitoring, Vault)
- Automation of configuration changes to the resource plane
- Move of scheduling and monitoring tools (Airflow, Prometheus, Grafana) from on-premise to
Kubernetes
- Training of production teams on the new platform and tools
Design and implementation of data resiliency:
- Disaster recovery study (HDFS, Hive, HBase, and Kafka)
- DR implementation (data replication tooling and orchestration through Airflow)
TECHNOLOGIES:
- Platforms: CDP Data Center 7 (HDFS, YARN, HBase, Hive, Oozie, Hue, Spark, Kafka, Ranger, Ranger
KMS, Knox), HDP 2.6
- DevOps: Terraform 0.13, Ansible, Airflow, GitHub
AXA
04/2017 – 10/2019
Big Data Solution Architect
Big Data Competency Center – Data Innovation Lab, France (Suresnes)
As big data solution architect, I was in charge of conceiving and implementing big data platforms for all
AXA entities, integrating analytics and data science features, and designing and implementing data flows.
RESPONSIBILITIES:
Design and Implementation of Big Data infrastructures
▪ Cluster design and implementation for all AXA entities (3 regions: Europe/Middle East, US,
Asia; 35 entities)
▪ Troubleshooting of day-to-day issues to take the clusters to full production status
▪ YARN design and implementation for multi-tenancy and performance
▪ High Availability implementation for HDFS, HIVE and Impala
▪ Disaster Recovery for HDFS and Hive metadata replication, Kafka MirrorMaker
replication across sites.
▪ Integration of services such as Flume, Kafka, Cloudera Navigator, Spark2
▪ Implementation of security by introducing:
o Kerberos authentication across all cluster services
o LDAP and active directory integration
o Design of user access and multitenancy for Hadoop services based on LDAP groups
o Encryption in transit and encryption at rest implementation
▪ DevOps: configuration management through Ansible, automation through Airflow
Integration of analytics and data science features
▪ Implementation and integration of JupyterHub and RStudio for data science teams
▪ Integration of Spotfire through Impala and Hive JDBC access
▪ Study and PoCs of new capabilities (data wrangling: Trifacta, StreamSets, Apache NiFi)
Data Flow Design & Implementation:
▪ Data ingestion through Sqoop and Flume from source systems: RDBMS (Oracle, MySQL,
Postgres), file sources, Kafka
▪ Hive data structure design (Parquet file format, ORC/transactional tables, partitions, buckets,
and query optimization)
TECHNOLOGIES:
- Cloudera CDH 5
- Hive, Impala, Oozie, Sqoop, Flume, Spark, Kafka
- JupyterHub, RStudio, Spotfire
- Shell scripting, Python, PySpark
La Banque Postale
04/2012 – 03/2017
Big Data & Business Intelligence Technical Leader
Reporting and Analytics Experts team, France (Ivry-sur-Seine)
As big data technical leader, I was in charge of conceiving and implementing a solution capable of ingesting
up to 10 TB of data on a daily basis. I was equally in charge of implementing the first analytics PoCs based
on business requirements (Sqoop, Flume, Hive, Impala, R).
As analytics technical leader, I was in charge of level 3 support to dev teams and business users on
the reporting and analytics platform (SAP BO, QlikView).
RESPONSABILITIES:
▪ Defining the analytics roadmap and architecture of the platform.
▪ Install, deploy, configure, and tune HDP and HDF clusters and Cloudera CDH for a secure and
multi-tenant cluster
▪ Tune YARN depending on the resources available for the cluster
▪ Enable and configure dynamic resource management to run MapReduce, Impala, and
Spark on top of YARN
▪ Enable HDFS and YARN high availability
▪ Configure, tune, and deploy Flume agents
▪ Enable Spark in client and cluster mode on top of YARN
▪ Developing the Ingestion module (Shell scripting, Sqoop, Flume, Hive, Python) for the first use
cases
▪ Integrating machine learning algorithms within the platform (Spark MLlib, R, Python, Talend)
▪ Integrating data visualisation tools within the platform (Qlik Sense and QlikView)
▪ Working closely with business users to gather and verify reporting requirements.
▪ Developing prototypes for SAP BO and QlikView projects
▪ Level 3 support for reporting and analytics tools (SAP BO, QlikView)
▪ SAP BusinessObjects XI administration
▪ Write and review technical knowledge base articles, solutions, and how-to guides for
publication to users and company knowledge system.
TECHNOLOGIES:
- Amazon AWS, Hortonworks HDP, HDF, Cloudera CDH 4, CDH 5
- Hive, Impala, Oozie, Sqoop, Flume, Spark
- QlikView, Qlik Sense, SAP BO XI, Talend
- Shell Scripting, Python, R
La Mutuelle Générale
04/2009 – 03/2012
BI / ETL Solution Architect
Back Office department, France (Vincennes)
BI/DATA VISUALISATION ARCHITECT
▪ Defining the BI roadmap and architecture of the platform.
▪ Design and implementation of BI architecture (SAP BO XI3.1, QlikView 9.0 SR2)
▪ Upgrade from SAP BO V6.1 to BO XI 3.1
▪ Working closely with business users to gather and verify reporting requirements.
▪ Web Intelligence and universe designer training and support
▪ Level 3 support to business teams and animation of local experts workshops
BNP Paribas Assurance
05/2006 – 03/2009
Data Warehouse and BI Senior Consultant
BI services department, France (Rueil)
DESIGN AND IMPLEMENTATION OF THE DWH AND THE RISK DATAMARTS
▪ Develop, enhance and maintain an end-to-end solution for Data warehouse
Operations
▪ Extend DW to incorporate rapidly changing BNP business and growth
▪ Use of Informatica PowerCenter 6.2 for ETL pipelines
▪ Performance tuning (mapping optimisation, sources and targets, sessions,
memory...)
▪ Working closely with business users to gather and verify reporting requirements
▪ Designing, developing, implementing, supporting and documenting Business Objects related
applications
▪ Upgrade from BO 5.1 to BO 6.5
▪ Analysing and resolving performance issues (Oracle, Informatica, BO)
TECHNOLOGIES:
Informatica v6.1, BO v5.1, BO 6.5, Oracle 9i, Unix
Caisses d'Épargne (Banking Services)
11/2005 – 04/2006
Senior ETL/DWH Consultant
Information System department, France (Malakoff)
S’MILES PROJECT
DESIGN AND IMPLEMENTATION OF THE MARKETING DATAMART.
▪ Involvement in the analysis, design, and development of all the interfaces using
DataStage
▪ Creation of job pipelines from various systems (CRM, OLTP) to the transaction history
database (reporting database)
▪ Creation, maintenance and updating of ETL technical documentation.
▪ Member of the development team for shell scripts automating and converting data from
ODS to Data warehouse and data mart.
▪ Analysing and resolving performance issues
TECHNOLOGIES:
DataStage v7.5, Teradata 6.2, Unix
EDUCATION
MSc Engineering degree
Telecom SudParis
SKILLS & COMPETENCES
Logical and analytical abilities
Goal oriented
DevOps approach
Kubernetes
Hadoop ecosystem
Azure
AWS
Terraform
ACHIEVEMENTS & CERTIFICATES
CKAD: Certified Kubernetes Application Developer
AWS Solutions Architect Associate
AWS Data Analytics Specialty
Microsoft Azure Administrator
Azure Data Fundamentals
HashiCorp Terraform Associate
Hadoop Administration Level 2
LANGUAGES
French, English, Arabic