Experience
Big Data Expert/ Freelance, Paris
SINCE JUNE 2023
As a Big Data Expert, I provide consultancy services related to Big Data & Cloud.
Topics: Data Architecture, Data Governance, Data Engineering, Data Warehousing, Data Ingestion, Security, DataOps,
Batch and Streaming Analytics, Monitoring …
Technologies: Cloudera Ecosystem (CDP, CDH, HDP), Hadoop, Spark, NiFi, Ansible, Terraform, AWS, Azure, Elasticsearch,
Kibana, Grafana, Kerberos, Ranger, TLS …
Since June 2023, I have been working with a customer on defining its new Big Data platform (platform architecture,
migration plan, data governance, security …)
Senior Solutions Consultant/ permanent contract, Cloudera, Paris
SEPTEMBER 2021 – APRIL 2023 (1 YEAR AND 7 MONTHS)
As a technical lead, I worked on some of the most exciting distributed data projects at private and public sector organizations, engaging from Proof of Concept (PoC) stages through to production implementation on complex distributed environments.
Below is an overview of some of my activities:
• Translate customers’ requirements into efficient, resilient and secure architectures
• Provide technical expertise on data ingestion, data engineering, data warehousing …
• Install, administer, upgrade, and configure CDP, CDH, and HDP platforms (on-premises and in the cloud)
• Set up high availability on CDP/HDP/CDH services (HDFS, YARN, Hive, Ranger, HBase, Oozie)
• Enable security on CDP/HDP/CDH clusters (Kerberos, LDAP, Ranger, MIT KDC, FreeIPA, Active Directory, TLS, Encryption at Rest)
• Troubleshoot and debug major issues on platform & services
• Audit and tune platform services
• Design disaster recovery architecture (data replication, RTO, RPO)
• Prepare and share engagement reports with customers
• Advise and share best practices with customers
• Lead PoCs (Proofs of Concept) with customers and prospects
Sectors: Banking, Insurance, Telecommunications, Construction, Automotive, Public services
Region: mainly France, but also Spain, Italy and Switzerland
Keywords: CDP/HDP/CDH ecosystem (HDFS, YARN, Hive, Kafka, Hue, SRM, SMM, Spark …), CFM (NiFi, NiFi Registry), Ansible, Microsoft Azure, Amazon Web Services (AWS) …
University Teacher/ contractor, Université Paris 1 Panthéon-Sorbonne, Paris
OCTOBER 2021 – DECEMBER 2021 (1 SEMESTER)
As a temporary teacher, I taught courses on Big Data technologies such as Apache Spark (PySpark) and Elasticsearch/Kibana.
Senior Data Engineer/ permanent contract, SOCIETE GENERALE, Paris
DECEMBER 2016 – AUGUST 2021 (4 YEARS AND 9 MONTHS)
• Built a CloudWatch-like solution from scratch for collecting all types of infrastructure data (logs & metrics) for Société Générale's private cloud
o Defined the architecture (API, ingestion pipeline, data restitution methods)
o Implemented REST APIs for data ingestion and restitution: Flask/Python 3, Gunicorn, nginx, Redis, Celery, PostgreSQL
o Set up the CI/CD pipeline: GitHub, Jenkins, SonarQube, XL Deploy, pytest
o Set up the backend infrastructure: Hadoop, Kafka, NiFi, Elasticsearch, Kibana, Druid, Superset, Grafana
o Collected ~3 TB of data per day
• Code review / Big Data community / innovation
o Reviewed the data ingestion pipeline with team members and other developers
o Animated the Big Data community (bringing together data engineers and data scientists)
o Helped data scientists run their algorithms smoothly on our platform
o Ran PoCs of new tools: Open Distro, Presto, Sherlock for Druid, Burrow, Jolokia
• DevOps
o Built real-time and batch data pipelines using Python scripts, NiFi, Kafka, Spark (PySpark) …
o Collected data from multiple sources (databases, streams, third-party APIs …) using NiFi, Logstash, Sqoop …
o Implemented batch/real-time processing use cases using Apache Spark (PySpark)
o Managed Hadoop/Kafka/Druid/Elasticsearch/NiFi clusters (upgrades, security, troubleshooting …): 200+ nodes
o Built new infrastructure resources (Druid/Elasticsearch …) using Terraform and Ansible
• Participated in the recruitment process by interviewing candidates
Data Engineer/ intern, SOCIETE GENERALE, Paris
APRIL 2016 – OCTOBER 2016 (6 MONTHS)
• Managed the Hadoop cluster (based on Hortonworks Data Platform)
• Implemented new use cases using Apache NiFi (with Python and shell scripts), Apache Pig, Spark and Oozie
• Developed dashboards using Apache Zeppelin, Spark, D3.js, Elasticsearch and Kibana.
Keywords: Hadoop, HDFS, YARN, Spark, Hortonworks Data Platform, NiFi, Pig, Hive, Oozie, ZooKeeper, HA, Sqoop, Zeppelin, Python, Scala, D3.js, Elasticsearch, Kibana, OpenTSDB, Grafana, NoSQL, Shell, Unix, Scrum
Software Engineer/ permanent contract, ZAGS, Tunis
JULY 2014 – AUGUST 2015 (1 YEAR & 2 MONTHS)
• Participated in the Prepared Insurance project – team of 25 people (US, India, Tunisia), Scrum methodology
• Prepared demos for Pearl Holding, Novarica, Malakoff Médéric – team of 10 people (France, Tunisia)
• Integrated SharePoint & Lync into "Zags_suite", a back-office product for insurance companies