- chatbot built with OpenAI’s API and the LangChain framework
- Streamlit app to edit the prompt and pre-prompt
- questions/answers stored in MongoDB for further analysis
- Excel file scraped with AWS Lambda (triggered by cron) and stored in AWS S3
- preprocessing script triggered when a file is uploaded to AWS S3
- visualization app using Streamlit (deployed on AWS EC2)
- GitLab CI/CD with the Serverless Framework to deploy AWS services
- Streamlit app to visualize OCR performance on invoices
- conversion of a REST API to gRPC
- web scraping with Selenium, deployed on AWS Lambda
- GitLab CI/CD with the Serverless Framework to deploy AWS services: Lambda, SQS
Use of Streamlit to create a user interface that lets data scientists inspect an OCR pipeline.
Code refactoring to split a script out into a gRPC microservice.
- fixing and refactoring of Python scripts that parse Excel files on S3 (AWS), transform
them, and insert them into a SQL Server database through SQLAlchemy. Increased API script
speed with multithreading
- Use of AWS Batch to orchestrate script execution.
- daily scraping script hosted on Heroku, with data inserted into MongoDB Atlas and a
dashboard exposed with Flask
- updates to a containerized (Docker) FastAPI API
- Google Sheets API used to generate JSON files for the gifi.fr website configuration
- editing of QuickSight (AWS) dashboards
- conversion of a Flask API to FastAPI, containerized with Docker
- fine-tuning of a BERT deep learning model to perform sentiment analysis in 16 languages
(Hugging Face, TensorFlow 2, PyTorch)
- model metrics tracked and stored in neptune.ai
- use of AWS Lambda and API Gateway to expose APIs
- benchmark of Tesseract versions and hyperparameters to perform OCR on forms
- processing pipeline for EEG data (pandas, NumPy, multiprocessing)
- web app to display data distributions and let the user choose the
outlier range as well as the data scaling (Streamlit)
- Feasibility study on classifying scientific papers with unsupervised and supervised
algorithms