Specialized in ML and NLP, I bring state-of-the-art models from PoC to production.
ML, NLP, LLM, PyTorch, TensorFlow, ONNX, Hugging Face, Spark ML
SkyPilot, DVC, BentoML, ClearML, MLflow
Python, PEP8
Dash, Streamlit, Tableau, Kibana
Docker, GitLab CI, Kubernetes, Helm, Argo CD
Hadoop, Spark, Snowflake, Elasticsearch, Airflow
S3, Lambda, DynamoDB, ECR, EKS...
I'm a Machine Learning Engineer with a strong mathematical background. My path to Data Science began when I realized the vast potential of applying mathematical concepts to real-world challenges.
I honed my skills at Ubisoft, where I spent three years deploying Machine Learning models to detect fraudulent in-game transactions. This experience taught me how to manage end-to-end Machine Learning projects.
I then joined GitGuardian as a Machine Learning Engineer, where we help companies secure their infrastructure through secrets detection in source code, detection of Infrastructure as Code vulnerabilities, and Software Composition Analysis. I work on fine-tuning Large Language Models to detect secrets and on deploying them to our AWS infrastructure. I also built PoCs that automatically replace hardcoded secrets in code with environment variables (comparing the OpenAI API with an AST-based approach, followed by code formatting with a tool like Black).
In the meantime, I develop personal projects. You can also find me playing football every week, going to the gym, reading books, or hanging out with friends in Paris or elsewhere. If you would like to talk about a project, please contact me on LinkedIn or at michael.romagne@gmail.com.
Improving the Secrets Detection Engine by fine-tuning LLMs and deploying them on AWS.
Developed PoCs on automatic remediation of leaked secrets (OpenAI API, AST-based rewriting, and code formatting with Black).
- End-to-end fraud detection project for e-commerce transactions (Ubi Connect and Steam).
Led research tasks (feature engineering, semi-supervised learning) and put MLOps best practices in place (DVC, remote jobs on K8s, ClearML, model inference on AWS).
The project saved 5% of net sales, about 4 million euros per year, compared to the previous fraud detection product.
- Time series forecasting of acquisition, retention, monetization, and vCPU usage on Ubisoft servers. Trained and deployed Generalized Additive Models to improve forecasts.
Developed a Streamlit application that helps regions and cities identify gaps in electric-car charging infrastructure in their area and make decisions accordingly. GitHub repo and Web app.
Research on Digital Twins to optimize IoT Systems. Data Science, Simulation and Monitoring of IoT systems.
NLP on logs from Orange mobile phones and internet boxes to predict churn and customer satisfaction.