Hi, I’m Luca
Hi! I am Luca Rossi, a Machine Learning Researcher based in Paris. My work focuses on the evaluation of generative models, spanning from zero-shot learning to trajectory prediction and, more recently, large language models. I am particularly interested in exploring the capabilities of LLMs and understanding their potential applications, their societal and economic impacts, their limitations, and their risks.
About Me
I completed my Ph.D. in Machine Learning in March 2024. My research primarily focused on developing evaluation methods for generative models. I specialized in assessing the generalizability and robustness of zero-shot learning models within the computer vision domain. My earlier projects explored trajectory prediction and various other applications in computer vision.
Until April 2024, I was part of the ML research team at Giskard, a French startup dedicated to benchmarking and evaluating ML models. My work focused on LLM safety: I fine-tuned models to run safety evaluations, achieving GPT-4-level accuracy in some out-of-distribution scenarios, and I also contributed to Giskard's open-source library.
Selected Projects
Below is a selection of my projects. For a complete list of my projects and publications, refer to my GitHub and Google Scholar profiles.
- Generalizability and robustness in Zero-Shot Learning
- If you tell a child that a zebra is a horse with stripes, they will be able to recognize a zebra even if they’ve never seen one before. Zero-shot learning (ZSL) aims to teach machines to do the same.
- My research introduced novel methods to assess the generalizability and robustness of ZSL models under varied training conditions and semantic space configurations (a minimal attribute-based ZSL sketch appears after this list).
- Read paper
- View code
- Analogy-Based Zero-Shot Learning
- Many ZSL models are trained on synthetic data generated “bottom-up”: a generative model learns to map semantic attributes to features, which are assembled, like Lego blocks, into synthetic images.
- The hypothesis behind this project is that classes share more similarities than their semantic attributes suggest. By leveraging analogies between classes, we can synthesize more realistic data and improve ZSL capabilities (a minimal feature-generation sketch appears after this list).
- View code
- P(IK) - Evaluating LLM calibration
- This project builds upon the work of Anthropic (2022) and Burns et al. (2022) to explore whether LLMs can accurately report their confidence in their own answers, P(IK) (the probability of “I know”).
- A linear probe is trained on hidden activations to predict whether the model knows the answer to a question (a minimal probe is sketched after this list).
- View code
- Human trajectory prediction with LSTMs and GANs
- This project builds upon the work of Alahi et al. (2016) and Gupta et al. (2018) on predicting human trajectories in crowded environments.
- It addresses the problem of multimodality, i.e., the existence of multiple plausible future paths for a given past trajectory, and makes three contributions: more diverse datasets, new evaluation metrics, and new generative models (a toy multimodal sampler is sketched after this list).
- Read paper
- View code
- Crypto prediction market arbitrage bot
- The PancakeSwap exchange offers a prediction market where users can bet on whether the price of a given cryptocurrency will go up or down in the next 5 minutes.
- This bot exploits an arbitrage opportunity by leveraging the wisdom of the crowd when it is not properly reflected in the pool odds (a simplified expected-value check is sketched after this list). This strategy worked in the past, but the market has since corrected the inefficiency. This is for educational purposes only, and using it with real money will result in financial loss.
- View code
- ProM plugins for process repairing
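The sketches referenced above follow. First, the zero-shot learning setup: below is a minimal, generic attribute-based ZSL classifier on synthetic data, meant only to illustrate how unseen classes can be recognized through a shared semantic space. The dimensions, the ridge-regression mapping, and the nearest-neighbour matching are illustrative assumptions, not the evaluation methods introduced in the paper.

```python
import numpy as np

# Toy attribute-based ZSL classifier on synthetic data (illustration only).
rng = np.random.default_rng(0)
n_train, feat_dim, attr_dim = 500, 512, 85       # made-up sizes (image features, class attributes)

X_train = rng.normal(size=(n_train, feat_dim))   # image features from *seen* classes
A_train = rng.normal(size=(n_train, attr_dim))   # attribute vector of each image's class

# Learn a linear map W from visual space to semantic space with ridge regression:
# W = argmin ||XW - A||^2 + lam * ||W||^2
lam = 1.0
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(feat_dim), X_train.T @ A_train)

# At test time, an image from an *unseen* class is matched to the closest
# unseen-class attribute vector (e.g. "a horse with stripes" for zebra).
unseen_class_attrs = rng.normal(size=(10, attr_dim))   # one attribute vector per unseen class
x_test = rng.normal(size=(1, feat_dim))

scores = (x_test @ W) @ unseen_class_attrs.T           # similarity in the semantic space
print("predicted unseen class:", int(scores.argmax()))
```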
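Next, the “bottom-up” generation step behind the analogy-based ZSL project. The sketch below is a toy conditional generator that maps a class's semantic attributes, plus noise, to synthetic visual features; the architecture and dimensions are assumptions for illustration, and the analogy mechanism itself is not shown.

```python
import torch
import torch.nn as nn

# Toy conditional generator: semantic attributes + noise -> synthetic features.
attr_dim, noise_dim, feat_dim = 85, 32, 2048   # made-up dimensions

generator = nn.Sequential(
    nn.Linear(attr_dim + noise_dim, 1024),
    nn.LeakyReLU(0.2),
    nn.Linear(1024, feat_dim),
    nn.ReLU(),                      # CNN features are typically non-negative
)

def synthesize_features(class_attributes: torch.Tensor, n_samples: int) -> torch.Tensor:
    """Generate n_samples synthetic feature vectors for one class."""
    attrs = class_attributes.expand(n_samples, -1)         # (n, attr_dim)
    noise = torch.randn(n_samples, noise_dim)              # diversity within the class
    return generator(torch.cat([attrs, noise], dim=1))     # (n, feat_dim)

# Example: fabricate features for a hypothetical unseen class ("zebra").
zebra_attrs = torch.rand(attr_dim)
fake_zebra_features = synthesize_features(zebra_attrs, n_samples=64)
print(fake_zebra_features.shape)   # torch.Size([64, 2048])
```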
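For the P(IK) project, the probe amounts to a logistic regression on frozen hidden activations. The sketch below uses synthetic activations and labels; which layer and token the activations come from, and all shapes, are assumptions rather than details taken from the repository.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for extracted activations: one hidden-state vector per
# question, labelled 1 if the model answered that question correctly.
rng = np.random.default_rng(0)
n_questions, hidden_dim = 500, 4096

activations = rng.normal(size=(n_questions, hidden_dim))
knows_answer = rng.integers(0, 2, size=n_questions)

X_train, X_test, y_train, y_test = train_test_split(
    activations, knows_answer, test_size=0.2, random_state=0
)

# The probe itself is just a logistic regression on the frozen activations.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

# P(IK): the probe's predicted probability that the model knows the answer.
p_ik = probe.predict_proba(X_test)[:, 1]
print("mean P(IK):", p_ik.mean(), "probe accuracy:", probe.score(X_test, y_test))
```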
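For the trajectory-prediction project, multimodality is commonly handled by conditioning the decoder on noise, so that a single observed past yields several sampled futures. The sketch below follows that generic recipe (in the spirit of Gupta et al., 2018) rather than this project's actual models; every size and architectural choice is made up.

```python
import torch
import torch.nn as nn

# Noise-conditioned LSTM decoder: one observed past -> several sampled futures.
obs_len, pred_len, hidden_dim, noise_dim = 8, 12, 64, 16

encoder = nn.LSTM(input_size=2, hidden_size=hidden_dim, batch_first=True)
decoder = nn.LSTM(input_size=2, hidden_size=hidden_dim + noise_dim, batch_first=True)
to_xy = nn.Linear(hidden_dim + noise_dim, 2)

def predict_future(past_xy: torch.Tensor, n_samples: int = 5) -> torch.Tensor:
    """past_xy: (obs_len, 2) observed positions -> (n_samples, pred_len, 2) futures."""
    _, (h, _) = encoder(past_xy.unsqueeze(0))          # summarize the past: (1, 1, hidden_dim)
    futures = []
    for _ in range(n_samples):
        z = torch.randn(1, 1, noise_dim)               # a different noise draw per sample
        state = (torch.cat([h, z], dim=-1), torch.zeros(1, 1, hidden_dim + noise_dim))
        pos, preds = past_xy[-1].view(1, 1, 2), []
        for _ in range(pred_len):                      # roll the decoder out step by step
            out, state = decoder(pos, state)
            pos = to_xy(out)                           # next (x, y) position
            preds.append(pos)
        futures.append(torch.cat(preds, dim=1))
    return torch.cat(futures, dim=0)

samples = predict_future(torch.randn(obs_len, 2))
print(samples.shape)                                    # torch.Size([5, 12, 2])
```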
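Finally, the expected-value reasoning behind the prediction-market bot. The round structure below mirrors a parimutuel market of the PancakeSwap Prediction kind (winners split both pools minus a fee); the fee, the pool sizes, and the crowd-derived probability estimate are illustrative assumptions, not values used by the bot.

```python
def implied_up_probability(up_pool: float, down_pool: float) -> float:
    """Probability of 'up' implied by the current pool sizes (ignoring fees)."""
    return up_pool / (up_pool + down_pool)

def expected_value_up(bet: float, up_pool: float, down_pool: float,
                      p_up: float, fee: float = 0.03) -> float:
    """Expected profit of betting `bet` on UP, given our own probability estimate p_up."""
    total_after = up_pool + bet + down_pool
    payout_ratio = total_after * (1 - fee) / (up_pool + bet)   # payout per unit staked on UP
    return p_up * (bet * payout_ratio - bet) - (1 - p_up) * bet

# Example: the pools imply ~40% for UP, but a crowd-wisdom signal estimates 55%.
up_pool, down_pool = 4.0, 6.0    # amounts staked on each side (made-up numbers)
p_up_estimate = 0.55             # hypothetical estimate derived from late betting flow

print("implied P(up):", implied_up_probability(up_pool, down_pool))
print("EV of 1 unit on UP:", expected_value_up(1.0, up_pool, down_pool, p_up_estimate))
```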
Connect with Me
I encourage you to reach out if you’re interested in my work or have questions.
Here is a duck: