Human-in-the-loop RL for Behaviour Change


Supervisor - Mina Khan

Members - Advait Rane, P Srivatsa

I am working on Project PAL in the Fluid Interfaces Group at the MIT Media Lab for my undergraduate thesis.

Habit formation can be supported with computer-generated interventions. However, for these interventions to be helpful, they must be personalised and context-specific. For a healthy interaction, the system should adapt to the user quickly, and the user should be able to understand and control its behaviour. Sample efficiency and interpretability are therefore key requirements for the system.

As part of this thesis, we used Reinforcement Learning (RL) to learn the most beneficial interventions. We built an RL environment and designed datasets to test the above desiderata in an RL model. We evaluated different sequence models for learning user behaviour patterns, to make learning sample-efficient, and compared different RL algorithms to find the best choice in terms of sample efficiency and integration of human guidance. To test the models, we used computer-usage data and learned to give calming interventions during computer use. We also built and deployed a Chrome extension that gives calming interventions, based on the user's preferences, while browsing the internet.
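To make the setup more concrete, below is a minimal, hypothetical sketch of the kind of RL environment this involves: at each step the agent observes a summary of recent computer usage and decides whether to show a calming intervention. The class name, state features, reward, and episode length are illustrative assumptions, not the actual environment, data, or reward design used in the thesis.

```python
# Hypothetical sketch only: a Gym-style environment where the agent chooses
# whether to show a calming intervention given a computer-usage state.
# Features and rewards are placeholders, not the thesis's actual setup.
import numpy as np
import gym
from gym import spaces


class InterventionEnv(gym.Env):
    """Toy environment: at each step, choose 0 = skip or 1 = show intervention."""

    def __init__(self, episode_length=100, seed=0):
        super().__init__()
        self.rng = np.random.default_rng(seed)
        self.episode_length = episode_length
        # Observation: a small feature vector summarising recent computer usage
        # (e.g. time on current app, tab switches, typing rate) -- placeholders.
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self.t = 0
        self.state = self._sample_state()

    def _sample_state(self):
        # Placeholder dynamics: in practice the state would come from logged
        # computer-usage data rather than random sampling.
        return self.rng.random(4).astype(np.float32)

    def reset(self):
        self.t = 0
        self.state = self._sample_state()
        return self.state

    def step(self, action):
        # Hypothetical reward: an intervention helps when the first feature
        # (read here as "stress") is high, and mildly annoys the user otherwise.
        stress = self.state[0]
        reward = (stress - 0.5) if action == 1 else 0.0
        self.t += 1
        self.state = self._sample_state()
        done = self.t >= self.episode_length
        return self.state, float(reward), done, {}
```

Any standard agent (for example a contextual bandit or DQN) can then be trained against this interface; the thesis's actual environment, sequence models, and algorithms are described in the document linked below.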

The image below visualises the computer-usage data.

computer usage states

For details, you can refer to my thesis.