I am a research scientist at the IBM Research Europe (Dublin Research Lab) working in the areas of analytics, datamining, machine learning, reinforcement learning, automated decisions, and AI.
I created and maintain the following Julia packages:
14:45 UTC
Unlike in Online RL where agents need to interact with real environment, Offline RL works similar to a typical machine learning workflow. Given a dataset, Offline RL processes data extracting state, action, reward, and terminal columns to optimize the policy Q. By wrapping up offline RL into the AutoMLPipeline workflow, it becomes trivial to search for the optimal preprocessing elements and their combinations to improve Offline RL optimal policy using symbolic workflow manipulation.