The Alignment Problem: Machine Learning and Human Values, by Brian Christian, was just released. This is an extended summary + opinion, a version without the quotes from the book will go out in the next Alignment Newsletter.

This book starts off with an explanation of machine learning and problems that we can currently see with it, including detailed stories and analysis of:

- The failure of facial recognition models on minorities
- The COMPAS controversy (leading up to impossibility results in fairness)
- The neural net that thought asthma reduced the risk of pneumonia

It then moves on to agency and reinforcement learning, covering from a more historical and academic perspective how we have arrived at such ideas as temporal difference learning, reward shaping, curriculum design, and curiosity, across the fields of machine learning, behavioral psychology, and neuroscience. While the connections aren't always explicit, a knowledgeable reader can connect the academic examples given in these chapters to the ideas of specification gaming and mesa optimization that we talk about frequently in this newsletter. Chapter 5 especially highlights that agent design is not just a matter of specifying a reward: often, rewards will do ~nothing, and the main requirement to get a competent agent is to provide good shaping rewards or a good curriculum.