Fine-tuning language models from human preferences

AI safety needs social scientists

Reward learning from human preferences and demonstrations in Atari

AI safety via debate

Deep network guided proof search

DeepMath - Deep sequence models for premise selection

TensorFlow: A system for large-scale machine learning

TensorFlow: Large-scale machine learning on heterogeneous distributed systems

Pentago is a first player win: strongly solving a game using parallel in-core retrograde analysis

A deterministic pseudorandom perturbation scheme for arbitrary polynomial predicates