OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents Paper • 2506.14866 • Published Jun 17, 2025 • 6
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning Paper • 2402.04833 • Published Feb 7, 2024 • 5