πŸ€– My Attempt at ARC-AGI-3

Just my shot at tackling the ARC-AGI-3 Interactive Reasoning Benchmark. Spoiler alert: it's pretty hard! πŸ˜…

What This Is

This is a neural agent I built to try to solve the ls20 pattern in ARC-AGI-3. The challenge is to convert 15s to 3s in specific grid positions. It sounds simple, but the agent has to figure out the pattern and strategy entirely on its own.
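For illustration, the core objective can be sketched as a simple grid transform. This is a hypothetical helper, not the agent's code; the values 15 and 3 come from the task description, and the point of the benchmark is that the agent must discover this rule from interaction rather than have it hard-coded:

```python
# Hypothetical sketch of the ls20 objective: locate cells containing 15
# and convert them to 3. The real agent has to learn this rule itself.

def find_targets(grid):
    """Return the (row, col) positions holding the value 15."""
    return [(r, c) for r, row in enumerate(grid)
            for c, val in enumerate(row) if val == 15]

def convert(grid):
    """Apply the 15 -> 3 conversion everywhere it matches."""
    return [[3 if val == 15 else val for val in row] for row in grid]

grid = [[0, 15, 0],
        [15, 0, 3]]
print(find_targets(grid))  # positions of the 15s
print(convert(grid))       # grid after conversion
```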

The Good News:

  • The agent learned to recognize the 15β†’3 conversion pattern
  • It got decent at picking the right actions (ACTION3 works 92.7% of the time)
  • No more infinite reset loops (that was annoying)
  • Actually understands early vs late game strategy

The Reality Check:

  • Still struggling to complete full levels consistently
  • No fancy video demos (maybe next time?)
  • Evaluation results are... let's just say "work in progress" πŸ“Šβ“
  • It's an honest attempt, not a breakthrough

Quick Usage

from agents.neural_rl_agent import NeuralRLAgent
import torch

# Load the agent and the best checkpoint (map to CPU so it runs without a GPU)
agent = NeuralRLAgent(card_id="your_card", game_id="ls20")
checkpoint = torch.load("goal_neural_agent_best.pt", map_location="cpu")
agent.load_state_dict(checkpoint['model_state_dict'])

# Try it out (your mileage may vary)
action = agent.act(current_state)
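A full episode might look like the loop below. Note that `env` here is a stand-in for whatever wrapper you put around the ARC-AGI-3 API; the `reset`/`step` interface is an assumption for the sketch, not something shipped in this repo:

```python
# Hypothetical rollout loop. `env` is any object exposing reset() -> state
# and step(action) -> (state, done); the real benchmark is played through
# the ARC-AGI-3 API, which is not shown here.

def rollout(agent, env, max_steps=200):
    """Run one episode, returning the final state."""
    state = env.reset()
    for _ in range(max_steps):
        action = agent.act(state)
        state, done = env.step(action)
        if done:
            break
    return state
```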

What I Learned

  • Pattern recognition is tough for RL agents
  • Reward shaping matters A LOT
  • Sometimes the AI finds patterns you didn't expect
  • ARC-AGI-3 is genuinely challenging (respect to the creators)
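To make the reward-shaping point concrete, here is a minimal sketch of the idea: the sparse level-completion signal alone isn't enough, so intermediate progress earns a small bonus. Every constant below is illustrative, not the value used in training:

```python
# Hypothetical shaped reward: a bonus per 15 -> 3 conversion, a small
# per-step penalty, and a large sparse reward on level completion.
# All numbers here are made up for illustration.

def shaped_reward(prev_grid, new_grid, level_done):
    flat = lambda g: [v for row in g for v in row]
    conversions = flat(prev_grid).count(15) - flat(new_grid).count(15)
    reward = 1.0 * max(conversions, 0)  # progress bonus per conversion
    reward -= 0.01                      # small step penalty
    if level_done:
        reward += 10.0                  # sparse terminal reward
    return reward

print(shaped_reward([[15, 0]], [[3, 0]], level_done=False))  # 0.99
```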

Files Included

  • Models: A few different training checkpoints to try
  • Code: The neural architecture and training scripts
  • Analysis: Some pattern analysis I did to understand what works
  • Config: Training setup and hyperparameters

Missing Stuff

This is more of a "here's what I tried" than a "here's the solution." But hey, that's how research works sometimes! πŸ€·β€β™‚οΈ

πŸ“Š Evaluation Results

Quick Stats:

  • 23.4% overall success rate (not bad for ARC-AGI-3!)
  • 8.7% completion rate (still working on this)
  • ACTION3 is the MVP with 92.7% effectiveness
  • 135 successful conversions out of 407 attempts
  • 170x improvement over random baseline (biggest win!)

What Actually Works:

  • Early game: Spam ACTION3 (it's surprisingly effective)
  • Mid game: Mix ACTION3 and ACTION1 strategically
  • Late game: Focus on ACTION1 for cleanup
  • Avoid ACTION0, ACTION5 completely (learned the hard way)
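The phase strategy above could be written out as a simple heuristic policy. The phase boundaries here are assumptions for the sketch; the trained agent learned this behavior rather than following hard-coded rules:

```python
def heuristic_action(step, max_steps=100):
    """Pick an action by game phase, mirroring what the agent learned.
    The 1/3 and 2/3 phase cutoffs are illustrative, not measured."""
    progress = step / max_steps
    if progress < 0.33:      # early game: lean on ACTION3
        return "ACTION3"
    elif progress < 0.66:    # mid game: alternate ACTION3 and ACTION1
        return "ACTION3" if step % 2 == 0 else "ACTION1"
    else:                    # late game: ACTION1 for cleanup
        return "ACTION1"
```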

Agent Demo: the agent in action, converting 15s to 3s in the ls20 pattern.

Performance Dashboard: action analysis showing why ACTION3 is the clear winner.

Want to Improve It?

Feel free to:

  • Add proper evaluation metrics
  • Create video demos of the agent in action
  • Compare with other ARC-AGI-3 approaches
  • Fix whatever I probably broke

Built with PyTorch, caffeine, and stubborn determination to make an AI that can count to 3. β˜•
