My Attempt at ARC-AGI-3
Just my shot at tackling the ARC-AGI-3 Interactive Reasoning Benchmark. Spoiler alert: it's pretty hard!
What This Is
This is a neural agent I built to try to solve the ls20 pattern in ARC-AGI-3. The challenge is to convert 15s to 3s at specific grid positions. It sounds simple, but the agent has to figure out the pattern and the strategy on its own.
The Good News:
- The agent learned to recognize the 15→3 conversion pattern
- It got decent at picking the right actions (ACTION3 works 92.7% of the time)
- No more infinite reset loops (that was annoying)
- Actually understands early vs late game strategy
The Reality Check:
- Still struggling to complete full levels consistently
- No fancy video demos (maybe next time?)
- Evaluation results are... let's just say "work in progress"
- It's an honest attempt, not a breakthrough
Quick Usage
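import torch
from agents.neural_rl_agent import NeuralRLAgent
# Load the agent
agent = NeuralRLAgent(card_id="your_card", game_id="ls20")
# map_location="cpu" lets the checkpoint load on machines without a GPU
checkpoint = torch.load("goal_neural_agent_best.pt", map_location="cpu")
agent.load_state_dict(checkpoint['model_state_dict'])
# Try it out (your mileage may vary)
# current_state is not defined here: pass in the current game state yourself
action = agent.act(current_state)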
What I Learned
- Pattern recognition is tough for RL agents
- Reward shaping matters A LOT
- Sometimes the AI finds patterns you didn't expect
- ARC-AGI-3 is genuinely challenging (respect to the creators)
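To make the reward shaping point concrete, here is a minimal sketch of what a shaped reward for the 15→3 task could look like. It is not the reward used to train the checkpoints in this repo. The function names, the bonus and penalty values, and the "no 15s left means done" rule are all assumptions for illustration.

```python
from typing import List

Grid = List[List[int]]

SOURCE_VALUE = 15  # cell value the task wants converted
TARGET_VALUE = 3   # cell value it should become


def count_conversions(before: Grid, after: Grid) -> int:
    """Count cells that changed from SOURCE_VALUE to TARGET_VALUE in one step."""
    return sum(
        1
        for row_before, row_after in zip(before, after)
        for cell_before, cell_after in zip(row_before, row_after)
        if cell_before == SOURCE_VALUE and cell_after == TARGET_VALUE
    )


def shaped_reward(
    before: Grid,
    after: Grid,
    conversion_bonus: float = 1.0,   # assumed value
    step_penalty: float = 0.01,      # assumed value
    completion_bonus: float = 10.0,  # assumed value
) -> float:
    """Hypothetical shaped reward: a bonus per conversion, a small cost per
    step, and a one-off bonus when the last SOURCE_VALUE cell disappears."""
    reward = conversion_bonus * count_conversions(before, after) - step_penalty
    had_source = any(SOURCE_VALUE in row for row in before)
    has_source = any(SOURCE_VALUE in row for row in after)
    if had_source and not has_source:
        # Assumption: the level counts as finished once no 15s remain.
        reward += completion_bonus
    return reward
```

The per-step penalty is there to discourage stalling, and the completion bonus is there so the agent has a reason to finish a level rather than just collect individual conversions.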
Files Included
- Models: A few different training checkpoints to try
- Code: The neural architecture and training scripts
- Analysis: Some pattern analysis I did to understand what works
- Config: Training setup and hyperparameters
Missing Stuff
- Evaluation results: comprehensive_evaluation_results.json
- Demo visualization: agent_demo_visualization.png
- Performance plots: Action analysis, learning progression, pattern recognition
- Video demonstrations (on the todo list)
- Comparison with other approaches
This is more of a "here's what I tried" than a "here's the solution." But hey, that's how research works sometimes! 🤷‍♂️
Evaluation Results
Quick Stats:
- 23.4% overall success rate (not bad for ARC-AGI-3!)
- 8.7% completion rate (still working on this)
- ACTION3 is the MVP with 92.7% effectiveness
- 135 successful conversions out of 407 attempts
- 170x improvement over random baseline (biggest win!)
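A quick arithmetic pass over the numbers above, as a sanity check. The list does not say which metric the 170x figure refers to, so the implied baseline below assumes it applies to the 23.4% overall success rate. That is a guess, not something the evaluation confirms.

```python
# Figures copied from the Quick Stats list.
successful_conversions = 135
conversion_attempts = 407
overall_success_rate = 0.234
improvement_factor = 170

# Share of conversion attempts that succeeded.
conversion_rate = successful_conversions / conversion_attempts

# Assumption: the 170x improvement is measured on the overall success rate.
implied_baseline = overall_success_rate / improvement_factor

print(f"Per-attempt conversion rate: {conversion_rate:.1%}")  # 33.2%
print(f"Implied random baseline: {implied_baseline:.3%}")     # 0.138%
```

The per-attempt conversion rate (about 33.2%) is not the same number as the 23.4% overall success rate, so the two are presumably measured over different things.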
What Actually Works:
- Early game: Spam ACTION3 (it's surprisingly effective)
- Mid game: Mix ACTION3 and ACTION1 strategically
- Late game: Focus on ACTION1 for cleanup
- Avoid ACTION0, ACTION5 completely (learned the hard way)
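The strategy above can be written down as a hand-coded baseline policy, which is handy for comparing against the neural agent. This is only a sketch. The phase boundaries (30% and 70% of the step budget) and the 50/50 mid-game mix are assumptions, since the notes above don't pin those down, and this is not how the trained agent picks actions internally.

```python
import random
from typing import Dict, Optional

# Action weights per phase. ACTION0 and ACTION5 never appear, so they are never chosen.
ACTION_WEIGHTS: Dict[str, Dict[str, float]] = {
    "early": {"ACTION3": 1.0},
    "mid": {"ACTION3": 0.5, "ACTION1": 0.5},  # assumed 50/50 mix
    "late": {"ACTION1": 1.0},
}


def game_phase(step: int, max_steps: int,
               early_frac: float = 0.3, late_frac: float = 0.7) -> str:
    """Map a step index to a phase name using assumed progress cutoffs."""
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    progress = step / max_steps
    if progress < early_frac:
        return "early"
    if progress < late_frac:
        return "mid"
    return "late"


def heuristic_action(step: int, max_steps: int,
                     rng: Optional[random.Random] = None) -> str:
    """Sample an action name according to the current phase's weights."""
    rng = rng or random.Random()
    weights = ACTION_WEIGHTS[game_phase(step, max_steps)]
    actions = list(weights)
    return rng.choices(actions, weights=[weights[a] for a in actions], k=1)[0]
```

For example, `heuristic_action(0, 100)` always returns `"ACTION3"`, and `heuristic_action(99, 100)` always returns `"ACTION1"`.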
Figure: The agent in action, converting 15s to 3s in the ls20 pattern
Figure: Action analysis showing why ACTION3 is the clear winner
Want to Improve It?
Feel free to:
- Add proper evaluation metrics
- Create video demos of the agent in action
- Compare with other ARC-AGI-3 approaches
- Fix whatever I probably broke
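If you want a starting point for the evaluation metrics item, here is a small sketch that computes per-action effectiveness and completion rate from episode logs. The `EpisodeLog` structure is hypothetical. It is not the format of comprehensive_evaluation_results.json or of anything else in this repo, so adapt it to whatever your runs actually record.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class EpisodeLog:
    """Hypothetical per-episode record (not the repo's actual log format)."""
    steps: List[Tuple[str, bool]]  # (action name, did it produce a conversion?)
    completed: bool                # was the level finished?


def action_effectiveness(episodes: Sequence[EpisodeLog]) -> Dict[str, float]:
    """Fraction of uses of each action that produced a conversion."""
    used: Dict[str, int] = defaultdict(int)
    worked: Dict[str, int] = defaultdict(int)
    for episode in episodes:
        for action, success in episode.steps:
            used[action] += 1
            worked[action] += int(success)
    return {action: worked[action] / used[action] for action in used}


def completion_rate(episodes: Sequence[EpisodeLog]) -> float:
    """Fraction of episodes that finished the level (0.0 if there are none)."""
    if not episodes:
        return 0.0
    return sum(episode.completed for episode in episodes) / len(episodes)
```

Reporting the raw counts next to each rate (for example, uses per action) makes it easier to tell whether a high percentage comes from a handful of tries or from hundreds.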
Built with PyTorch, caffeine, and stubborn determination to make an AI that can count to 3.