
Task Loaders#

Task loaders manage task sampling and distribution for training and evaluation. They provide deterministic, epoch-based iteration with state management for checkpoint recovery.

Core Concepts#

Infinite Mode (infinite=true, default): Tasks loop indefinitely using modulo-based indexing. Task order is fixed for the duration of a run and is never reshuffled at epoch boundaries.

Finite Mode (infinite=false): Tasks are reshuffled at epoch boundaries, enabling proper multi-epoch training with a controlled curriculum.

Epoch: One complete pass through all tasks. In finite mode, epochs trigger automatic reshuffling.
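
The difference is easiest to see in a small sketch (illustrative only, not the library's implementation):

import random

tasks = ["task_0", "task_1", "task_2"]

# Infinite mode: modulo-based indexing; the order never changes.
def next_task_infinite(iteration):
    return tasks[iteration % len(tasks)]

# Finite mode: a fresh shuffle of the task order at each epoch boundary,
# derived deterministically from the seed and the epoch number.
def epoch_order_finite(epoch, seed=42):
    order = list(range(len(tasks)))
    random.Random(seed + epoch).shuffle(order)
    return order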

Configuration#

task_loader:
  _target_: agoge.task_loader.TaskListLoader
  tasks: [...]
  infinite: false  # Enable epoch-based iteration
  seed: 42         # Reproducible task order across epochs
  name: null       # Auto-generated identifier

Parameters#

  • infinite (bool): If true, loop tasks indefinitely. If false, reshuffle at epoch boundaries.
  • seed (int | None): Random seed for task shuffling. Required for reproducibility in finite mode.
  • name (str | None): Loader identifier for logging. Auto-generated if not provided.
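
Outside of Hydra, the loader can also be constructed directly. A minimal sketch, assuming the constructor takes the same keyword arguments as the config keys above:

from agoge.task_loader import TaskListLoader

loader = TaskListLoader(
    tasks=[{"task_id": "task_0", "inputs": {}}],  # task shape as in the loader examples below
    infinite=False,  # finite mode: reshuffle at epoch boundaries
    seed=42,         # reproducible task order across epochs
    name=None,       # auto-generated identifier
)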

Usage Patterns#

RL Training (Infinite Mode)#

# configs/rl.yaml
task_loader:
  infinite: true
  seed: null  # No shuffling needed

num_episodes: 1000  # Stop after N episodes

Tasks loop continuously. Suitable for exploration-heavy training where task order matters less than coverage.
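
As a sketch of how an entrypoint consumes an infinite loader (assuming the loader is iterable; the episode body is a stand-in for the actual training step):

# Assumes `task_loader` was instantiated from the config above.
num_episodes = 1000
task_iter = iter(task_loader)
for episode in range(num_episodes):
    task = next(task_iter)  # never exhausts in infinite mode
    ...                     # run one training episode on `task`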

RL Training (Epoch-based)#

# configs/rl.yaml
task_loader:
  infinite: false
  seed: 42  # Reproducible epochs

num_episodes: 5000  # Will span multiple epochs

With 1000 tasks and 5000 episodes, this triggers 5 epochs with reshuffling between each. Enables curriculum learning and deterministic training.

Evaluation#

# configs/eval.yaml
task_loader:
  infinite: false
  seed: 42

num_episodes: null  # Auto-detect from task loader length

The eval entrypoint auto-detects episode count from len(task_loader) for single-pass evaluation.
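
A sketch of that behaviour, assuming num_episodes: null resolves to None in the entrypoint:

if num_episodes is None:
    num_episodes = len(task_loader)  # exactly one pass over every task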

State Management#

All task loaders support state save/load for checkpoint recovery:

# Save state
state = task_loader.save_state()
# Returns: {
#   "_target_": "agoge.task_loader.base.TaskListLoader",  # Class path for reconstruction
#   "epoch": 2,
#   "iteration": 5,
#   "seed": 42,
#   "name": "TaskListLoader_123456",
#   "infinite": False,
#   "epoch_order": [2, 0, 4, 1, 3],
#   "rng_state": (...),
#   "instantiation_metadata": {"tasks": [...]}
# }

# Restore state into existing loader
task_loader.load_state(state)
# Resumes iteration at exact position with same task order

Full Reconstruction from State#

The saved state includes _target_ (fully qualified class path) and instantiation_metadata (constructor parameters), enabling complete loader reconstruction:

import hydra

# Reconstruct loader from saved state
reconstruction_config = {
    "_target_": state["_target_"],
    **state["instantiation_metadata"],
    "infinite": state["infinite"],
    "seed": state["seed"],
    "name": state["name"],
}
loader = hydra.utils.instantiate(reconstruction_config)
loader.load_state(state)
# Loader continues from exact checkpoint position
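
Until the RL checkpoint integration lands (see the note below), a minimal sketch for persisting the state yourself, assuming the state dict (including rng_state) is picklable; the file name is illustrative:

import pickle

# Write the loader state alongside other checkpoint artifacts.
with open("task_loader_state.pkl", "wb") as f:
    pickle.dump(task_loader.save_state(), f)

# In a fresh process, read it back, then reconstruct and load_state as above.
with open("task_loader_state.pkl", "rb") as f:
    state = pickle.load(f)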

Note: The state management infrastructure is implemented but not yet integrated with the RL checkpoint system.

Available Loaders#

TaskListLoader#

Direct instantiation from a task list.

task_loader:
  _target_: agoge.task_loader.TaskListLoader
  tasks:
    - {task_id: "task_0", inputs: {...}}
    - {task_id: "task_1", inputs: {...}}

FileTaskLoader#

Load tasks from a JSONL file (local or GCS).

task_loader:
  _target_: agoge.task_loader.FileTaskLoader
  path: gs://bucket/tasks.jsonl
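
A sketch of producing such a file, assuming each JSONL line carries the same fields as a TaskListLoader task entry:

import json

tasks = [
    {"task_id": "task_0", "inputs": {}},  # task-specific inputs elided here
    {"task_id": "task_1", "inputs": {}},
]
with open("tasks.jsonl", "w") as f:
    for task in tasks:
        f.write(json.dumps(task) + "\n")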

CRMArenaTaskLoader#

CRMArena benchmark tasks.

task_loader:
  _target_: agoge.task_loader.CRMArenaTaskLoader
  split: train  # or test

OSWorldTaskLoader#

OSWorld benchmark tasks.

task_loader:
  _target_: agoge.task_loader.OSWorldTaskLoader
  split: train  # or test

Epoch Events#

In finite mode, epoch resets are logged at DEBUG level:

logger.debug(
    f"TASK_LOADER_EPOCH_RESET epoch={self._epoch} num_tasks={num_tasks} "
    f"name={self._name} infinite={self.infinite}"
)

Use logging.level_overrides.agoge=DEBUG to capture these events for monitoring long training runs.
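
If you need the same effect programmatically, and assuming the override maps onto the level of the standard "agoge" logger, a minimal sketch:

import logging

logging.getLogger("agoge").setLevel(logging.DEBUG)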

Examples#

Multi-epoch RL Training#

uv run src/agoge/entrypoints/rl.py \
  task_loader=default \
  task_loader.infinite=false \
  task_loader.seed=42 \
  num_episodes=5000

Single-pass Evaluation#

uv run src/agoge/entrypoints/eval.py \
  task_loader=mind2web \
  task_loader.infinite=false \
  task_loader.seed=42 \
  num_episodes=null  # Auto-detect

Infinite Exploration#

uv run src/agoge/entrypoints/rl.py \
  task_loader=default \
  task_loader.infinite=true \
  num_episodes=1000