l5kit.environment.reward module

class l5kit.environment.reward.L2DisplacementYawReward(reward_prefix: str = 'L2DisplacementYaw', metric_set: Optional[l5kit.cle.metric_set.L5MetricSet] = None, enable_clip: bool = True, rew_clip_thresh: float = 15.0, use_yaw: Optional[bool] = True, yaw_weight: Optional[float] = 1.0)

Bases: l5kit.environment.reward.Reward

This class is responsible for calculating a reward based on (1) the L2 displacement error on the (x, y) coordinates and (2) the closest-angle error on the yaw coordinate during closed-loop simulation within the gym-compatible L5Kit environment.

Parameters
  • reward_prefix – the prefix that will identify this reward class

  • metric_set – the set of metrics to compute

  • enable_clip – flag to determine whether to clip reward

  • rew_clip_thresh – the threshold to clip the reward

  • use_yaw – flag to determine whether to penalize the yaw error

  • yaw_weight – weight of the yaw error
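The reward combines the two error terms described above. The following is a minimal standalone sketch of that arithmetic, assuming a negated error formulation with clipping applied to the displacement term; the function name, the per-component dictionary keys, and the exact combination are illustrative assumptions, not l5kit's implementation.

```python
import math

def l2_displacement_yaw_reward(pred_xy, target_xy, pred_yaw, target_yaw,
                               enable_clip=True, rew_clip_thresh=15.0,
                               use_yaw=True, yaw_weight=1.0):
    # (1) L2 displacement error on the (x, y) coordinates
    dist_error = math.hypot(pred_xy[0] - target_xy[0],
                            pred_xy[1] - target_xy[1])
    if enable_clip:
        dist_error = min(dist_error, rew_clip_thresh)

    # (2) closest-angle error on the yaw coordinate:
    # wrap the raw difference into [-pi, pi] before taking magnitude
    yaw_error = 0.0
    if use_yaw:
        diff = (pred_yaw - target_yaw + math.pi) % (2 * math.pi) - math.pi
        yaw_error = yaw_weight * abs(diff)

    # reward is the negated total error; individual components
    # are returned alongside the total (keys are hypothetical)
    return {"total": -(dist_error + yaw_error),
            "distance": -dist_error,
            "yaw": -yaw_error}
```

With a 3-4-5 offset and matching yaw, the sketch yields a total of -5.0; an offset beyond the threshold is clipped to -15.0.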

get_reward(frame_index: int, simulated_outputs: List[l5kit.simulation.unroll.SimulationOutputCLE]) Dict[str, float]

Get the reward for the given step during closed-loop training.

Parameters
  • frame_index – the frame index for which reward is to be calculated

  • simulated_outputs – the list of objects containing the ego target and prediction attributes

Returns

the dictionary containing total reward and individual components that make up the reward

reset() None

Reset the closed-loop evaluator when a new episode starts.

reward_prefix: str

The prefix that will identify this reward class

static slice_simulated_output(index: int, simulated_outputs: List[l5kit.simulation.unroll.SimulationOutputCLE]) List[l5kit.simulation.unroll.SimulationOutputCLE]

Slice the simulated outputs at a particular frame index. This avoids calculating metrics over all frames.

Parameters
  • index – the frame index at which the simulation outputs are to be sliced

  • simulated_outputs – the list of objects containing the ego target and prediction attributes

Returns

the sliced simulation output
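The slicing pattern can be illustrated without l5kit. In this sketch, `FakeSimulationOutput` is a hypothetical stand-in for `SimulationOutputCLE` with a single `ego_states` attribute (an assumption; the real class holds richer ego target and prediction attributes): each output is replaced by a copy containing only the requested frame, so downstream metrics operate on one frame instead of the whole episode.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FakeSimulationOutput:
    # illustrative stand-in for SimulationOutputCLE (attribute name assumed)
    ego_states: List[dict]

def slice_simulated_output(index: int,
                           simulated_outputs: List[FakeSimulationOutput]
                           ) -> List[FakeSimulationOutput]:
    """Keep only the frame at `index` in each simulated output."""
    return [FakeSimulationOutput(ego_states=[so.ego_states[index]])
            for so in simulated_outputs]
```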

class l5kit.environment.reward.Reward

Bases: abc.ABC

Base class interface for gym environment reward.

abstract get_reward(frame_index: int, simulated_outputs: List[l5kit.simulation.unroll.SimulationOutputCLE]) Dict[str, float]

Return the reward at a particular time-step during the episode.

Parameters
  • frame_index – the frame index for which reward is to be calculated

  • simulated_outputs – the list of objects containing the ego target and prediction attributes

Returns

reward at a particular frame index (time-step) during the episode containing total reward and individual components that make up the reward.

abstract reset() None

Reset the reward state when a new episode starts.

reward_prefix: str

The prefix that will identify this reward class
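A custom reward implements the two abstract methods above. This sketch mirrors the interface with a local `abc`-based base class rather than importing l5kit, and the subclass (`ConstantPenaltyReward`, its `penalty` parameter, and the `"total"` dictionary key) is purely hypothetical; the type of `simulated_outputs` is left untyped in place of `List[SimulationOutputCLE]`.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

class Reward(ABC):
    """Local mirror of the base class interface for a gym environment reward."""
    reward_prefix: str

    @abstractmethod
    def get_reward(self, frame_index: int,
                   simulated_outputs: List) -> Dict[str, float]:
        """Return the reward at a particular time-step during the episode."""

    @abstractmethod
    def reset(self) -> None:
        """Reset the reward state when a new episode starts."""

class ConstantPenaltyReward(Reward):
    """Toy subclass: a fixed penalty per step, regardless of the simulation."""

    def __init__(self, penalty: float = -1.0) -> None:
        self.reward_prefix = "ConstantPenalty"
        self.penalty = penalty

    def get_reward(self, frame_index: int,
                   simulated_outputs: List) -> Dict[str, float]:
        # total reward plus its (single) component, keyed by prefix
        return {"total": self.penalty,
                f"{self.reward_prefix}_penalty": self.penalty}

    def reset(self) -> None:
        # this toy reward keeps no per-episode state
        pass
```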