
NevarokML: UNevarokMLRewardUtils API

The UNevarokMLRewardUtils class provides utility functions for working with rewards in NevarokML.


Methods

BoolReward

UFUNCTION(BlueprintPure, Category = "NevarokML|RewardUtils")
static float BoolReward(float trueReward = 1.0f, float falseReward = -1.0f, bool value = false);
Selects a reward based on a boolean value. If value is true, the function returns trueReward; otherwise, it returns falseReward. Note that value is the last parameter and defaults to false.
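The selection described above can be sketched in standalone C++. This is an illustration of the documented behavior, not the plugin's source; BoolRewardSketch is a hypothetical name, and the UFUNCTION macro is omitted so the snippet compiles outside Unreal.

```cpp
// Standalone sketch of the documented behavior; not the plugin's source.
// BoolRewardSketch is a hypothetical name used for illustration only.
float BoolRewardSketch(float trueReward = 1.0f, float falseReward = -1.0f, bool value = false)
{
    // Pick one of the two reward constants depending on the flag.
    return value ? trueReward : falseReward;
}
```

Because value comes after the two reward parameters, a C++ caller has to pass both rewards explicitly in order to supply the flag, for example BoolRewardSketch(1.0f, -1.0f, true).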

Curve01Reward

UFUNCTION(BlueprintPure, Category = "NevarokML|RewardUtils")
static float Curve01Reward(const UCurveFloat* curve, float value, float min, float max, float rewardMultiplier = 1.0f);
Calculates a reward from a value and a curve defined over the input range [0, 1]. The function normalizes value within the range [min, max], evaluates the curve at the normalized result, and multiplies the curve output by rewardMultiplier.
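A minimal standalone sketch of the normalize, evaluate, and scale steps follows. It assumes a std::function as a stand-in for UCurveFloat, and the names CurveFn and Curve01RewardSketch are hypothetical. The documentation does not say whether the plugin clamps values outside [min, max] or how it treats min == max, so both choices below are assumptions.

```cpp
#include <functional>

// Hypothetical stand-in for a UCurveFloat: any callable that maps an input to a curve value.
using CurveFn = std::function<float(float)>;

// Sketch of the documented steps; not the plugin's source.
// Assumptions: no clamping outside [min, max]; a zero-width range normalizes to 0.
float Curve01RewardSketch(const CurveFn& curve, float value, float min, float max,
                          float rewardMultiplier = 1.0f)
{
    const float range = max - min;
    // Map value from [min, max] onto [0, 1].
    const float normalized = (range != 0.0f) ? (value - min) / range : 0.0f;
    // Evaluate the curve at the normalized input, then scale the result.
    return curve(normalized) * rewardMultiplier;
}
```

With an identity curve, a value of 5 in the range [0, 10] normalizes to 0.5, so a rewardMultiplier of 2 yields a reward of 1.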

StepsCurve01Reward

UFUNCTION(BlueprintPure, Category = "NevarokML|RewardUtils")
static float StepsCurve01Reward(const UCurveFloat* curve, int timeSteps, int maxTimeSteps = 100, float rewardMultiplier = 1.0f);
Calculates a reward from the progress through a sequence of time steps. The function normalizes timeSteps within the range [0, maxTimeSteps], evaluates the curve at that normalized progress, and multiplies the curve output by rewardMultiplier.
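The step-progress calculation can be sketched as below. As before, CurveFn is a hypothetical stand-in for UCurveFloat and StepsCurve01RewardSketch is an illustrative name. Treating a non-positive maxTimeSteps as zero progress is an assumption, since the documentation does not cover that case.

```cpp
#include <functional>

// Hypothetical stand-in for a UCurveFloat.
using CurveFn = std::function<float(float)>;

// Sketch of the documented steps; not the plugin's source.
// Assumption: a non-positive maxTimeSteps is treated as zero progress.
float StepsCurve01RewardSketch(const CurveFn& curve, int timeSteps, int maxTimeSteps = 100,
                               float rewardMultiplier = 1.0f)
{
    // Convert to float before dividing; int / int would truncate to 0
    // for every step count below maxTimeSteps.
    const float progress = (maxTimeSteps > 0)
        ? static_cast<float>(timeSteps) / static_cast<float>(maxTimeSteps)
        : 0.0f;
    return curve(progress) * rewardMultiplier;
}
```

A decreasing curve such as 1 - t would, under these assumptions, give higher rewards early in an episode and lower rewards as timeSteps approaches maxTimeSteps.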

StepsDoneCurve01Reward

UFUNCTION(BlueprintPure, Category = "NevarokML|RewardUtils")
static float StepsDoneCurve01Reward(const UCurveFloat* curve, int timeSteps, int maxTimeSteps = 100, bool isDone = false, float rewardMultiplier = 1.0f);
Calculates a reward from time-step progress, gated by a completion flag. If isDone is true, the function returns the result of StepsCurve01Reward with the given parameters; if isDone is false, it returns 0.0f.
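The gating on isDone might look like the following standalone sketch. The names are hypothetical, CurveFn stands in for UCurveFloat, and the progress computation is inlined here (rather than calling StepsCurve01Reward) only to keep the snippet self-contained.

```cpp
#include <functional>

// Hypothetical stand-in for a UCurveFloat.
using CurveFn = std::function<float(float)>;

// Sketch of the documented behavior; not the plugin's source.
float StepsDoneCurve01RewardSketch(const CurveFn& curve, int timeSteps, int maxTimeSteps = 100,
                                   bool isDone = false, float rewardMultiplier = 1.0f)
{
    // No reward until the completion flag is set.
    if (!isDone)
    {
        return 0.0f;
    }
    // Same computation the documentation attributes to StepsCurve01Reward.
    // Assumption: a non-positive maxTimeSteps is treated as zero progress.
    const float progress = (maxTimeSteps > 0)
        ? static_cast<float>(timeSteps) / static_cast<float>(maxTimeSteps)
        : 0.0f;
    return curve(progress) * rewardMultiplier;
}
```

One plausible use is a terminal bonus: paired with a decreasing curve, an episode that finishes in fewer steps would receive a larger reward on the step where isDone becomes true.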

DotProductCurveReward

UFUNCTION(BlueprintPure, Category = "NevarokML|RewardUtils")
static float DotProductCurveReward(const UCurveFloat* curve, FVector a, FVector b, float rewardMultiplier = 1.0f);
Calculates a reward from the dot product of two vectors and a curve defined over the dot product range. The function normalizes both input vectors, computes their dot product (which lies in [-1, 1] for normalized vectors), evaluates the curve at that value, and multiplies the curve output by rewardMultiplier.
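The normalize, dot, evaluate, and scale sequence can be sketched without Unreal types as follows. Vec3 is a hypothetical stand-in for FVector, CurveFn for UCurveFloat, and the function names are illustrative. Leaving a zero-length vector as zero is an assumption; the documentation does not describe that case.

```cpp
#include <cmath>
#include <functional>

// Hypothetical stand-ins for FVector and UCurveFloat.
struct Vec3 { float x, y, z; };
using CurveFn = std::function<float(float)>;

// Scale a vector to unit length. Assumption: a zero-length vector stays zero.
static Vec3 NormalizeSketch(const Vec3& v)
{
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (len == 0.0f)
    {
        return Vec3{0.0f, 0.0f, 0.0f};
    }
    return Vec3{v.x / len, v.y / len, v.z / len};
}

// Sketch of the documented steps; not the plugin's source.
float DotProductCurveRewardSketch(const CurveFn& curve, Vec3 a, Vec3 b,
                                  float rewardMultiplier = 1.0f)
{
    const Vec3 na = NormalizeSketch(a);
    const Vec3 nb = NormalizeSketch(b);
    // For unit vectors the dot product is the cosine of the angle between them.
    const float dot = na.x * nb.x + na.y * nb.y + na.z * nb.z;
    return curve(dot) * rewardMultiplier;
}
```

Since the curve input spans [-1, 1] rather than [0, 1], a curve authored for this function would need keys covering that whole interval. For instance, a curve equal to (d + 1) / 2 gives 1 for aligned vectors, 0.5 for perpendicular ones, and 0 for opposite ones.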