Open-source metrics for ranking and recommendations.
RecallTopK
Recall at K reflects the ability of the recommender or ranking system to retrieve all relevant items within the top K results.
Implemented method:
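A standard formulation, averaged across all users or queries (notation here is illustrative; the library's exact computation may differ):

$$
\mathrm{Recall@K} = \frac{\text{number of relevant items in the top-}K\ \text{results}}{\text{total number of relevant items}}
$$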
PrecisionTopK
Precision at K reflects the ability of the system to suggest items that are truly relevant to the users’ preferences or queries.
Implemented method:
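A standard formulation, averaged across all users or queries (notation illustrative):

$$
\mathrm{Precision@K} = \frac{\text{number of relevant items in the top-}K\ \text{results}}{K}
$$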
FBetaTopK
The F Beta score at K combines precision and recall into a single value, providing a balanced measure of a recommendation or ranking system’s performance.
Beta is a parameter that determines the weight assigned to recall relative to precision: beta > 1 gives more weight to recall, while beta < 1 favors precision. With beta = 1 (the default), the metric is the traditional F1 score: the harmonic mean of precision and recall at K. This gives a balanced estimation that accounts for both false positives (recommended items that are not relevant) and false negatives (relevant items that are not recommended), as shown in the formula below.
Range: 0 to 1.
Interpretation: Higher F Beta at K values indicate better overall performance.
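As a worked formula, the standard beta-weighted harmonic mean of precision and recall at K (beta = 1 reduces to F1):

$$
F_{\beta}@K = \frac{(1+\beta^2)\cdot \mathrm{Precision@K}\cdot \mathrm{Recall@K}}{\beta^2\cdot \mathrm{Precision@K} + \mathrm{Recall@K}}
$$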
MAP
MAP (Mean Average Precision) at K assesses the ability of the recommender or retrieval system to suggest relevant items in the top-K results, while placing more relevant items at the top.
Compared to precision at K, MAP at K is rank-aware: it penalizes the system for placing relevant items lower in the list, even when the total number of relevant items in the top K is the same.
Implemented method:
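One common formulation (conventions for the normalizer vary; here N_u is the number of relevant items for user u, and rel_u(k) is 1 if the item at rank k is relevant, else 0):

$$
\mathrm{AP@K}(u) = \frac{1}{N_u}\sum_{k=1}^{K}\mathrm{Precision@k}(u)\cdot \mathrm{rel}_u(k),
\qquad
\mathrm{MAP@K} = \frac{1}{|U|}\sum_{u \in U}\mathrm{AP@K}(u)
$$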
MAR
MAR (Mean Average Recall) at K assesses the ability of a recommendation system to retrieve all relevant items within the top-K results, averaging recall across all relevant positions.
Implemented method:
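Analogous to MAP at K, with recall in place of precision (same illustrative notation; exact conventions vary):

$$
\mathrm{AR@K}(u) = \frac{1}{N_u}\sum_{k=1}^{K}\mathrm{Recall@k}(u)\cdot \mathrm{rel}_u(k),
\qquad
\mathrm{MAR@K} = \frac{1}{|U|}\sum_{u \in U}\mathrm{AR@K}(u)
$$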
NDCG
NDCG (Normalized Discounted Cumulative Gain) at K reflects the ranking quality, comparing it to an ideal order where all relevant items for each user (or query) are placed at the top of the list.
Implemented method:
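A standard formulation for binary relevance, computed per user or query and then averaged (IDCG@K is the DCG@K of the ideal ordering):

$$
\mathrm{DCG@K} = \sum_{k=1}^{K}\frac{\mathrm{rel}(k)}{\log_2(k+1)},
\qquad
\mathrm{NDCG@K} = \frac{\mathrm{DCG@K}}{\mathrm{IDCG@K}}
$$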
HitRate
Hit Rate at K calculates the share of users or queries for which at least one relevant item is included in the top-K results.
Implemented method:
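A standard formulation, where 1[·] is the indicator function and U is the set of users or queries:

$$
\mathrm{HitRate@K} = \frac{1}{|U|}\sum_{u \in U}\mathbb{1}\big[\text{top-}K(u)\ \text{contains at least one relevant item}\big]
$$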
MRR
Mean Reciprocal Rank (MRR) measures the ranking quality considering the position of the first relevant item in the list.
Implemented method:
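A standard formulation, where rank_u is the position of the first relevant item for user u (variants may assign a reciprocal rank of 0 when no relevant item appears within the top K):

$$
\mathrm{MRR} = \frac{1}{|U|}\sum_{u \in U}\frac{1}{\mathrm{rank}_u}
$$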
ScoreDistribution
This metric computes the predicted score entropy. It applies only when the recommendations_type is a score.
Implementation:
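A minimal Python sketch of one way to compute score entropy: Shannon entropy over the predicted scores normalized into a probability distribution. The function name and the normalization choice are illustrative assumptions, not the library's exact API.

```python
import numpy as np

def score_entropy(scores):
    """Shannon entropy of predicted scores treated as a distribution.

    Illustrative sketch: assumes non-negative scores, normalizes them
    to sum to 1, and drops zero entries (0 * log 0 is taken as 0).
    """
    p = np.asarray(scores, dtype=float)
    p = p / p.sum()   # normalize scores into probabilities
    p = p[p > 0]      # avoid log(0)
    return float(-(p * np.log(p)).sum())

# Near-uniform scores give high entropy; a peaked distribution gives low entropy.
print(score_entropy([0.25, 0.25, 0.25, 0.25]))  # ~1.386 (= ln 4)
print(score_entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.17
```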