# Content selection algorithm for rating

The videos appearing on the rating page are selected by a content selection algorithm.

This page describes the current implementation of this algorithm, and avenues for future development.

## Selecting the first video

When a new rating page loads, two videos must be chosen for comparison. This section describes how the first of these two videos is selected. Note that all the variables and lists discussed below depend on the logged-in contributor.

• If the rate-later list Lrate-later is not empty, then a video v is drawn uniformly randomly from this list Lrate-later, and chosen as the first video. We then move on to selecting the second video (see section below).
• Otherwise, if Lrate-later is empty, then we compute the list Lzero-rating of previously rated videos v that have been rated along none of the active quality criteria.
• If the list Lzero-rating is not empty, then a video v is drawn uniformly randomly from this list Lzero-rating, and chosen as the first video. We then move on to selecting the second video (see section below).
• Otherwise, if Lzero-rating is empty, then we compute the list Lincomplete of previously rated videos without a complete rating on all active quality criteria. These are the videos v such that there exists at least one other video w and an active quality criterion q, for which the contributor has not previously rated v against w on criterion q.
• If the list Lincomplete is not empty, then we select the video v from the list Lincomplete that maximizes the average personalized score uncertainty along active quality criteria plus a random standard Gaussian noise N(0,1). We then move on to selecting the second video (see section below).
• If the list Lincomplete is empty, then no video should be loaded. The following message is displayed: "You have provided ratings for all pairs of videos you imported. Please copy-paste the URL of a new video you would like to rate. Note that you can use our browser extension to import videos effortlessly".
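
The cascade above can be sketched in Python as follows. This is only an illustrative sketch, not the actual Tournesol implementation: the function name `select_first_video`, its parameters, and the `uncertainty` callable (assumed to return the average personalized score uncertainty of a video along active quality criteria) are all hypothetical. The sketch also assumes that an independent N(0,1) noise is drawn for each candidate video.

```python
import random
from typing import Callable, Optional, Sequence


def select_first_video(
    rate_later: Sequence[str],
    zero_rating: Sequence[str],
    incomplete: Sequence[str],
    uncertainty: Callable[[str], float],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return the first video to rate, or None if no video should be loaded.

    All names here are hypothetical; the three lists are assumed to be
    precomputed for the logged-in contributor.
    """
    rng = rng or random.Random()

    # 1. Prefer a uniformly random video from the rate-later list.
    if rate_later:
        return rng.choice(list(rate_later))

    # 2. Then a uniformly random video rated along no active criterion.
    if zero_rating:
        return rng.choice(list(zero_rating))

    # 3. Then the incompletely rated video maximizing uncertainty plus noise.
    #    Assumption: one independent N(0, 1) draw per candidate video.
    if incomplete:
        return max(incomplete, key=lambda v: uncertainty(v) + rng.gauss(0.0, 1.0))

    # 4. Nothing left to rate: the caller displays the "all pairs rated" message.
    return None
```

A `None` result corresponds to the last bullet, where the page displays the message inviting the contributor to import a new video.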

## Selecting the second video

We now assume that a video on one side is chosen. This video will be referred to as the first video, and will be denoted v. We now describe how the content selection algorithm selects the second video.

• If there is another rate-later video, i.e. if Lrate-later - {v} is non-empty, then draw a random bit. If the bit equals 1, then the second video w is drawn uniformly randomly from this list Lrate-later - {v}.
• Otherwise (if Lrate-later - {v} is empty or if the random bit equals 0), if there is a previously rated video without any rating along active quality criteria, i.e. if Lzero-rating - {v} is non-empty, then the second video w is drawn uniformly randomly from this list Lzero-rating - {v}.
• Otherwise, for all previously rated videos w different from v, we check whether the ratings against v are incomplete, i.e. whether there exists an active quality criterion q such that the contributor has not previously rated v against w on q. We denote by Lincomplete(v) the list of videos w with incomplete ratings against v.
• Active learning score for second rated video selection: for each video w of the list Lincomplete(v), we compute its active learning score. This score contains four terms. The first term is the similarity of videos v and w. The second term measures the uncertainty on video w's scores. The third term downgrades videos w that are often skipped by the contributor. Finally, the fourth term is a random noise.
• If Lincomplete(v) is empty, then we test for the existence of at least two previously rated videos u and w such that the contributor has not rated u against w on all active quality criteria.
• If two such videos exist, then we display the following message: "You have rated the video on left against all other videos in your account. Please choose one of the two following options". Below, two buttons should appear. The first says "add new rate-later videos", and links to the rate-later page. The second says "let Tournesol select new videos". If the user clicks on the second button, then we remove both videos, and we execute the content selection algorithm to select a first video (see section above).
• Otherwise, if all pairs of previously rated videos have been compared, then we display the following message: "Congratulations! You have rated all pairs of videos in your account. To continue contributing, please add video to your rate-later list", with a link to the rate-later page.
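
The selection of the second video can be sketched in the same way. Again, this is a minimal sketch under stated assumptions rather than the actual implementation: `select_second_video`, its parameters, and the `active_learning_score` callable are hypothetical names. The text above does not state explicitly how the active learning score is used, so the sketch assumes that the candidate with the highest score is selected, and that the callable already includes the random noise term.

```python
import random
from typing import Callable, Optional, Sequence


def select_second_video(
    first: str,
    rate_later: Sequence[str],
    zero_rating: Sequence[str],
    incomplete_against_first: Sequence[str],
    active_learning_score: Callable[[str], float],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return the second video w to compare against `first`, or None.

    All names are hypothetical. `incomplete_against_first` plays the role
    of Lincomplete(v) for v = first.
    """
    rng = rng or random.Random()

    # 1. With probability 1/2, pick another rate-later video if one exists.
    other_rate_later = [w for w in rate_later if w != first]
    if other_rate_later and rng.getrandbits(1) == 1:
        return rng.choice(other_rate_later)

    # 2. Otherwise, pick a video rated along no active criterion, if any.
    other_zero_rating = [w for w in zero_rating if w != first]
    if other_zero_rating:
        return rng.choice(other_zero_rating)

    # 3. Otherwise, rank the videos with incomplete ratings against `first`.
    #    Assumption: the highest active learning score wins.
    candidates = [w for w in incomplete_against_first if w != first]
    if candidates:
        return max(candidates, key=active_learning_score)

    # 4. No candidate: the caller shows one of the two messages above.
    return None
```

When the function returns `None`, the caller still has to test whether any pair of previously rated videos remains uncompared, in order to choose between the two messages described in the last bullets.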

## Better active learning

The current implementation is a basic active learning algorithm, based on intuitively appealing principles. However, better ideas are needed to improve the content selection algorithm for rating. Below, we describe two interesting ideas.

First, we may want to connect the graph of comparisons. Indeed, a poorly connected graph of pairwise video comparisons means that each highly connected component may have scores that are poorly calibrated with respect to other highly connected components of the graph.
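
As a starting point, one could measure how fragmented a contributor's comparison graph is. The sketch below computes plain connected components with a depth-first search, which is a simplification of the "highly connected components" mentioned above. The function name and the input format (video identifiers and pairs of compared videos) are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def connected_components(
    videos: Iterable[Hashable],
    comparisons: Iterable[Tuple[Hashable, Hashable]],
) -> List[Set[Hashable]]:
    """Group videos into connected components of the comparison graph."""
    # Build an undirected adjacency map; isolated videos keep an empty set.
    adjacency: Dict[Hashable, Set[Hashable]] = {v: set() for v in videos}
    for u, w in comparisons:
        adjacency.setdefault(u, set()).add(w)
        adjacency.setdefault(w, set()).add(u)

    seen: Set[Hashable] = set()
    components: List[Set[Hashable]] = []
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search from each unvisited video.
        component: Set[Hashable] = set()
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components
```

If more than one component is returned, a comparison between videos taken from two different components would link them, which is one possible way to act on the idea above.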

Second, we may want to better anticipate the actual gain in querying the contributor for a specific comparison. In particular, what is the variance on the posteriors of the personalized and Tournesol scores of the rated videos?

To our knowledge, no one is currently researching these challenges.