Value Ranking
13 May 2026
I decided to push the personal ranking tools I use to git. They are for ranking personal items, not web-scale data.
My first encounter with ranking was Elo from chess. I knew chess used Elo, but I didn’t know much about the algorithm. One of my previous employers had a ping pong table, so I created a website to keep track of the matches and rank the players at the company. I think I deployed Elovation to Heroku. It wasn’t used, since people did not want their games to be publicly tracked. Along the way I learned about other ranking algorithms, like TrueSkill, created by Microsoft. I think League of Legends was also popular at the time, so I was interested in how they did matchmaking. I later spoke to someone who worked on matchmaking for another game. Ideal matchmaking has you winning 50% of the time. People don’t like losing half the time, so they added bots to soak up some losses and push the win rate beyond 50%. So, Elo is good enough for ranking.
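For reference, here is a minimal sketch of the standard Elo update for a single match. The function names and the K-factor of 32 are my own choices for illustration, not something taken from Elovation or the repo.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k is an assumed K-factor; larger values react faster to each result.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating stays constant.
    return rating_a + delta, rating_b - delta
```

With two players at 1500, the winner moves to 1516 and the loser to 1484. An upset against a much higher-rated player moves both ratings by more.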
Eventually, I wanted to rank items to help me decide what I should spend more time on. I have a limited amount of time and infinite media to consume. I implemented something with Elo and Thompson sampling from multi-armed bandits to help me decide what to do with my time. I want to rank items, but I only want to spend time ranking the most promising ones. I don’t care to compare two choices I don’t like. I want to compare two things that I do like. I find it hard to state what I like unless I am forced to choose between two options. That worked well for me.
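A rough sketch of how Thompson sampling can pick which two items to compare next. This is one way to do it, not necessarily what the repo does: I assume each item keeps a win/loss record, gets a Beta(wins + 1, losses + 1) posterior, and the two highest draws are shown as the next pair. The name `pick_pair` and the `records` layout are hypothetical.

```python
import random


def pick_pair(records, rng=random):
    """Choose the next two items to compare.

    records maps item -> (wins, losses). Each item gets one draw from its
    Beta posterior (uniform prior assumed); the two largest draws are picked.
    Items with good records are chosen often, while rarely seen items still
    get an occasional chance because their posteriors are wide.
    """
    draws = {
        item: rng.betavariate(wins + 1, losses + 1)
        for item, (wins, losses) in records.items()
    }
    first, second = sorted(draws, key=draws.get, reverse=True)[:2]
    return first, second
```

After the choice is made, the result can be fed into the rating update and the win/loss records, so the comparisons drift toward the items that keep winning.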
Eventually, LLMs made leaderboards important. These used Elo and then eventually the Bradley-Terry (BT) model. I had likely heard of it before in the context of sports betting, another topic I looked into in the past. This time I wanted to implement the Bradley-Terry model for the fun of it, so I did. Now I could compare the rankings between Elo and the BT model. I still like Elo, since I can do enough evaluations to home in on my exact preferences. There is an issue with the BT model if an item’s win or loss rate is 100%: the fitted strength runs off to infinity or zero. I added a self-match to help normalize it. I should probably do more literature review to see what people do, but I think in practice there is enough noise in real data that the win/loss rate is never 100%.
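Here is a small sketch of fitting Bradley-Terry strengths with the usual iterative (minorize-maximize) update, plus one possible reading of the normalization trick. The assumption, which may differ from what the repo does, is that every item gets `prior` virtual wins and `prior` virtual losses against a reference opponent whose strength is fixed at 1. The function name and parameters are hypothetical.

```python
from collections import defaultdict


def fit_bradley_terry(matches, prior=1.0, iters=200):
    """Fit BT strengths from a list of (winner, loser) pairs.

    Under the model, P(i beats j) = s[i] / (s[i] + s[j]).
    prior adds virtual wins and losses against a fixed strength-1 opponent
    (an assumed regularizer), which keeps strengths finite at 100% win rates.
    """
    wins = defaultdict(float)
    games = defaultdict(float)  # games[(i, j)] = matches played between i and j
    items = set()
    for winner, loser in matches:
        wins[winner] += 1.0
        games[(winner, loser)] += 1.0
        games[(loser, winner)] += 1.0
        items.update((winner, loser))

    strength = {i: 1.0 for i in items}
    for _ in range(iters):
        updated = {}
        for i in items:
            # Virtual matches: 2 * prior games against the reference opponent.
            denom = 2.0 * prior / (strength[i] + 1.0)
            for j in items:
                n = games.get((i, j), 0.0)
                if j != i and n:
                    denom += n / (strength[i] + strength[j])
            updated[i] = (wins[i] + prior) / denom
        strength = updated
    return strength
```

With `matches = [("a", "b")] * 3`, where a never loses, the strengths stay finite and a ends up above the reference strength of 1 while b ends up below it. Without the virtual matches, a's strength would keep growing with every iteration.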
This leaves me where I am today, with the value ranking repo I put up. I didn’t see it providing utility to other people until now. Values. When I started my job search after the Insight Data Science fellowship, they emphasized that it is important to work at a company aligned with your values. I just wanted a job, so I wouldn’t have to live on the streets. Values were some touchy-feely concept that only the privileged could consider. Did I consider myself one of the privileged? Nope.
Now, my feelings are different. Values are very important because they get at the core of who you are. What are your values? If you don’t know, you can follow the instructions on the git repo and keep clicking, about 200 times, until you get an answer.