Creating a comic book recommendation system for non-comic readers
My approach was to build an Alternating Least Squares (ALS) model as the basis of a collaborative filtering recommender system. You may be asking yourself: what are these least squares, and why in the hell are they alternating?

With this set of IDs, I found any corresponding reviews. I then took the reviewers in that set and found any who had also reviewed movies/tv. After removing items with fewer than 5 reviews and dropping any data missing relevant metadata, I was working with ~84,000 reviews from ~8,500 distinct users across ~7,400 items.

So I have my ratings matrix, R, with users and their reviews for comics and movies. This matrix is going to be fairly sparse, as users will have rated only some things, but we want to fill in those empty slots and guess what they might score based on what we do know about them. We first choose a number of latent features. The latent features are essentially a mathematically derived way of describing users and items. Through non-negative matrix factorization, we can derive from R the matrices for users (U) and items (P). We hold U constant and solve for P. Then, we hold P constant and solve for U. This process goes back and forth until no further reduction in error is seen. Once complete, U and P are multiplied together to get our new estimate of R, filling in values across the user review matrix.
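The back-and-forth described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the model behind this project: the 4x4 toy matrix is made up, a 0 is assumed to mean "not rated", the function name `als` and its parameters are hypothetical, the loop runs for a fixed number of iterations instead of checking for a stalled error, and no non-negativity constraint is enforced.

```python
import numpy as np

# Toy ratings matrix R (users x items). 0 marks "not rated" (an assumption
# of this sketch, not something taken from the real dataset).
R = np.array([
    [5, 4, 0, 1],
    [4, 0, 1, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
observed = R > 0


def als(R, observed, k=2, reg=0.1, n_iters=20, seed=0):
    """Factor R into U (users x k) and P (items x k) by alternating solves."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    P = rng.normal(scale=0.1, size=(n_items, k))
    ridge = reg * np.eye(k)  # regularization keeps each small system solvable

    for _ in range(n_iters):
        # Hold U constant and solve a least squares problem for each item.
        for i in range(n_items):
            rated = observed[:, i]
            if rated.any():
                Ui = U[rated]
                P[i] = np.linalg.solve(Ui.T @ Ui + ridge, Ui.T @ R[rated, i])
        # Hold P constant and solve a least squares problem for each user.
        for u in range(n_users):
            rated = observed[u]
            if rated.any():
                Pu = P[rated]
                U[u] = np.linalg.solve(Pu.T @ Pu + ridge, Pu.T @ R[u, rated])
    return U, P


U, P = als(R, observed)
R_hat = U @ P.T  # dense estimate, including the previously empty slots
fit_rmse = float(np.sqrt(np.mean((R - R_hat)[observed] ** 2)))
print(np.round(R_hat, 2))
print("RMSE on observed ratings:", round(fit_rmse, 3))
```

Each inner solve is an ordinary regularized least squares problem over only the ratings that exist, which is where the "least squares" in the name comes from. The "alternating" part is the swap between the two loops.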
John Naujoks
Alternating Least Squares is a method by which we can decompose a ratings matrix into matrices representing users and items, with each expressed through latent features. In a very simplistic example, it could be thought of as such:

A passion of mine is comic books. As you may have noticed, even if you are not a reader, you are bound to be exposed to characters and stories that originated in comics being used in television and movies. Netflix, Hulu, Amazon, HBO and many others have all dipped into making content from comics. Summer theaters are bustling with the latest Avengers or Batman tale.

As much as I love the medium, the comic book industry at large is often overlooked in favor of the next big movie or tv show using its stories or characters. There are many people who see these shows and movies and are inspired to read, but don’t know where to start. I get it, jumping into comics can be intimidating. I think many people get that vision in their head of the comic shop owner on The Simpsons and want to run the other way. I thought there had to be a way to help people get a foot into comics without starting from scratch.

My main thought was, how can I get people who watch movies and tv shows about comic books into comic books? How about with their preferences in movies and tv shows! How about I recommend you comics from those existing preferences?

With all this in mind, let’s build a recommendation system! You can test it for yourself at IntoComics. The application gives you a selection of movies to rate from 1 to 5. Once you have rated one or more, it returns a listing of the top 5 comic book/graphic novel recommendations with links to each product on Amazon. I’m going to continue to work on improving the model, but for now, hope you enjoy!

My goal was to look for people who had rated both movies/tv and comic books/graphic novels.
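Finding people who rated both kinds of media comes down to a set intersection over reviewer IDs. The sketch below is only an illustration: the rows, the `category` labels, the function name `cross_media_reviews`, and the threshold of 2 are all made up for the toy data. The item-count filter mirrors the "fewer than 5 reviews" cutoff mentioned elsewhere in this article, but applying it after the overlap step is an assumption.

```python
from collections import Counter

# Hypothetical rows: (reviewer_id, item_id, category, rating).
reviews = [
    ("ann", "comic_a", "comic", 5),
    ("ann", "movie_x", "screen", 4),
    ("bob", "comic_a", "comic", 3),
    ("bob", "movie_x", "screen", 5),
    ("bob", "comic_b", "comic", 4),
    ("cat", "comic_a", "comic", 2),   # comics only
    ("dan", "movie_x", "screen", 1),  # movies/tv only
]


def cross_media_reviews(rows, min_item_reviews=2):
    """Keep reviews from users who rated both comics and movies/tv."""
    comic_users = {user for user, _, cat, _ in rows if cat == "comic"}
    screen_users = {user for user, _, cat, _ in rows if cat == "screen"}
    overlap = comic_users & screen_users

    kept = [row for row in rows if row[0] in overlap]
    # Drop items that too few of the remaining users have reviewed.
    counts = Counter(item for _, item, _, _ in kept)
    kept = [row for row in kept if counts[row[1]] >= min_item_reviews]
    return overlap, kept


overlap, filtered = cross_media_reviews(reviews)
print(sorted(overlap))
print(filtered)
```

On real data the same idea would more likely be written with a dataframe library, but the logic is the same: intersect the two reviewer sets first, then prune thinly reviewed items.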
To build my model, I used a large repository of Amazon reviews previously collected in a research project at the University of California, San Diego.

In general, this project relies on the idea that there is some intrinsic taste to the content you enjoy. Below is an example of the type of assumption. It may not always be a clearly labeled concept, but there is potentially some underlying aspect to your tastes that can be seen across the media you consume.

In order to extract the correct users and ratings, I had to spend a good amount of time learning and exploring this dataset. All of the comic books/graphic novels are lumped in with all other book reviews, with no shortcuts to pull them out. I started with a small number of Amazon IDs for comic books, scraped from Amazon's bestseller pages. Based on data exploration, I found a pattern in the IDs that gave me a shortcut to a few large chunks of comic book IDs. Below is an example of the patterns I found:

For evaluation, I tuned my model against Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Tuning the parameters did not offer much in terms of performance, but I settled on my best model using 50 latent features, a regularization parameter of 0.1, and a maximum of 20 iterations. My best-performing model has an RMSE of 1.17. In general, I would hope to get that number under 1, but I think this is fairly good performance under my current scope.
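For readers unfamiliar with the two metrics, here is a minimal sketch of how RMSE and MAE are computed from held-out ratings and predictions. The numbers are invented for illustration and have nothing to do with the 1.17 reported above.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: penalizes large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error: the average size of a miss, in rating points."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Made-up held-out ratings and model predictions on a 1 to 5 scale.
actual = [5.0, 3.0, 4.0, 1.0]
predicted = [4.0, 3.0, 5.0, 3.0]

print("RMSE:", round(rmse(actual, predicted), 3))
print("MAE:", round(mae(actual, predicted), 3))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, and an RMSE just over 1 on a 5-point scale means typical misses of roughly one star.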