We built an ML system that automatically detects which product is advertised in an online advertisement. The challenge was a very large classification problem with tens of thousands of possible brands and categories.
Gemius is an international research and technology company that provides both media consumption research and tools to optimize advertising campaigns and ad serving.
Gemius has won five awards in the Audience Measurement category of the prestigious European IAB Research Awards competition.
We have created an efficient system that classifies ads in seven different languages from seven different countries.
Our solution has replaced much of the work previously done by humans. Moreover, we have designed the system so that adding support for new languages is very easy.
Our system works in two steps:
1. It finds a set of good candidate brands for an ad.
2. For each pair of ad and candidate brand, it predicts whether the candidate is an appropriate match.
This approach allowed us to scale our solution to millions of ads and tens of thousands of brands.
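The two-step idea can be illustrated with a minimal Python sketch. It is not the production system: the brand dictionary, the whitespace tokenizer, the `match_score` heuristic and the threshold are all hypothetical stand-ins chosen to keep the example self-contained. In a real setup, step 2 would be a trained binary classifier rather than a phrase-overlap ratio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    ad_id: str
    text: str  # e.g. the ad description plus any text extracted from the image


# Hypothetical brand dictionary: brand name -> key phrases associated with it.
BRAND_PHRASES = {
    "AcmeCola": {"acme", "cola"},
    "AcmeBank": {"acme", "bank", "loan"},
    "ZetaPhone": {"zeta", "phone"},
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return set(text.lower().split())


def find_candidates(ad, brand_phrases):
    """Step 1: keep only brands that share at least one key phrase with the ad."""
    tokens = tokenize(ad.text)
    return [brand for brand, phrases in brand_phrases.items() if tokens & phrases]


def match_score(ad, brand, brand_phrases):
    """Step 2 stand-in: fraction of the brand's key phrases found in the ad.

    Assumption: a trained model scoring the (ad, brand) pair would go here.
    """
    phrases = brand_phrases[brand]
    return len(tokenize(ad.text) & phrases) / len(phrases)


def classify(ad, brand_phrases, threshold=0.6):
    """Score every (ad, candidate) pair and keep those above the threshold."""
    scored = [
        (brand, match_score(ad, brand, brand_phrases))
        for brand in find_candidates(ad, brand_phrases)
    ]
    accepted = [(brand, score) for brand, score in scored if score >= threshold]
    return sorted(accepted, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    ad = Ad("ad-1", "Refreshing Acme Cola now in stores")
    print(find_candidates(ad, BRAND_PHRASES))  # candidate brands from step 1
    print(classify(ad, BRAND_PHRASES))         # accepted matches from step 2
```

The point of the split is cost: scoring every ad against every brand grows with the product of the two counts, whereas a cheap candidate step means the more expensive pairwise model only sees a handful of brands per ad.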
Candidate brands can be found by combining various techniques – extracting text from images, key phrases from ad descriptions, website analysis and logo detection. We identified suitable candidates using state-of-the-art algorithms – XGBoost, LightGBM, deep neural networks, transfer learning and factorization machines. Our solution was presented at the PyData Warsaw 2018 conference.
MIM Solutions is a company that originated in the University of Warsaw (UW) Algorithms Group, directed by Prof. Piotr Sankowski. The company brought together experts interested in solving practical algorithmic problems efficiently, and its focus eventually shifted towards machine learning. Although MIM Solutions is not a part of UW, the two entities still cooperate closely.
MIM Solutions specializes in hard tasks. We are proficient in providing effective solutions, especially when standard methods have failed. For the most common problems we specialize in, we also offer a set of generic services that can be deployed quickly in any environment.