Take a look under the hood of data science at Tinyclues
Artificial intelligence, or AI, has made its way into just about every industry. It's even helping marketers deliver better customer experiences and higher revenue for their companies. At Tinyclues, our data science team has built, and continues to evolve, AI designed for CRM marketers and powered by data science. In fact, they've been doing it for over 10 years.
Here's a closer look at the team behind the industry's highest-performing AI-based audience-building solution. In this conversation with Artem Kozhevnikov, Head of Data Science at Tinyclues, discover more about AI and what drives our data science team to continue to innovate, work together, and deliver AI solutions for customer marketers.
As the head of Data Science for Tinyclues, could you tell us a little bit about yourself and your team?
I was Tinyclues' first employee! I had just wrapped up my master's at Polytechnique in Paris and was getting started on my PhD when I met the founder of Tinyclues, David Bessis. I actually knew next to nothing about AI then, but I was inspired by the Tinyclues vision and decided to join him in the adventure. It was a busy time in my life, juggling my new role with writing my thesis in pure mathematics (sub-Riemannian geometry).
Fast forward to today… Tinyclues has grown a lot, and so has my team; we're about 15 now, with a breadth of expertise including data scientists, data engineers, and machine learning engineers.
What's always been important to me is staying true to our scientific integrity: trust measurement and results when we can, and rely on mathematical intuition otherwise. Rationality should be our north star, both for measuring success and for guiding relationships with our clients. Data science is not magic, after all. It is a science!
How would you describe your mission?
My own mission today is mostly about defining our technical vision: finding the best way for us to build and evolve full-stack predictive modeling to solve our clients' problems.
This vision helps guide core decision-making, such as what skill sets we need on the team and which technologies we need to invest in.
A big part of my role is also about sharing the vision and focusing on achievable goals to lead Tinyclues and the R&D team on the best path possible.
Practically, this means collaborating with other teams to discuss and draft roadmaps, compare technical potential to actual business needs, and decide what makes sense to provide our clients with the best solution possible.
More generally, my team's mission is the creation, maintenance, and improvement of the machine learning models that underpin our product. We're the guardians of the quality of Tinyclues' predictive capabilities.
This means that we constantly watch over the performance of each of our clients' campaigns to make sure they always get the best results possible. Most of the time, we know their dataset better than they do.
Last but not least, we're constantly innovating and preparing for the future. We want each new client setup to go even better, faster, and more smoothly.
What do you and your team like about your job? What gets you up in the morning?
A lot of it has to do with challenges.
There's the technical challenge of scalability: making sure our solution performs consistently well across all our clients' campaigns when their objectives are so different, as well as figuring out how to effectively scale "big data."
There's the suspense that comes with each new client dataset. They're all slightly different, so there is always something to learn, and always an adjustment to make to ensure the campaigns perform well. Once that's accomplished, the adjustment becomes a native part of our toolbox.
There's also the scientific challenge, as we maintain a very high bar. I expect the whole team to read research papers and attend relevant conferences; we need to stay on top of new ideas and breakthroughs in the field. It's really important for us to take the time to be curious and smart about what we do, and never to implement a solution without really understanding how it works. That would just be a disaster waiting to happen!
Also, there's the team culture, which I think everybody appreciates here.
First, everybody can challenge someone else's work, so we keep learning from each other.
Second, and I think this is different from most data science teams, each team member is responsible for their work end to end. When they design an improvement for a model, they work on it from the idea stage all the way to the release. This keeps practical considerations in mind, and it's also very gratifying.
The feeling you get when you see an idea or project you were working on actually come to life is the best.
Last but not least, the human factor is very important. Data science is hard, and sometimes you spend days on one problem. Being part of a supportive team helps a lot.
How would you define Tinyclues?
Tinyclues is a solution that helps CRM teams orchestrate and target their campaigns.
The value for CRM teams who use Tinyclues is huge: they can easily make data-driven decisions that have a real impact on important metrics (revenue and customer experience) while saving time on tedious tasks. With our solution, their day-to-day work becomes more efficient and they can focus on the more interesting elements of their role. Additionally, they gain more influence within their organization.
From a data science perspective, we solve a forecasting challenge. The core question our models answer is this: given any offer within our client's product catalog, what is the probability that a specific customer will buy it within the next couple of weeks? The output is the propensity to buy this offer, which allows CRM marketers to focus campaign efforts on the customers with the highest probability to convert.
And by offer, we mean any subset of a client's catalog, defined by product_id or SKU, as well as brand, category, or any combination of the above. We can also work with clients on any custom breakout within their own catalog hierarchy.
We can also optimize by channel, identifying the propensity to buy a specific offer when it's communicated to a customer via email vs. SMS vs. direct mail, etc.
Behind the scenes, we represent this as a combination of two models:
- For a given customer, what is the probability that they engage on this channel and make a purchase?
- If they do make a purchase, what would the probability be of buying that given offer?
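Combining the two models above can be sketched in a few lines of Python. This is purely illustrative: the function name and the example scores are invented for this sketch, not Tinyclues' actual code.

```python
# Illustrative sketch of the two-model decomposition; the numbers below
# are hypothetical scores, not real model outputs.
def propensity(p_purchase_on_channel: float, p_offer_given_purchase: float) -> float:
    """P(buys this offer via this channel) =
       P(engages on the channel and makes a purchase)
       * P(the purchase is this offer, given that a purchase happens)."""
    return p_purchase_on_channel * p_offer_given_purchase

# e.g. a 20% chance of an email-driven purchase, and a 15% chance that the
# purchase is the promoted offer, combine into a propensity of about 0.03.
score = propensity(0.20, 0.15)
```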
This last type of model may look like a recommendation system, but we actually solve the opposite problem, interchanging the roles of "users" and "items":
- Recommendation = given customer X, what are the products most likely to be bought?
- Tinyclues = given offer Y, who are the customers most likely to buy?
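The flipped ranking can be illustrated with a tiny sketch (all customer names and scores below are made up for the example):

```python
# A made-up score table: customer -> P(buys offer Y in the next two weeks).
scores = {"alice": 0.08, "bob": 0.31, "carol": 0.17}

def audience_for_offer(scores: dict, k: int) -> list:
    """Rank customers by propensity for a given offer: the reverse of a
    recommender, which ranks products for a given customer."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

top2 = audience_for_offer(scores, 2)  # ["bob", "carol"]
```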
While there's a lot of data science literature around recommendation systems that we can leverage, we've had to innovate in other areas. For instance, modeling deep, long-term user intent (which is core to our models) is much less explored!
To detect intent, we take into account a deep history of user events, sometimes up to years of historical data! Each customer has a "timeline" of events (a.k.a. their lifecycle), where each event (click, purchase, browsing) itself has dozens of attributes (product, date, price, etc.). The data can be sparse and irregularly distributed; even a very experienced data scientist can get lost in this ocean of combinations!
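A customer timeline like the one described above could be represented roughly like this. It is a simplified sketch with hypothetical field names; real events carry dozens of attributes.

```python
from dataclasses import dataclass
from datetime import date

# Simplified sketch of a customer "timeline"; field names are hypothetical.
@dataclass
class Event:
    kind: str      # "click", "purchase", "browsing", ...
    when: date
    product: str
    price: float

timeline = [  # sparse and irregular, possibly spanning years of history
    Event("purchase", date(2019, 3, 2), "handbag_123", 89.0),
    Event("click", date(2020, 11, 18), "scarf_456", 0.0),
    Event("purchase", date(2021, 1, 5), "handbag_789", 120.0),
]

# One simple intent signal that could be derived from such a timeline:
handbag_purchases = [e for e in timeline
                     if e.kind == "purchase" and e.product.startswith("handbag")]
```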
Why is Artificial Intelligence the best way to predict propensity to buy?
Well, the working alternative is to use a set of business rules (or "heuristics"); for example: "women are more likely than men to buy handbags" or "a handbag purchase in the last 6 months makes you more likely to purchase a handbag again."
As a rule of thumb, if there are more than 10 heuristics involved, a machine learning solution will be more efficient.
That's the case in CRM marketing! Just imagine: the combinations of customer attributes x the number of offers x the parameters of user events over a time period. That's way too many data points, or possible heuristics, for a human to handle.
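A back-of-the-envelope calculation shows how quickly the rule count explodes. All counts below are assumptions chosen only for illustration:

```python
# Back-of-the-envelope: even modest, assumed counts of customer segments,
# catalog offers and event look-back windows yield far more candidate
# rules than a human could write and maintain by hand.
n_customer_segments = 50   # assumption: gender x age band x loyalty tier ...
n_offers = 2_000           # assumption: catalog breakouts
n_time_windows = 6         # assumption: look-back periods for user events

candidate_rules = n_customer_segments * n_offers * n_time_windows
print(candidate_rules)  # 600000 candidate heuristics, far past ~10 rules
```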
Another way to look at it is that machine learning replaces human intuition with data-driven predictions. In practice, ML approaches show greater performance, which is why we are seeing them used in more situations, regardless of the time and complexity required to make them work effectively.
We've talked about product-related datasets, but what about travel and hospitality?
We've actually designed our model to work for all CRM datasets with a large customer base and a large set of offers. Our model is not retail-centric, in the sense that it doesn't have a concept of product. Instead, it's based on the concept of user events: the model predicts the probability of an event happening based on all previous events. An event can be the purchase of a product defined by a product ID, or the purchase of a flight ticket defined by a date, an origin, and a destination.
We define the events used for optimization during setup. For each client, the feature engineering (structuring the data before it feeds the model) will be different depending on their data structure.
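As a rough sketch of what this per-client feature engineering means, here is how two different raw schemas could be mapped onto one common event format. All field names here are invented for illustration; they are not Tinyclues' real schema.

```python
# Hypothetical per-client mappings onto a shared event schema.
def retail_to_event(row: dict) -> dict:
    # A retail "offer" is simply the product ID.
    return {"kind": "purchase",
            "offer": row["product_id"],
            "when": row["order_date"]}

def travel_to_event(row: dict) -> dict:
    # A flight "offer" is defined by a date, an origin and a destination.
    offer = "{}-{}-{}".format(row["origin"], row["destination"],
                              row["departure_date"])
    return {"kind": "purchase", "offer": offer, "when": row["booking_date"]}

event = travel_to_event({"origin": "CDG", "destination": "JFK",
                         "departure_date": "2022-07-01",
                         "booking_date": "2022-05-10"})
# event["offer"] == "CDG-JFK-2022-07-01"
```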
We've seen a lot of success with this approach across all verticals: ecommerce, retail, travel, hospitality, etc.
We've been using the term artificial intelligence a lot, and it's actually a tired buzzword. Do you think that there is any such thing as real AI?
I actually don't like using the term artificial intelligence. Intelligence is solving new problems in non-standard ways, and that's not what our models do. Our models are just really good at answering very specific questions (that's called weak AI). They are way more powerful than humans in these specific areas, and modeling propensity to buy is one of them. It's the same for all so-called "intelligent" models today: they're powerful at playing Go, classifying images, and many other tasks, but I avoid calling this intelligence.
However, there is some promising research in the domain of "meta-learning": models that learn how to solve new problems (tasks). It's still early, but I'm curious to see whether there will be breakthroughs in our generation!
Can you tell us about the data science technical stack at Tinyclues?
When I started at Tinyclues there were not many machine learning frameworks available, and not much useful research in our field. So we had to start by building most of our stack in-house.
The space is constantly evolving. Over the last few years, tech giants have built powerful frameworks – building blocks for Deep Learning if you will. We followed their advancements closely in order to gauge whether or not we should migrate our stack.
Two years ago, we decided to make the switch and use the Google Cloud Platform (GCP) ecosystem for our predictive engine, along with TensorFlow as our Deep Learning framework. We're now one of the most advanced users of GCP's Vertex AI platform!
Can you describe a "Eureka" moment or breakthrough you've had?
An interesting one was 3 or 4 years ago, during our Data Science Offsite.
Until then, we had been using our own in-house deep learning framework, and we were wondering whether to switch to a modern framework like TensorFlow or Keras.
Back then, using these frameworks was complicated and costly in terms of time and capital, so we had to be really sure they would work for our specific applications before switching. And we just weren't, so we didn't know how to move forward.
While all of the senior data scientists were skeptical and heavily influenced by the way "we had always done things," it was actually our intern who was able to see the problem in a different light. She created a small, simple model that proved it would work!
This success inspired me, and I spent the next three weeks developing our actual model within these new frameworks. We realized it was a huge technical breakthrough in terms of performance, and I was able to align the roadmap and the vision to this new stack.
I guess the lesson here is: listen to the juniors on your team! They can see the future sometimes.
What are the next big milestones for the team?
Now that our main predictive engine is pretty much where we want it to be, we're going to start adding new models that address new use cases. The goal is to solve more of our clients' problems, ever more efficiently!
And of course, we can't wait to get the whole team together to celebrate our achievements as soon as we can safely do so.