Taboola Talks Re-Architecting the Marketing Landscape with AI

TechArena Podcast hosted by Allyson Klein and Jeniece Wnorowski

Join TechArena host Allyson Klein and Jeniece Wnorowski of Solidigm as they chat with Ariel Pisetzky, Taboola's Vice President of Information Technology, about the company's content recommendation network. Learn how Taboola is reshaping the marketing landscape with AI-infused customer engagement tools that offer relevant and engaging content to users across multiple website publishers.


Audio Transcript

This transcript has been edited for clarity and conciseness.

Narrator: Welcome to The TechArena, featuring authentic discussions between tech's leading innovators and our host, Allyson Klein. Now, let's step into the arena.

Allyson Klein: Welcome to The TechArena. We have another episode of Data Insights for you today, and that means I am with Jeniece Wnorowski, my co-host from Solidigm. Welcome to the program, Jeniece.

Jeniece Wnorowski: Thank you, Allyson, it's nice to be back.

Allyson: I am so excited for this episode. It is very rare that we get to talk about solutions that actually touch the marketing spaces we work in as practitioners. So I am excited to talk to our guest and hear what he has to say about the world of data. Jeniece, do you want to introduce him and why we're talking to Ariel today?

Jeniece: Yeah, absolutely. I, too, am excited about getting to talk to somebody in our daily realm. Today, we're going to get a chance to talk to Taboola, and we're going to talk with Ariel Pisetzky, who is the VP of Information Technology and Cyber. So we're not only going to take a look at how marketing is done, but also from a technical standpoint, which is really cool and interesting. And Ariel has a tremendous background in tech and, I think, does things a little bit differently. So it'll be interesting to hear what he has to say.

Allyson: Welcome to the show, Ariel. It's nice to have you here.

Ariel: Oh, it's my pleasure. Thank you for hosting me.

Allyson: So Ariel, Taboola provides a place for advertisers to turn users into paying customers, which is such an interesting tool. I've used it in organizations that I've managed in the past. You work at scale with over 4 billion pages per day, which is amazing. Can you share a bit about your solutions and their engagement with customers in the market?

Ariel: Yes, thank you so much. So Taboola is, I'd say, a content recommendation engine, or a content recommendation network, where we are visible on multiple different publishers and we provide pointers to additional content that is relevant and engaging for users at any given moment. The idea is that when you are reading an article, or you are engaged with whatever publisher you enjoy reading, and you still have a few more minutes, you might like to read something else. You would like to engage with additional content. We provide those additional recommendations, some of them sponsored content, some of them organic content from the publisher's website itself. And those engagements are there to really enrich your browsing experience. Think of the days, years ago, when a web page would actually end: you would reach the bottom of the page and that was it. Today, with our product, the web page continues with additional content recommendations. So you can think of it as search in reverse. Instead of you looking for content that might be interesting to you, the content is seeking you out within your reading session.

Jeniece: That is super interesting, Ariel. Following up on that, can you tell us a little bit about how you're looking across organic and paid content? And why is that important?

Ariel: Sure thing. So we are an advertising platform. That means that some of the placements you will interact with on a publisher's website are paid content. They are advertising placements, placed there to create an engagement for you with content that is relevant, but that also creates revenue and monetization for the publisher. We are used to the marketing model where it's all about banners, and people really tend to ignore banners these days. This is a different piece of real estate on the publisher's website that now creates monetization options for that publisher. So you as a user get to use the publisher's services for free, and you create revenue for that publisher by interacting with additional content that is relevant for you.

So the paid content is what really generates the revenue, and the organic content is where you also get an additional experience and interaction with said publisher. So if you're on a website and you just finished reading, or reached a point where you're no longer interested in that article and want to move on, then instead of moving on randomly or, as I said earlier, searching for other content, you will get recommendations for content that is relevant for you, and you can continue engaging with the publisher's website. So the publisher gets to provide you as a user with a better user experience, the publisher gets an additional chance to monetize its traffic, and the advertiser gets a chance to tell their story and provide a valuable product for you to interact with while you are surfing the web.

Jeniece: I know you're looking across organic and paid, as we just talked about, but all of this is a huge data crunching problem, right? And it requires incredible real-time insight. Can you share a little bit more about how you're tapping the tech stack on-prem and in the cloud, everything from edge to cloud?

Ariel: Yes, so that's a good segue into how the mechanics actually work. As I said earlier, we are providing our services for upward of 4 billion web pages a day. And that means we provide a lot of recommendations. A page is not built out of one recommendation; it's built out of multiple recommendations. And those recommendations need to be relevant and specific. They need to be personalized. The content recommendation that you get, be it paid or organic, needs to be relevant for you, Jeniece, or for you, Allyson. It will be different for each and every one of you, without Taboola ever actually knowing your identity. So in contrast to other providers or ad platforms out there, where you provide a whole lot of information (you may provide your name, your political affiliation, your age, pictures, and other tidbits of information), into our platform you really do not provide any information. We cannot tell Joe from Jane or anyone else.

And we would like to provide the most relevant content recommendation while maintaining that anonymity. That means we need to sift through a lot of data. Four billion web pages is approximately 40 billion recommendations. That is a whole lot of clicks, a whole lot of log data, a whole lot of information that we need to constantly train our model on to make sure we provide the best personalized experience for each and every user, every time they interact with the publisher and our product, and provide the right content recommendation at a given moment. For example, take a web page that is talking about any topic you can think of: that topic might be positive, might be negative, might be political, might be sports related, might be parenting advice, anything. We at Taboola need to understand the context of that web page, and the context of the overall session you are now experiencing, the journey you have within the publisher, to provide you with the most relevant piece of content. People often think that if you're on, let's say, a fashion website, or a fashion web page within a larger publisher that has multiple sections, then, oh, sure, just throw additional fashion articles in there. That's what this person is actually interested in. But that really is not the case.

You might be interested in reading one or two articles about fashion, but now you're interested in economics, or in science, or in some other type of content, because we are not one-dimensional readers interested in only one type of content and that's it. We really are more complex. So what we need to do is sift through all of this data, train our model constantly, continuously, in real time, and do it very fast, so it also evolves at the speed of news as that news comes across the internet, and provide the relevant recommendations. There might be huge news events, there might be other events taking place, and publishers would like to have relevant advertising and organic content near those articles, whatever they are. So that's the top-level picture of the amount of data. As for on-prem and cloud, we operate mostly on-prem. We have a cloud presence, but that cloud presence is there for relevant services only. And the idea of being on-prem is really about managing costs, owning the full tech stack, and having the ability to go deep and utilize every type of technology that we have to the fullest. So if we're talking about storage, or CPUs, or networking, each one of these components has its own optimization path and its own utilization path. Think of a cloud instance of a database, let's say. How many cloud instances are out there, with reasonable pricing, on which you can get 40,000, 50,000, or 100,000 IOPS at any given moment, continuously, month over month? And if you need that at scale, and with 4 billion web pages this is exactly the scale problem, you need those IOPS. And you need to really control the stack from end to end, to control the costs, and to provide the performance that my R&D engineering departments require and my users rely on to get personalized content within the timeframe of a web search.
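To put those figures in rough perspective, here is a back-of-the-envelope sketch of the request rates they imply. The 4 billion pages and roughly 40 billion recommendations per day are Ariel's numbers; the recommendations-per-page and peak-to-average factors below are illustrative assumptions.

```python
# Back-of-the-envelope scale math from the figures quoted above.
# RECS_PER_PAGE and the 2x peak factor are assumptions for illustration.

PAGES_PER_DAY = 4_000_000_000      # ~4 billion pages served per day
RECS_PER_PAGE = 10                 # 4B pages -> ~40B recommendations
SECONDS_PER_DAY = 24 * 60 * 60

avg_pages_per_sec = PAGES_PER_DAY / SECONDS_PER_DAY
avg_recs_per_sec = avg_pages_per_sec * RECS_PER_PAGE
peak_recs_per_sec = avg_recs_per_sec * 2  # assumed peak-to-average ratio

print(f"average pages/sec:     {avg_pages_per_sec:,.0f}")   # ~46,296
print(f"average recs/sec:      {avg_recs_per_sec:,.0f}")    # ~462,963
print(f"assumed peak recs/sec: {peak_recs_per_sec:,.0f}")   # ~925,926
```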

Jeniece: So let's take it back a little bit further and talk about how you use AI to provide context to your recommendations.

Ariel: Yes, that's a great question. There are multiple types of AI, from the generative AI that everyone's been talking about recently to more specific AI types that we have been running at Taboola for the past five or six years. And looking back, it all started from the need to understand language: what is happening on a given web page? So that is natural language processing, NLP. And that, I would say, is today, especially with LLMs, the basic layer of our AI stack. From there, it moves on to training models of what is actually interesting to the different users at any given moment, per publisher. Then there is real-time inferencing, which means we look at every request coming in. And we might have thousands, tens of thousands, or even more items we could show you, Jeniece, or you, Allyson, on any given web page. But we need to infer what the most relevant piece of content is for you at this specific moment. So now there is inferencing happening in real time. Now, let's put training aside for a moment, because training is a big process running on multiple grid computing solutions with a whole lot of storage.

And let's just think about the inferencing. The inferencing is a real-time problem. It has time allocated to it. It needs to start and finish within milliseconds so that your web surfing experience is continuous and you get the content recommendation as soon as you load the page. To do that, we need to get the information from the trained models, but then also pull the data for your specific session per inferencing request, sift through the candidate recommendations that could be there for you, match them with the other signals we have, and provide you with the best content recommendation at any given time. So those are, I'd say, the large-scale items of training and inferencing that happen within Taboola. And then, of course, there is the generative AI, where we provide images and ideas for subject lines on different content that is out there. So again, we need to look at a whole lot of data about what actually works, what creates a good CTR, a good click-through rate, for a given publisher, and then recommend that to the publisher or the advertiser, providing assistance for the users of our system so that they can run the best ad campaign possible and receive the best return on advertising spend.
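The conversation doesn't describe Taboola's actual serving code, but the latency-budget idea Ariel outlines can be sketched in a few lines. Everything below (the 50 ms budget, score_candidates, fallback_recommendations) is hypothetical and for illustration only: score within the budget, otherwise fall back to a cheap default rather than stall the page.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

BUDGET_SECONDS = 0.050  # assumed 50 ms budget for the whole inference step

def score_candidates(session, candidates):
    # Stand-in for real-time inference: rank candidates for this session.
    return sorted(candidates, key=lambda c: c["score"], reverse=True)

def fallback_recommendations(candidates, k):
    # If the budget is blown, serve a default list instead of stalling the page.
    return candidates[:k]

def recommend(session, candidates, k=6):
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(score_candidates, session, candidates)
    try:
        result = future.result(timeout=BUDGET_SECONDS)[:k]
    except TimeoutError:
        result = fallback_recommendations(candidates, k)
    pool.shutdown(wait=False)  # don't block the response on the slow path
    return result

candidates = [{"id": i, "score": i % 7} for i in range(10_000)]
print(recommend({"user": "anonymous"}, candidates, k=2))
```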

Allyson: Now, you have described something that processes a tremendous amount of data, and the AI model training obviously has some serious compute requirements behind it. Can you take us under the hood of your tech stack and talk about how you've had to change the infrastructure in your private cloud to drive the performance needed to deliver this kind of insight in real time?

Ariel: Yes. Over time, what we saw was that when we think of our private cloud in terms of storage, compute, and networking, we can really optimize each layer and rely on the different layers to keep optimizing each other. I'll give a fun example. When you are in a private cloud, you control those three vectors, and you have data sitting on drives, one of the most basic things you can do to save storage is to compress the data. Compressing the data costs CPU time. It saves on storage, but it costs CPU time. It also saves on networking. If your CPU is optimized and your networking is optimized, maybe you don't want to compress the data at all. Yes, you might be spending more on storage, but you will be gaining so much in terms of performance and time, or you will be gaining compute power that you now do not need to buy. So, thinking of that for a moment, I can rely on my storage provider, in this case, of course, Solidigm, to give me reliable read throughput that is fast enough that I can read the data uncompressed. I can send it over my network uncompressed. And therefore I can save on CPU compute and use that compute time to do better and more in-depth inferencing, which is actually where my business is. So controlling my full stack and understanding my business needs pushes IT to adjust for the business, and not the business to adjust for IT. And that line of thought is what brought us to look at local storage, and at storage solutions where we control everything from the physical drives themselves, the geometry on the drives, and everything to do with a drive, up to the software running the storage solution or the direct-attached storage, be it internal drives on a server or anything along those lines.
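Ariel's compression tradeoff can be made concrete with a toy cost model. All of the throughput numbers below are made-up assumptions, not Taboola figures; the point is only the shape of the tradeoff: with fast enough drives and network, skipping compression frees CPU time for inference.

```python
# Toy model of the compress-vs-don't-compress tradeoff described above.
# Every constant here is an illustrative assumption, not a Taboola figure.

DATA_GB           = 100.0  # data a job needs to read
COMPRESSION_RATIO = 3.0    # compressed data is one third the size
DISK_READ_GBPS    = 6.0    # assumed local NVMe read throughput
NETWORK_GBPS      = 12.5   # assumed ~100 Gbit/s link
DECOMPRESS_GBPS   = 1.5    # assumed CPU decompression throughput

def seconds_uncompressed(data_gb: float) -> float:
    # Pay full storage and network volume, but no CPU decompression step.
    return data_gb / DISK_READ_GBPS + data_gb / NETWORK_GBPS

def seconds_compressed(data_gb: float) -> float:
    # Move a third of the bytes, then pay CPU time to expand the full volume.
    on_wire = data_gb / COMPRESSION_RATIO
    return (on_wire / DISK_READ_GBPS
            + on_wire / NETWORK_GBPS
            + data_gb / DECOMPRESS_GBPS)

print(f"uncompressed path: {seconds_uncompressed(DATA_GB):.1f} s")  # ~24.7 s
print(f"compressed path:   {seconds_compressed(DATA_GB):.1f} s")    # ~74.9 s
```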

Allyson: Now, I know that this is a topic that is becoming something more important across IT organizations, which is the sustainability of the infrastructure when it comes to AI. How does Taboola approach efficiency?

Ariel: Oh, wonderful. Yes, sustainability is important not only for our children, but also for IT itself. If we are not sustainable, it means we will not be able to grow. And the best example I can give here is power usage and, of course, cooling. When you look at data center space, in many places it is running out. Power for those data centers is running low. And cooling, which uses a lot of the power, is a real problem, because you need to cool bigger, hotter CPUs or GPUs, you need to cool additional components, and you end up spending a lot of your total data center power budget on cooling instead of on the IT itself. So when we look at our tech stack, be it storage, compute, or networking, we try to see what actual power gains we get from the different hardware providers, and from the different models within a specific provider. Do we get better performance for the same power draw? Because many times you might say, "Oh, this server is way stronger than another server." But then, "Yeah, but it also consumes double the amount of power." If it's giving you double the output for double the power, then you're not actually gaining anything in terms of sustainability. You're not improving in terms of efficiency. You're just running a hotter server, paying for additional components, and then cooling a hotter server, and so on and so forth.

So we look at everything from memory to storage units to the drives themselves, to see what their power footprint is, what their cooling footprint is, how we actually cool them, how many fans and what fan speeds are necessary, how hot we can run them, and how efficiently they still perform under harsher conditions. Can we increase the temperature within the data center? Can we be more lenient on the actual cooling, so that we save on power spent just for the sake of cooling, instead of having penguins surf the data center, this extremely cold data center that really doesn't help the servers but just feels right to some people? So that is just one facet of sustainability. Of course, there is also the question of how long a piece of equipment can stay installed and still work within a data center. The longer hardware remains usable and efficient within a data center, the less e-waste is generated, and again, the better your sustainability footprint. And this also extends, of course, to the provider's commitment to shipping the hardware efficiently, with the least amount of packaging and the most recyclable materials possible. So those are just a few words on sustainability, and we can, of course, talk about it more if you're interested.
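Ariel's "stronger server" caveat is really a performance-per-watt comparison. A minimal sketch, with entirely hypothetical servers and figures:

```python
# Compare servers on performance per watt, not raw performance.
# The servers and numbers below are hypothetical.

servers = {
    "server_a": {"throughput": 1.0, "watts": 500},   # baseline
    "server_b": {"throughput": 2.0, "watts": 1000},  # "way stronger", same efficiency
    "server_c": {"throughput": 1.8, "watts": 600},   # the actual efficiency win
}

for name, s in servers.items():
    perf_per_kw = s["throughput"] / s["watts"] * 1000
    print(f"{name}: {perf_per_kw:.1f} units of work per kW")
# server_b doubles the output AND the power draw, so nothing is gained;
# server_c delivers 80% more work for only 20% more power.
```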

Jeniece: Yeah, I know we should definitely touch on this a little bit more, but one thing I want to ask is: as the world is moving to GPUs, with the GPU becoming the de facto standard for the large compute node, how does this impact the data center and the storage solutions in place?

Ariel: So I'll start with sustainability on that and continue into storage. GPUs are an interesting way to provide additional compute within the same power allotment. For us at Taboola, for example, we are using GPUs to run data crunching, specifically Spark jobs. And that means we can now use a single server instead of approximately five or six previous-generation servers. So a single server, or even better than a single server, a single GPU, now takes the place of five other servers. That's the sustainability part, power and footprint. Now, the storage part of this means that every node is much hungrier for data, because each node now represents five traditional nodes. So I have this single GPU, and in a single server I might have four GPUs, just as an example, [which] are now consuming data at the pace of 20 former-generation servers. So I need to feed these GPUs much faster, both on the storage side and on the network side. On the storage side, if the storage is local, or on the storage and network side, if the storage is remote, I need to split the storage between immediate local scratch space and central storage, where all of the data is written, and balance the needs between those. In any case, both can and should be optimized for the arrival of GPUs in the world of compute, for everything from data crunching to generative AI to training models, all of it running on GPUs and all of it, of course, consuming a whole lot of data.
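The consolidation ratios Ariel quotes translate directly into feeding-rate requirements. A quick sketch, where the per-node read rate is an assumed figure (only the 5:1 replacement ratio and the 4-GPUs-per-server example come from the conversation):

```python
# Feeding-rate math for the GPU consolidation described above.
# OLD_NODE_READ_GBPS is an assumption; the ratios are from the conversation.

OLD_SERVERS_PER_GPU = 5     # one GPU replaces ~5 previous-generation servers
GPUS_PER_SERVER     = 4     # example GPU count per box
OLD_NODE_READ_GBPS  = 1.0   # assumed data consumption of one old node

gpu_read_gbps    = OLD_SERVERS_PER_GPU * OLD_NODE_READ_GBPS
server_read_gbps = gpu_read_gbps * GPUS_PER_SERVER

print(f"one GPU must be fed at ~{gpu_read_gbps:.0f} GB/s")      # 5 GB/s
print(f"one 4-GPU server needs ~{server_read_gbps:.0f} GB/s")   # 20 GB/s
# The storage and network path behind one box must now sustain what used
# to be spread across 20 servers: hence the split between fast local
# scratch space and central storage.
```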

Allyson: Now, I know that you have a relationship with Solidigm. Can you talk a little bit about how you look at SSDs as part of that tech stack and why the right storage is so critical to this environment?

Ariel: Yes. When you're running a private cloud, when you are the cloud for your customers, you want to make sure you have availability, sustainability, and a good, relevant SLA with the lowest total cost of ownership possible. That means that, once components are installed, you want to visit them as few times as possible.

And when I talk about visiting physical components, I, of course, mean techs on site: people who come to the data center to fix something or replace something. And what we have found over the years is that the reliability levels of Solidigm drives are much higher than those of other hardware providers we have used. And the working relationship with Solidigm has brought us to a place where we install the different drives in the different locations and we kind of forget about them. Of course, when I say forget, I mean we never visit them again. We don't need techs to reach them, because they so rarely give us any problems. So choosing your hardware provider really comes down to these considerations: over time, what components would you like to see in your data center? How long do you want to use those components? How rarely can you get away with fixing them, replacing them, and moving them around, so that you get the lowest total cost of ownership over time? So it's not just the purchase price; it's the actual operating cost of that hardware over time. There is also the management of that hardware, the firmware and software that come with it, so you can get statistics, monitoring, and good observability into what the drives are doing: where they are running with extremely high IOPS, or where you can optimize queue sizes based on the I/O stats of the different systems, and so on. So there's always this continuous process of evaluating: do we still want to use a specific component? Is this component working for us? And with Solidigm, it has been working for us, and it is working for us continuously.
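Ariel doesn't name the tooling behind that observability, but on Linux the raw per-drive counters he alludes to (IOPS, throughput, queue depth) are exposed in /proc/diskstats. A minimal sketch of reading them, assuming a Linux host with an NVMe drive named nvme0n1:

```python
# Read per-device I/O counters from /proc/diskstats (Linux only).
# Field layout per the kernel's Documentation/admin-guide/iostats.rst.

def read_diskstats(device: str = "nvme0n1") -> dict:
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return {
                    "reads_completed":  int(fields[3]),
                    "sectors_read":     int(fields[5]),
                    "writes_completed": int(fields[7]),
                    "sectors_written":  int(fields[9]),
                    "ios_in_progress":  int(fields[11]),  # current queue depth
                }
    raise ValueError(f"device {device} not found in /proc/diskstats")

print(read_diskstats())
# Sampling these counters at an interval yields IOPS and throughput;
# a fleet-wide collector built on numbers like these is how you spot
# hot drives and tune queue sizes, as described above.
```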

Jeniece: Well, Ariel, gosh, thank you so very much. Your expertise in this space never ceases to blow my mind, and neither does the amount of knowledge you've shared here with folks today. We can't thank you enough. But I know folks are going to want to learn more about the solutions you talked about today. Where can they go to learn more and engage with you and your team?

Ariel: Oh, that's a wonderful notion. Yes, we are more than happy to engage with other like-minded techies out there. You can find us on LinkedIn, where we have multiple articles, and on our engineering blog on the Taboola website. Those, I'd say, are the best sources of information, where we talk about data center management, total cost of ownership, management software, managing large-scale IT operations, and so much more around those topics.

Allyson: Thank you so much, Ariel, for being on the program. I love this conversation, and I know that our audience will too.

Ariel: Thank you very much.

Allyson: And Jeniece, that wraps another episode of Data Insights. I can't wait to see where we go next.

Jeniece: Amazing. Thank you, Allyson. Thank you, Ariel.

Narrator: Thanks for joining The TechArena. Subscribe and engage at our website, thetecharena.net.

All content is copyrighted by The TechArena.

Used with permission.