Inside Datadog's AI Strategy with Chief Scientist Ameet Talwalkar

Today, I'm delighted to bring you an interview with Ameet Talwalkar, the Chief Scientist at Datadog.

Most readers will be familiar with Datadog (I've deployed it myself!), but it's worth a quick primer just in case: the company offers one of the leading observability and security platforms for cloud applications. Its SaaS platform unifies infrastructure monitoring, application performance monitoring, log management, user experience monitoring, cloud security and more, all in one place.

So the platform needs little introduction; if anything, it's one that many of the most forward-thinking organisations already had in place before the recent AI 'Cambrian moment'.

I thought it would be absolutely fascinating to understand how the company thinks about AI and how its approach has been changing, especially in response to the latest trends (e.g. check out their LLM Observability system).

Who better to talk with than the company's Chief Scientist?

Ok, let's get started.

Over to Ameet - my questions are in bold:


Who are you, and what's your background?

My name is Ameet Talwalkar, I'm the Chief Scientist at Datadog, the observability and security platform for cloud applications. I lead Datadog AI Research, the company's new AI research lab, where we're currently working on new AI models, agents and other innovations to improve observability, cloud and security monitoring.

Before joining Datadog, I was a Venture Partner at Amplify Partners, and prior to that Co-Founder and Chief Scientist at Determined AI, where we created an open-source deep learning training platform that was acquired by Hewlett Packard Enterprise in 2021.

I'm also an associate professor in the Machine Learning Department at Carnegie Mellon University. I hold a PhD in computer science with a focus on machine learning (ML).

I've always had a passion for AI/ML research, which led me to academia, as well as the practical application of new theories and technologies, which led to my professional pursuits. I'm fortunate that during my career I have been able to straddle both academia and industry.

What are your general responsibilities?

My main focus is on building out an AI research lab. It's a young lab and given the importance of attracting top-tier talent, I spend a lot of my time on recruiting, from sourcing to phone screens and in-person interviews. I am also responsible for setting and executing the technical vision of the lab. I closely collaborate with engineering and product teams to first identify moonshot research problems of strategic importance to Datadog and ultimately to achieve translational impact.

I also work directly with other members of our research team to tackle cutting-edge research problems and share our results both internally and externally, in the form of research papers and open-source contributions.

Can you give us an overview of how you're using AI today?

Datadog's platform scans everything from metrics to logs and APM data, across cloud and IT infrastructure. We've amassed a huge amount of data that we believe is vital for informing AI models and training AI agents to improve observability. Consequently, we are taking a specialized approach to AI. The models and agents we're developing are tailored to our domain-specific problems and trained on our data, while also being significantly smaller (and thus cheaper) than general-purpose LLMs like GPT-4.

For instance, we recently released Toto, the first open-source foundation model focused on observability, and BOOM, the largest public benchmark of observability metrics, which captures the full scale of production telemetry issues. Toto demonstrates state-of-the-art performance on BOOM, and also achieves strong performance on various time series forecasting benchmarks. Internally we're exploring the use of Toto for various forecasting and anomaly detection tasks, and externally we've seen tremendous interest in these open artifacts, including over 30K downloads of our Toto model on HuggingFace in the past month.
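Toto's actual API isn't covered in this interview, but the general pattern behind forecast-based anomaly detection can be sketched with a deliberately naive stand-in: forecast each point from recent history, then flag points whose residual is a statistical outlier. The moving-average forecaster and z-score threshold below are illustrative assumptions, not Datadog's method.

```python
import statistics

def flag_anomalies(series, window=5, z_threshold=3.0):
    """Flag indices whose residual against a moving-average forecast is an outlier."""
    residuals = []
    for i in range(window, len(series)):
        forecast = sum(series[i - window:i]) / window  # naive baseline forecast
        residuals.append(series[i] - forecast)
    mu = statistics.mean(residuals)
    sigma = statistics.stdev(residuals)
    return [i for i, r in enumerate(residuals, start=window)
            if sigma and abs(r - mu) / sigma > z_threshold]

# A flat metric with one spike: only the spike index should be flagged.
metric = [10.0] * 20 + [100.0] + [10.0] * 20
print(flag_anomalies(metric))  # → [20]
```

In a real deployment, the moving-average baseline would be replaced by a learned forecaster such as a time series foundation model; the residual-thresholding logic stays the same.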

Tell us about your investment in AI. What's your approach?

Datadog is investing heavily in AI on various fronts, and in the specific context of our research lab, we are actively building out a team of leading AI researchers who share the vision of working on challenging AI problems directly grounded in real-world challenges that Datadog's customers face.

We're focusing on a small number of moonshot challenges, collaborating closely with product and engineering teams to both identify problems and facilitate translational impact, and making use of massive quantities of real observability data.

The combination of data, facilities, and a team of dedicated researchers has led to the development of our state-of-the-art time series foundation model (Toto) and is currently driving our development of domain-specific AI agents. One of the most exciting aspects of my role and working in the lab is our ability to operate on a scale that's not really possible at either a startup or in an academic environment.

What prompted you to explore AI solutions? What specific problems were you trying to solve?

I have been an AI researcher for roughly 20 years, and the emergence of ChatGPT and general-purpose LLMs a few years ago fundamentally transformed our field. It's truly astonishing what frontier models like GPT-4 are capable of, but they're general-purpose tools.

I'm of the opinion that in order to solve challenging, domain-specific problems we'll need to strike a balance between relying on these general-purpose models and developing specialized tools. My own interests for the past few years have focused on developing specialized models and agents, and I'm ultimately driven by the goal of tying AI advances back to concrete downstream use cases.

In a seemingly one-off conversation in early 2024 with my friend and co-founder of Datadog, Alexis Lê-Quôc, we chatted about our perspectives on AI and noted a remarkable overlap in our interests and viewpoints. Fast forward a year, and I joined Datadog to build out an AI research lab!

Our team is currently working on a handful of ambitious research areas grounded in real-world challenges in cloud observability and security:

  1. Observability Foundation Models for forecasting, anomaly detection, and multi-modal telemetry analysis (logs, metrics, traces, etc.).
  2. Site Reliability Engineering (SRE) Agents to detect, diagnose, and resolve production incidents.
  3. Production Code Repair Agents that leverage code, logs, and runtime data to identify and fix performance issues.

Across all of these areas, Datadog is well positioned to have an outsized impact and to move up the technology stack in developing specialized AI solutions.

Who are the primary users of your AI systems, and what's your measurement of success? Have you encountered any unexpected use cases or benefits?

In the context of our research lab, we have internal and external "users." Internally, we aim to build models and agents that enable engineering and product teams to provide fundamentally new product capabilities and/or augment existing AI features. Externally, given that our lab is committed to an open-science model of research, we aim to meaningfully contribute new ideas and artifacts to the broader scientific research community.

More broadly at Datadog, our AI solutions are used by a variety of IT, cloud, DevOps and cybersecurity professionals working at every level of the technology stack. In addition to our AI Agents, designed to help DevOps teams accelerate incident management and root cause analysis, we recently launched Bits AI Security Analyst, which autonomously investigates potential threats while helping teams mitigate risks more quickly and accurately. We also recently launched a suite of new LLM Observability capabilities to help businesses better understand, optimize and scale their AI investments.

What has been your biggest learning or pivot moment in your AI journey?

The field of ML/AI has experienced two seismic events in my 20-year career: the rise of deep learning in 2012 and of LLMs a decade later. Both of these events, and especially the recent advances with LLMs, have forced AI researchers (and really all technologists) to "adapt or die."

After the release of ChatGPT, I personally spent countless hours over several months fundamentally reassessing what problems to work on and how to define success/impact for myself. My introspection led me to shift focus to specialized AI, and ultimately to Datadog, in order to work on exciting research problems and crucially to be able to concretely measure impact.

Our lab's focus on developing specialized AI models and AI agents is motivated by concrete real-world problems. Key to this has been access to domain-specific expertise and data that we can use to build accurate and reliable models, along with close interactions with engineering and product teams to facilitate and measure downstream impact.

How do you address ethical considerations and responsible AI use in your organisation?

This is an increasingly important topic, and while I can only scratch the surface with an answer here, I'll briefly highlight two key points.

First, we take data privacy extremely seriously. While our lab focuses on leveraging our massive collection of data to train new domain-specific models and agents, we don't use customer data in any of our open-source releases, but rather we rely on our own data (i.e., data collected by using Datadog to observe Datadog).

Second, an important implication of our focus on developing domain-specific AI models is that they are orders of magnitude smaller, and thus have a significantly lower energy footprint when deployed in production.

What skills or capabilities are you currently building in your team to prepare for the next phase of AI development?

The lab employs a team of AI researchers and engineers across a range of levels, from new PhD graduates to senior leaders. We can draw from a global pool of resources and tailor the team and expertise to suit specific projects and initiatives.

If you had a magic wand, what one thing would you change about current AI technology, regulation or adoption patterns?

I would like to change the narrative around the technology to be less hyperbolic and more pragmatic both in terms of its promise and risks. I think that more efforts on application-specific AI, like what we're doing at Datadog, could help with this.

What is your advice for other senior leaders evaluating their approach to using and implementing AI? What's one thing you wish you had known before starting your AI journey?

Focus on benchmarks! One of my mentors during my post-doc, the Turing Award-winning researcher David Patterson, would often note that "Benchmarks, for better or worse, define a field." He was largely drawing on his experience in computer architecture when making this comment, but it is equally true in AI.

Many of the major advances in AI over the last several decades have been empirically driven, grounded in strong computer vision and natural language processing benchmarks.

Moving forward, as organizations in all domains lean into AI, it is of fundamental importance to guide this process with strong, domain-specific benchmarks. These benchmarks can be hard to create, but without them, organizations will be flying blind!
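To make the "benchmarks define a field" point concrete, a domain-specific benchmark can be as simple as a fixed set of held-out tasks, a scoring metric, and a leaderboard over candidate models. The sketch below uses hypothetical forecasting tasks and two naive baseline models; real observability benchmarks like BOOM are vastly larger, but the shape is the same.

```python
def mae(y_true, y_pred):
    """Mean absolute error: the benchmark's scoring metric (lower is better)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def last_value_forecast(history, horizon):
    """Naive baseline: repeat the last observed value."""
    return [history[-1]] * horizon

def mean_forecast(history, horizon):
    """Another baseline: repeat the historical mean."""
    return [sum(history) / len(history)] * horizon

def run_benchmark(models, tasks, horizon=3):
    """Score every model on every (history, future) task; return average MAE per model."""
    scores = {}
    for name, model in models.items():
        errs = [mae(future[:horizon], model(history, horizon))
                for history, future in tasks]
        scores[name] = sum(errs) / len(errs)
    return scores

# Hypothetical held-out series standing in for real telemetry.
tasks = [
    ([1, 2, 3, 4, 5], [6, 7, 8]),
    ([10, 10, 10, 10], [10, 10, 10]),
]
print(run_benchmark({"last_value": last_value_forecast,
                     "mean": mean_forecast}, tasks))
# → {'last_value': 1.0, 'mean': 2.0}
```

The design choice that matters is holding the tasks and metric fixed while models vary: that is what lets a field (or an organization) measure progress instead of flying blind.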

What AI tools or platforms do you personally use beyond your professional use cases?

I'm pretty simple with my AI tool use, though increasingly reliant on it. My favorite AI tool is ChatGPT in voice mode. I'm amazed by its voice recognition capabilities - I can interact with it like I'm talking to another person! - and I view it as a jack-of-all-trades that can give me solid advice on almost any question I have.

What's the most impressive new AI product or service you've seen recently?

While there are a bunch of cool prototypes coming out daily, I see the true impact from AI so far as coming from only a small handful of verticals (search, chatbots, and coding). The tools in these verticals, e.g., ChatGPT, Perplexity, Claude, Cursor, are all truly amazing.

BUT, I see a surprising dearth of impact in other verticals, and this gap between hype and true impact is what is motivating my work on specialized AI and our efforts at Datadog.

Finally, let's talk predictions. What trends do you think are going to define the next 12-18 months in the AI technology sector, particularly for your industry?

Our field is moving so fast, and it's hard to even guess what will happen in 3 months, let alone in 12-18 months. With that said, here are some predictions:

  • Domain-specific benchmarking will become increasingly important, as both companies and academics increasingly focus on "AI for X."
  • We'll see a proliferation of specialized AI models trained on domain-specific data. These models will dominate domain-specific benchmarks while also being cheaper at inference time than frontier models. Relatedly, domain-specific data will increasingly be understood as the moat for developing domain-specific AI solutions.
  • The companies developing general-purpose frontier models will begin to target specific verticals and develop their own specialized approaches.
  • M&A will continue to increase as acquirers look for top-tier AI talent and startups struggle with product market fit (in part due to lack of data).

Ameet, thank you so much for taking the time to participate. That was absolutely brilliant.

Read more about Ameet on LinkedIn and find out more about Datadog at www.datadoghq.com.