Concept · May 1, 2026

How AI Works for Marketers: Understanding Machine Learning Behind Marketing Tools

Quick Answer: How AI Works in Marketing

AI in marketing combines four learning approaches: Connectionists use neural networks to recognize patterns in data like Google search algorithms. Bayesians update predictions with new evidence, powering automated bidding systems. Symbolists use logical rules, while Analogizers find similar patterns like Meta's Lookalike Audiences. Modern AI uses probabilistic computing, generating likely responses rather than exact answers. Understanding these fundamentals helps marketers work effectively with AI tools rather than expecting deterministic precision from systems designed to learn and adapt over time.

Definition

AI for marketing combines different machine learning approaches (neural networks, statistical updating, logical rules, and pattern matching) to automate decisions, optimize campaigns, and predict customer behavior through probabilistic rather than deterministic computing.

The Four Tribes of Machine Learning

As Patrick Gilbert argues in Never Always, Never Never, understanding how AI works starts with recognizing that not all machine learning operates the same way. Pedro Domingos identified four distinct "tribes" of machine learning in his 2015 book The Master Algorithm, each with different philosophies about how machines should learn. Modern marketing AI combines elements from all four approaches, making it essential for marketers to understand these fundamentals.

Connectionists learn like a brain. Neural networks strengthen connections based on feedback, adjusting millions of internal settings to recognize patterns in data. This powers Google's search results, Meta's news feed, and most image recognition systems. The key insight is that the machine doesn't need explicit rules. Instead, it discovers patterns from data through a process called backpropagation, where the network adjusts its internal weights based on whether predictions were right or wrong.
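The feedback loop described above can be sketched in miniature. This toy example trains a single "connection weight" by backpropagation-style gradient updates; it is an illustration only, since real networks like those behind Google Search adjust billions of weights at once.

```python
# Minimal sketch of learning from feedback: the "network" predicts y = w * x,
# and each wrong prediction nudges the single internal weight w.

def train(examples, learning_rate=0.1, epochs=50):
    w = 0.0  # the one internal "setting" the network adjusts
    for _ in range(epochs):
        for x, target in examples:
            prediction = w * x
            error = prediction - target    # how wrong was the prediction?
            gradient = error * x           # direction that reduces the error
            w -= learning_rate * gradient  # strengthen/weaken the connection
    return w

# Learn the pattern y = 2x purely from feedback, with no explicit rule given.
w = train([(1, 2), (2, 4), (3, 6)])
print(round(w, 2))  # converges to 2.0
```

The point mirrors the paragraph above: nobody wrote a rule saying "multiply by two." The weight discovered it from examples.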

Bayesians learn like statisticians. They start with a belief about how likely something is, then update that belief as new evidence arrives. A classic example involves predicting whether the sun will rise tomorrow. Starting with no data, you might assume 50/50 odds. After one sunrise, you update. After thousands of consecutive sunrises, your confidence approaches certainty. This foundation powers much ad platform AI, where Google and Meta's automated bidding systems use Bayesian principles to get smarter over time.

Every conversion you track feeds back into Bayesian systems, reinforcing or adjusting hypotheses about which users are likely to convert. More conversion data means stronger predictions.
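The sunrise example above corresponds to a classic Bayesian formula, Laplace's rule of succession, sketched here for illustration. Real bidding systems use far richer models, but the shape of the update is the same: every new observation shifts the belief.

```python
# Laplace's rule of succession: belief that an event will recur, given
# s successes observed out of n trials, is (s + 1) / (n + 2).

def updated_belief(successes, observations):
    return (successes + 1) / (observations + 2)

print(updated_belief(0, 0))                    # no data yet: 0.5 (50/50 odds)
print(round(updated_belief(1, 1), 2))          # after one sunrise: 0.67
print(round(updated_belief(10000, 10000), 4))  # thousands of sunrises: 0.9999
```

Swap "sunrise" for "conversion" and this is the intuition behind the paragraph above: more conversion data means the prior matters less and the evidence matters more.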

Symbolists learn like philosophers. They deduce rules from examples using logic. If Socrates is human, and all humans are mortal, then Socrates is mortal. Once the machine learns that rule, it can apply it elsewhere. Decision trees fall into this category. Unlike neural networks, symbolist systems excel at explaining their reasoning, but they need clean, structured inputs to work effectively.
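The Socrates syllogism above can be expressed as a tiny rule engine. This is a hypothetical sketch of forward chaining, the simplest symbolist technique: apply known rules to known facts until nothing new follows.

```python
# Toy symbolist reasoning: explicit if/then rules applied to a set of facts.
# The rules here are hypothetical and purely illustrative.

rules = [
    # (if this fact holds, then conclude this fact)
    ("is_human", "is_mortal"),
]

def deduce(facts, rules):
    facts = set(facts)
    changed = True
    while changed:  # keep applying rules until no new fact is derived
        changed = False
        for premise, conclusion in rules:
            if premise in facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(deduce({"is_human"}, rules))  # Socrates is human, therefore mortal
```

Notice the contrast with the neural-network sketch: here every conclusion can be traced back to an explicit rule, which is exactly why symbolist systems excel at explaining their reasoning.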

Analogizers learn like lazy students. They find the closest match to a new problem and assume the answer is the same. Meta's Lookalike Audiences exemplify analogizer logic. You provide a list of your best customers, and the system finds other users who "look like" those customers across hundreds of behavioral and demographic signals. It's not deducing rules or updating probabilities. It's finding the nearest match and betting the pattern holds.
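Analogizer logic can be sketched as nearest-neighbor matching. This is a toy stand-in for Lookalike Audiences, which compare hundreds of behavioral and demographic signals; here there are only two, and the feature names are hypothetical.

```python
# Nearest-match sketch: a new user is classified by whoever they most resemble.

import math

def looks_like_customer(user, seed_customers, non_customers):
    # Bet that the new user behaves like their closest match.
    nearest_seed = min(math.dist(user, c) for c in seed_customers)
    nearest_other = min(math.dist(user, c) for c in non_customers)
    return nearest_seed < nearest_other

seeds = [(0.9, 0.8), (0.85, 0.95)]  # e.g. (purchase frequency, engagement)
others = [(0.1, 0.2), (0.05, 0.3)]

print(looks_like_customer((0.8, 0.9), seeds, others))   # True: resembles best customers
print(looks_like_customer((0.1, 0.25), seeds, others))  # False: resembles non-buyers
```

No rule is deduced and no probability is updated; the system simply finds the closest precedent and assumes the pattern holds, which is the analogizer bet.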

How Modern AI Combines All Four Approaches

In 2017, Google researchers published "Attention Is All You Need," laying the foundation for the Transformer architecture that powers modern large language models. This breakthrough combined multiple learning approaches in ways that seemed impossible when Domingos was speculating about a unified "master algorithm."

The Transformer architecture uses connectionist neural networks as its foundation, learning through backpropagation and storing patterns in billions of connection weights. It applies analogizer principles to understand context through self-attention mechanisms that ask, "What other parts of this input are most relevant to understanding this particular word?" And it relies on Bayesian probability to generate responses, calculating the likelihood of the next word in a sequence.
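The self-attention question quoted above ("what other parts of this input matter for this word?") reduces to scoring and normalizing relevance. This toy sketch uses raw dot products and a softmax; real Transformers add learned query/key/value projections across many attention heads.

```python
# Toy self-attention: score each context word's relevance to a query word,
# then normalize the scores into weights that sum to 1.

import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    # Dot-product relevance: higher score = more relevant to this word.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

query = [1.0, 0.0]                           # embedding of the word being read
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # embeddings of the context words
weights = attention_weights(query, keys)
print([round(w, 2) for w in weights])        # most weight on similar words
```

The weights then decide how much each context word contributes to interpreting the current one, which is the "attention" in "Attention Is All You Need."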

The underlying technologies have existed for decades. It took until 2017 for engineers to creatively determine how to combine these approaches in a way that achieved something Domingos had speculated was nearly impossible just two years earlier.

Patrick Gilbert, Never Always, Never Never

Deterministic vs. Probabilistic Computing in Marketing

Understanding the difference between deterministic and probabilistic computing is crucial for working effectively with AI marketing tools. Deterministic systems follow strict rules and deliver precise outcomes every time. A calculator is deterministic: 2+2 always equals 4. When you click "Add to Cart," the JavaScript follows exact steps to update your total.

Probabilistic systems work differently. Instead of following fixed rules, they use patterns to generate likely responses without knowing whether those responses are correct. Most modern AI falls into this category, and occasional mistakes are a feature, not a bug. As Sam Tomlinson explains, "LLMs generate responses using probabilistic models, not deterministic models. These output a 'correct' or 'right' answer only in a probabilistic sense, not a binary sense."
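The contrast between the two computing styles fits in a few lines. The cart function below is deterministic in exactly the calculator sense; the word picker is a crude stand-in for how a language model samples a likely next token (the probabilities are invented for illustration).

```python
# Deterministic vs probabilistic in miniature.

import random

def add_to_cart_total(total, price):
    return total + price  # deterministic: same inputs, same output, every time

def next_word(distribution):
    # Probabilistic: pick a word in proportion to how likely it is.
    words, probs = zip(*distribution.items())
    return random.choices(words, weights=probs, k=1)[0]

print(add_to_cart_total(10, 5))  # always 15

likely = {"converts": 0.7, "bounces": 0.2, "returns": 0.1}
print(next_word(likely))  # usually "converts", but repeated calls can differ
```

Run the second function a hundred times and you get a mix of answers. That variation is the design, which is why occasional unexpected outputs from an LLM or an ad algorithm are not evidence of a broken system.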

The goal isn't to delegate entirely to AI systems. The goal is to work alongside them, understanding that probabilistic tools will sometimes produce unexpected results as part of their learning process.

Why Ad Platforms Show Strange Results

Many marketers abandon AI-driven campaign settings after seeing what look like "kooky" results. Broad Match targeting might show ads for seemingly irrelevant search terms. Your first instinct might be to conclude the system is broken. But those odd results don't necessarily mean failure. They could be part of the learning process.

The success of AI-driven advertising doesn't hinge on every individual search term being relevant. It hinges on whether the overall system delivers conversions at the efficiency you need. A few strange queries shouldn't cause panic. The system is designed to learn over time, identifying which auctions drive quality traffic and which don't.

  • AI advertising systems are probabilistic by design, not deterministic
  • Odd search terms or targeting choices may be part of the learning process
  • Overall performance matters more than individual query relevance
  • Platforms have reduced data access partly because advertisers misinterpret probabilistic results
  • Success comes from partnership with AI systems, not perfect control over them

The Shift from Deterministic to Probabilistic Advertising

Meta's evolution illustrates this industry-wide shift clearly. In the early days, Meta's advertising platform was largely deterministic. You could target users based on explicit data points with high accuracy. The system tracked behavior across websites and apps, feeding precise data back into algorithms.

Then came GDPR and Apple's App Tracking Transparency. Meta lost access to much of the granular user data it relied on. Deterministic models depend on consistent, reliable data, and that data suddenly became unavailable. Meta pivoted toward probabilistic models that make educated guesses about user behavior based on patterns and inferences.

Pulling content from anywhere on the Facebook or Instagram networks, though, is a fundamentally different problem, that requires fundamentally different approaches; these approaches are probabilistic in nature and built on machine learning.

Ben Thompson, Stratechery

For advertisers, this transition has tradeoffs. Probabilistic models allow Meta to maintain ad performance even with reduced access to individual-level data. But it also means less visibility and control over how ads are served and measured. Marketers who understand how to work within probabilistic systems will outperform those expecting deterministic precision from tools not designed to provide it.

Data Quality and Learning Systems

Analogizer systems like Lookalike Audiences are particularly susceptible to data quality issues. The learner will only be as good as the data you feed it. If your conversion data includes problematic patterns, the AI will amplify those patterns.

Consider a retailer of women's dresses where 70% of orders result in returns. If they upload all customer data to build lookalike audiences, Meta's algorithms would optimize for people likely to place many orders, which looks like high conversion rates. But the seed data is misleading: the resulting audience would be even more likely to buy and then return items, compounding the original problem and reducing profitability.

Savvy marketers are careful about the conversion data they feed into learning models. Clean data that reflects your actual business goals will produce better results than comprehensive data that includes unwanted behaviors.
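Cleaning a seed list before upload is mostly a filtering exercise. The sketch below, with hypothetical field names and an assumed return-rate threshold, keeps only customers whose behavior matches the business goal (kept purchases) rather than raw order volume.

```python
# Filter a seed audience to customers worth replicating before feeding it
# to a lookalike model. Field names and the 30% threshold are illustrative.

customers = [
    {"id": 1, "orders": 10, "returns": 8},  # high-return buyer: exclude
    {"id": 2, "orders": 5,  "returns": 0},  # profitable buyer: include
    {"id": 3, "orders": 7,  "returns": 6},  # high-return buyer: exclude
]

def clean_seed_list(customers, max_return_rate=0.3):
    return [
        c["id"] for c in customers
        if c["returns"] / c["orders"] <= max_return_rate
    ]

print(clean_seed_list(customers))  # [2]
```

In the dress-retailer scenario above, this is the difference between asking the algorithm to find "people who order a lot" and "people who order and keep what they buy."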

Key People & Works

Researchers & Authors

  • Pedro Domingos
  • Byron Sharp
  • Sam Tomlinson
  • Ben Thompson

Key Works

  • The Master Algorithm by Pedro Domingos
  • Attention Is All You Need by Google Research

Practical Applications

  • Understanding why Google Ads shows seemingly irrelevant search terms during learning phases
  • Working effectively with Meta's Lookalike Audiences by providing clean conversion data
  • Setting realistic expectations for AI-powered campaign optimization
  • Interpreting probabilistic results from marketing AI tools
  • Choosing between speed-focused and reasoning-focused AI tools for different marketing tasks

Frequently Asked Questions

What are the four main types of machine learning used in marketing AI?

The four types are Connectionists (neural networks that learn patterns like Google's search algorithm), Bayesians (statistical systems that update predictions with new data like automated bidding), Symbolists (rule-based systems that use logical deduction), and Analogizers (pattern-matching systems like Meta's Lookalike Audiences that find similar users).

Why do AI marketing tools sometimes show strange or irrelevant results?

AI marketing tools use probabilistic computing, which generates likely responses based on patterns rather than following exact rules. Strange results are often part of the learning process as the system tests different approaches to find what works best overall.

What's the difference between deterministic and probabilistic AI systems?

Deterministic systems follow strict rules and always produce the same output for the same input, like a calculator. Probabilistic systems use patterns to generate likely responses that may vary, like modern AI marketing tools that learn and adapt over time.

How did privacy changes affect AI advertising platforms?

Privacy regulations like GDPR and Apple's App Tracking Transparency forced platforms like Meta to shift from deterministic targeting based on exact user data to probabilistic models that make educated guesses about user behavior based on aggregate patterns and inferences.

Why is data quality so important for AI marketing tools?

AI systems like Lookalike Audiences will amplify whatever patterns exist in your data. If you feed them data that includes unwanted behaviors (like high return rates), the AI will optimize for more users with those same problematic characteristics.

What's the difference between reasoning AI models and standard language models?

Standard models like GPT-4 use pattern matching to predict the next word based on training data. Reasoning models work through problems step by step before generating responses, making them slower but better for complex, multi-step problems requiring logical precision.

From the Book

Chapter 26 dives deeper into the technical mechanics behind modern AI, exploring how the Transformer architecture revolutionized machine learning and why understanding probabilistic vs. deterministic computing is crucial for marketing success. Gilbert also examines reasoning models, the evolution of ad platform algorithms, and practical strategies for working effectively with AI systems.

Read the full technical breakdown in Chapter 26 of Never Always, Never Never.

