From Rules to Relationships: A Product Manager’s Guide to Machine Learning (Part I)

Why ML exists

For decades, software worked the same way: a human wrote the rules, and the computer followed them. That works well when the world is predictable. It breaks down when the world is messy.

If you want to detect a hamburger in an image, a rules-based approach might describe one explicitly: two buns, a patty, maybe lettuce. But hamburgers come in endless variations, and so do the conditions: different lighting, different angles, messy packaging, multiple objects in frame. You can't write enough rules to cover reality.

Machine learning solves this by flipping the process. Instead of writing rules and applying them to data, you provide examples with known answers and let the system infer the rules itself.

Traditional programming: rules first, data second. Machine learning: data first, rules inferred.

You don't tell the system what a hamburger is. You show it thousands of images labeled "hamburger" and "not hamburger," and it learns to distinguish them.
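A minimal sketch of that flip, using a made-up single-number "feature" as a stand-in for real image features (no actual vision model here). The point is that the decision rule, a threshold, is inferred from the labeled examples rather than written by hand:

```python
# Hypothetical toy: labeled examples, one feature per "image".
examples = [
    (0.9, "hamburger"), (0.8, "hamburger"), (0.85, "hamburger"),
    (0.2, "not hamburger"), (0.1, "not hamburger"), (0.3, "not hamburger"),
]

def class_mean(label):
    vals = [x for x, y in examples if y == label]
    return sum(vals) / len(vals)

# The "rule" is inferred from the data: the midpoint between class averages.
threshold = (class_mean("hamburger") + class_mean("not hamburger")) / 2

def predict(x):
    return "hamburger" if x > threshold else "not hamburger"

print(predict(0.7))  # hamburger
```

Nobody wrote "a hamburger scores above 0.525"; that boundary came out of the labeled data.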

What a model actually is

A model is a mathematical approximation of how inputs relate to outputs. A simple way to express this:

y = f(x) + ε

  • x (features): the inputs you feed the model. Example: square footage, neighborhood, number of bedrooms.

  • y (target): what you want to predict. Example: sale price.

  • ε (error): everything the model can't capture. Measurement errors, missing variables, randomness, the irreducible complexity of reality.

That error term matters. No model is a crystal ball. Some uncertainty is always irreducible, which is why model outputs in products should inform decisions, not replace the judgment layer entirely.
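A toy illustration of the formula, with an invented pricing relationship: even if the model captured f perfectly, the noise term ε would remain.

```python
import random

random.seed(0)

# Hypothetical example: price driven by square footage plus noise.
# f(x) = 200 * sqft is the "true" relationship; epsilon is everything else.
def true_f(sqft):
    return 200 * sqft

sqfts = [800, 1200, 1500]
prices = []
for sqft in sqfts:
    epsilon = random.gauss(0, 10_000)  # unmodeled factors: condition, timing, luck
    prices.append(true_f(sqft) + epsilon)

# Even a perfect model of f cannot predict epsilon; its best guess is f(x).
residuals = [p - true_f(s) for p, s in zip(prices, sqfts)]
print(residuals)  # nonzero: the irreducible part of the error
```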

To build a model, you typically define four things:

Features: which inputs represent the problem, and how you encode them. "Country" could be a category, a vector, or a learned embedding. That choice affects what the model can learn.

Algorithm family: the general form of the model. Linear regression, decision trees, gradient boosting, neural networks. Each makes different assumptions about how inputs relate to outputs.

Hyperparameters: the configuration knobs that control complexity and behavior. Tree depth, learning rate, regularization strength, number of layers. These are set before training, not learned from data.

Loss function: a measure of how wrong the model is. Training is the process of adjusting parameters to minimize loss on historical data. Defining loss is defining what "good" means for this specific problem.
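The four choices above can be sketched together in a few lines. This is a deliberately minimal, hypothetical setup: one feature (square footage), the simplest algorithm family (a line through the origin), one hyperparameter (the learning rate, fixed before training), and mean squared error as the loss.

```python
# Hypothetical sketch: fit price ≈ w * sqft by minimizing squared-error loss
# with plain gradient descent.
sqft = [1.0, 2.0, 3.0]         # features (thousands of sq ft)
price = [210.0, 395.0, 610.0]  # targets (thousands of dollars)

w = 0.0                # parameter: learned from the data
learning_rate = 0.05   # hyperparameter: set before training, by us

for step in range(200):
    # Gradient of the loss (mean squared error) with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(sqft, price)) / len(sqft)
    w -= learning_rate * grad  # adjust the parameter to reduce the loss

print(round(w, 1))  # 202.1: the slope the data implies
```

Swap the loss function and you change what "good" means; swap the learning rate and you change how (or whether) training converges. Those are product-relevant choices, not just engineering details.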

Data: what models learn from

Before a model can learn, you need data: observations that can be represented numerically. Almost anything can be turned into numbers: text becomes sequences of tokens or embeddings, images become pixel values, audio becomes waveforms or spectral representations, video becomes sequences of frames plus time structure.
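A rough sketch of "everything becomes numbers", using deliberately crude stand-ins (real systems use learned tokenizers and embeddings, not character codes):

```python
# Text -> a sequence of numbers (crude char-level tokenization for illustration).
text = "not spam"
tokens = [ord(c) for c in text]

# An image pixel -> three intensity values (RGB).
pixel = (128, 64, 255)

# A category -> a one-hot vector over the known categories.
country = "DE"
countries = ["DE", "FR", "US"]
one_hot = [1 if c == country else 0 for c in countries]

print(tokens[:3], pixel, one_hot)
```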

Inside most organizations, data falls into two broad categories.

Structured data lives in relational databases or spreadsheets. It has rows and columns. Examples: transactions, subscriptions, CRM records, inventory tables. Easy to query, aggregate, and join.

Unstructured data doesn't follow a fixed schema. Examples: emails, support tickets, call recordings, PDFs, images, videos. Harder to search and analyze without specialized processing. Semi-structured sources like logs or JSON blobs sit in between: technically formatted, but inconsistent or difficult to use without engineering work.
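A small sketch of that semi-structured middle ground, assuming a made-up purchase log line: the JSON parses fine, but a field still needs cleanup before it behaves like a structured column.

```python
import json

# Hypothetical log line: "technically formatted", but not yet table-ready.
log_line = '{"event": "purchase", "user": "u42", "amount": "19.99"}'

record = json.loads(log_line)               # parse the JSON blob
record["amount"] = float(record["amount"])  # coerce a string field to a number

# Now it can sit alongside other transactions in a structured table.
print(record["user"], record["amount"])
```

Multiply that cleanup by inconsistent field names, missing keys, and schema drift across services, and you get the "engineering work" the paragraph above refers to.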

Two other data relationships matter for modeling:

Temporal structure: events have order, and time gaps matter. Fraud detection, forecasting, and user behavior analysis all depend on sequence. You can't treat timestamped data as if order is irrelevant.

Spatial structure: nearby pixels or locations tend to be related. Images, maps, and sensor data all have this property, and models for these tasks are designed to exploit it.

And not all numeric values are the same:

Continuous values can take any value within a range: temperature, height, time elapsed.

Discrete values come in countable steps: number of purchases, number of sessions, age in years.

These distinctions shape which models work well and how you should evaluate them.

The three types of ML

Supervised learning is the most common. You have labeled examples: you know the right answer for each training observation, and the model learns to predict it.

  • Regression: predicting a number. Revenue, delivery time, lifetime value.

  • Classification: predicting a category. Spam vs. not spam, churn risk tiers, fraud vs. legitimate.

Supervised learning aligns well with business outcomes because there's usually something you can label or infer from history.

Unsupervised learning has no labels. The model looks for structure in the data without being told what to find.

  • Clustering: group similar users or items. Customer segments based on behavior.

  • Anomaly detection: find unusual patterns. Unusual transactions, spikes in system behavior, rare events.

Unsupervised learning is often a discovery tool. It helps you ask better questions before you build a predictive system.
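One of the simplest unsupervised techniques, sketched with invented transaction amounts: flag anything more than two standard deviations from the mean (a z-score cutoff). Real anomaly detectors are far more sophisticated, but the idea is the same: structure found without anyone labeling what "unusual" means.

```python
import statistics

# Hypothetical transaction amounts; no labels anywhere.
amounts = [12.0, 15.0, 11.0, 14.0, 13.0, 250.0, 12.5, 13.5]

mean = statistics.mean(amounts)
stdev = statistics.stdev(amounts)

# Flag values more than 2 standard deviations from the mean.
anomalies = [a for a in amounts if abs(a - mean) / stdev > 2]
print(anomalies)  # [250.0]
```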

Reinforcement learning is different in kind. A system learns by taking actions in an environment and receiving feedback: rewards or penalties. It's used to train agents that play games, optimize decisions over time, or adapt behavior based on outcomes. Powerful in product contexts, but harder to deploy responsibly because the system changes its own behavior as it learns, and feedback loops can create unintended effects.

What ML is actually good for in products

Three categories where ML consistently adds value:

Automation: routing support tickets, extracting information from documents, transcribing and summarizing calls, detecting policy violations or spam.

Prediction: forecasting demand, detecting fraud risk, predicting churn or conversion likelihood, estimating delivery times or incident probability.

Personalization: recommendations and ranking, search relevance, tailored onboarding flows, content prioritization and notifications.

The key word is lever, not magic. ML creates value when it's paired with a product system that can act on predictions safely and measurably.

Failure modes worth knowing

Data quality is a gating factor. If your labels are noisy, inconsistent, or sparse, the system learns the wrong thing. Many ML projects fail not because the modeling was wrong but because the data pipeline was.

Correlation is not causation. Models detect patterns. They don't understand why patterns exist. A model might learn that ice cream sales correlate with drowning incidents, not because one causes the other, but because both rise in summer. For product decisions this matters: predicting something is not the same as knowing how to change it. Accurate prediction does not automatically tell you how to intervene.

Context is brittle. Models can struggle with sarcasm, humor, shifting language, changing user intent, and novel situations. Even strong models fail in ways that feel obvious to a human.

Explanations are limited. Some model families are interpretable; many are not. Even when you can provide feature importance scores, it's often a partial story. If your product requires a defensible "why" for every decision, model choice matters.

The world changes. User behavior shifts, markets evolve, policy changes. A model trained on last year's data can degrade silently. ML features are living systems that require monitoring and maintenance, not one-time builds.

Questions to be able to answer before starting

Before committing to an ML-powered feature, these are the questions that surface misalignment early:

  • What exactly are you predicting or deciding? "Churn" is not a target until it's measurable: "user has not completed an action within 30 days" or "subscription canceled within 14 days."

  • Do you have historical labels? Supervised learning needs past outcomes. If you don't have them, you may need to create them, infer them, or start with an unsupervised approach.

  • Where does your core signal live? If it's in unstructured text, audio, or video, do you have the tools and access rights to process it?

  • Is the data time-sensitive? If behavior changes over time, you can't randomly split the data for training and evaluation. You risk training on the future without noticing.

  • Is this a classification or regression problem? These imply different success metrics and different modeling choices. Getting this wrong creates confusion in evaluation and stakeholder expectations.
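The first and fourth questions above can be made concrete in a few lines. Everything here is invented (the user records, the dates, the 30-day definition); the point is that "churn" becomes a computable label, and the train/test split respects time instead of shuffling randomly.

```python
from datetime import date, timedelta

today = date(2024, 6, 30)
users = [
    {"id": "a", "last_active": date(2024, 6, 20), "signup": date(2024, 1, 5)},
    {"id": "b", "last_active": date(2024, 5, 1),  "signup": date(2024, 2, 10)},
    {"id": "c", "last_active": date(2024, 6, 28), "signup": date(2024, 3, 15)},
]

# Measurable label: "no activity within the last 30 days" counts as churned.
for u in users:
    u["churned"] = (today - u["last_active"]) > timedelta(days=30)

# Chronological split by signup date, not a random shuffle: the model is
# evaluated on users it could not have seen during training.
cutoff = date(2024, 3, 1)
train = [u for u in users if u["signup"] < cutoff]
test = [u for u in users if u["signup"] >= cutoff]

print([u["id"] for u in train], [u["id"] for u in test])
```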

Product Intelligence Atlas

Applied thinking on product and AI, from someone doing the work.

I started the Atlas as a place to put things I didn't want to lose. Notes from courses, prompts that actually worked, observations from client work that felt worth writing down. It grew from there. Now it's where I think through AI and product management in public: what I'm learning, what I'm building, what I think is worth paying attention to.


Let's talk product

Maxime John · AI-fluent PM · Based in Germany, relocating to Portland, OR

Open to PM roles at US companies, remote now and on-site in Portland, OR from Q4 2026.

Job conversations, project ideas, and good product discussions all welcome.

