From Rules to Relationships: A Product Manager’s Guide to Machine Learning (Part I)

Why ML exists

For decades, software worked the same way: a human wrote the rules, and the computer followed them. That works well when the world is predictable. It breaks down when the world is messy.

If you want to detect a hamburger in an image, a rules-based approach might describe one explicitly: two buns, a patty, maybe lettuce. But hamburgers come in endless variations, and so do the conditions: different lighting, different angles, messy packaging, multiple objects in frame. You can't write enough rules to cover reality.

Machine learning solves this by flipping the process. Instead of writing rules and applying them to data, you provide examples with known answers and let the system infer the rules itself.

Traditional programming: rules first, data second. Machine learning: data first, rules inferred.

You don't tell the system what a hamburger is. You show it thousands of images labeled "hamburger" and "not hamburger," and it learns to distinguish them.
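A minimal sketch of that flip, using a made-up single-number "feature" as a stand-in for real image features (no actual vision model here). The point is that the decision rule, a threshold, is inferred from the labeled examples rather than written by hand:

```python
# Hypothetical toy: labeled examples, one feature per "image".
examples = [
    (0.9, "hamburger"), (0.8, "hamburger"), (0.85, "hamburger"),
    (0.2, "not hamburger"), (0.1, "not hamburger"), (0.3, "not hamburger"),
]

def class_mean(label):
    vals = [x for x, y in examples if y == label]
    return sum(vals) / len(vals)

# The "rule" is inferred from the data: the midpoint between class averages.
threshold = (class_mean("hamburger") + class_mean("not hamburger")) / 2

def predict(x):
    return "hamburger" if x > threshold else "not hamburger"

print(predict(0.7))  # hamburger
```

Nobody wrote "a hamburger scores above 0.525"; that boundary came out of the labeled data.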

What a model actually is

A model is a mathematical approximation of how inputs relate to outputs. A simple way to express this:

y = f(x) + ε

  • x (features): the inputs you feed the model. Example: square footage, neighborhood, number of bedrooms.

  • y (target): what you want to predict. Example: sale price.

  • ε (error): everything the model can't capture. Measurement errors, missing variables, randomness, the irreducible complexity of reality.

That error term matters. No model is a crystal ball. Some uncertainty is always irreducible, which is why model outputs in products should inform decisions, not replace the judgment layer entirely.
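A toy illustration of the formula, with an invented pricing relationship: even if the model captured f perfectly, the noise term ε would remain.

```python
import random

random.seed(0)

# Hypothetical example: price driven by square footage plus noise.
# f(x) = 200 * sqft is the "true" relationship; epsilon is everything else.
def true_f(sqft):
    return 200 * sqft

sqfts = [800, 1200, 1500]
prices = []
for sqft in sqfts:
    epsilon = random.gauss(0, 10_000)  # unmodeled factors: condition, timing, luck
    prices.append(true_f(sqft) + epsilon)

# Even a perfect model of f cannot predict epsilon; its best guess is f(x).
residuals = [p - true_f(s) for p, s in zip(prices, sqfts)]
print(residuals)  # nonzero: the irreducible part of the error
```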

To build a model, you typically define four things:

Features: which inputs represent the problem, and how you encode them. "Country" could be a category, a vector, or a learned embedding. That choice affects what the model can learn.

Algorithm family: the general form of the model. Linear regression, decision trees, gradient boosting, neural networks. Each makes different assumptions about how inputs relate to outputs.

Hyperparameters: the configuration knobs that control complexity and behavior. Tree depth, learning rate, regularization strength, number of layers. These are set before training, not learned from data.

Loss function: a measure of how wrong the model is. Training is the process of adjusting parameters to minimize loss on historical data. Defining loss is defining what "good" means for this specific problem.
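The four choices above can be sketched together in a few lines. This is a deliberately minimal, hypothetical setup: one feature (square footage), the simplest algorithm family (a line through the origin), one hyperparameter (the learning rate, fixed before training), and mean squared error as the loss.

```python
# Hypothetical sketch: fit price ≈ w * sqft by minimizing squared-error loss
# with plain gradient descent.
sqft = [1.0, 2.0, 3.0]         # features (thousands of sq ft)
price = [210.0, 395.0, 610.0]  # targets (thousands of dollars)

w = 0.0                # parameter: learned from the data
learning_rate = 0.05   # hyperparameter: set before training, by us

for step in range(200):
    # Gradient of the loss (mean squared error) with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(sqft, price)) / len(sqft)
    w -= learning_rate * grad  # adjust the parameter to reduce the loss

print(round(w, 1))  # 202.1: the slope the data implies
```

Swap the loss function and you change what "good" means; swap the learning rate and you change how (or whether) training converges. Those are product-relevant choices, not just engineering details.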

Data: what models learn from

Before a model can learn, you need data: observations that can be represented numerically. Almost anything can be turned into numbers: text becomes sequences of tokens or embeddings, images become pixel values, audio becomes waveforms or spectral representations, video becomes sequences of frames plus time structure.
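A rough sketch of "everything becomes numbers", using deliberately crude stand-ins (real systems use learned tokenizers and embeddings, not character codes):

```python
# Text -> a sequence of numbers (crude char-level tokenization for illustration).
text = "not spam"
tokens = [ord(c) for c in text]

# An image pixel -> three intensity values (RGB).
pixel = (128, 64, 255)

# A category -> a one-hot vector over the known categories.
country = "DE"
countries = ["DE", "FR", "US"]
one_hot = [1 if c == country else 0 for c in countries]

print(tokens[:3], pixel, one_hot)
```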

Inside most organizations, data falls into two broad categories.

Structured data lives in relational databases or spreadsheets. It has rows and columns. Examples: transactions, subscriptions, CRM records, inventory tables. Easy to query, aggregate, and join.

Unstructured data doesn't follow a fixed schema. Examples: emails, support tickets, call recordings, PDFs, images, videos. Harder to search and analyze without specialized processing. Semi-structured sources like logs or JSON blobs sit in between: technically formatted, but inconsistent or difficult to use without engineering work.
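A small sketch of that semi-structured middle ground, assuming a made-up purchase log line: the JSON parses fine, but a field still needs cleanup before it behaves like a structured column.

```python
import json

# Hypothetical log line: "technically formatted", but not yet table-ready.
log_line = '{"event": "purchase", "user": "u42", "amount": "19.99"}'

record = json.loads(log_line)               # parse the JSON blob
record["amount"] = float(record["amount"])  # coerce a string field to a number

# Now it can sit alongside other transactions in a structured table.
print(record["user"], record["amount"])
```

Multiply that cleanup by inconsistent field names, missing keys, and schema drift across services, and you get the "engineering work" the paragraph above refers to.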

Two other data relationships matter for modeling:

Temporal structure: events have order, and time gaps matter. Fraud detection, forecasting, and user behavior analysis all depend on sequence. You can't treat timestamped data as if order is irrelevant.

Spatial structure: nearby pixels or locations tend to be related. Images, maps, and sensor data all have this property, and models for these tasks are designed to exploit it.

And not all numeric values are the same:

Continuous values can take any value within a range: temperature, height, time elapsed.

Discrete values come in countable steps: number of purchases, number of sessions, age in years.

These distinctions shape which models work well and how you should evaluate them.

The three types of ML

Supervised learning is the most common. You have labeled examples: you know the right answer for each training observation, and the model learns to predict it.

  • Regression: predicting a number. Revenue, delivery time, lifetime value.

  • Classification: predicting a category. Spam vs. not spam, churn risk tiers, fraud vs. legitimate.

Supervised learning aligns well with business outcomes because there's usually something you can label or infer from history.

Unsupervised learning has no labels. The model looks for structure in the data without being told what to find.

  • Clustering: group similar users or items. Customer segments based on behavior.

  • Anomaly detection: find unusual patterns. Unusual transactions, spikes in system behavior, rare events.

Unsupervised learning is often a discovery tool. It helps you ask better questions before you build a predictive system.
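One of the simplest unsupervised techniques, sketched with invented transaction amounts: flag anything more than two standard deviations from the mean (a z-score cutoff). Real anomaly detectors are far more sophisticated, but the idea is the same: structure found without anyone labeling what "unusual" means.

```python
import statistics

# Hypothetical transaction amounts; no labels anywhere.
amounts = [12.0, 15.0, 11.0, 14.0, 13.0, 250.0, 12.5, 13.5]

mean = statistics.mean(amounts)
stdev = statistics.stdev(amounts)

# Flag values more than 2 standard deviations from the mean.
anomalies = [a for a in amounts if abs(a - mean) / stdev > 2]
print(anomalies)  # [250.0]
```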

Reinforcement learning is different in kind. A system learns by taking actions in an environment and receiving feedback: rewards or penalties. It's used to train agents that play games, optimize decisions over time, or adapt behavior based on outcomes. Powerful in product contexts, but harder to deploy responsibly because the system changes its own behavior as it learns, and feedback loops can create unintended effects.

What ML is actually good for in products

Three categories where ML consistently adds value:

Automation: routing support tickets, extracting information from documents, transcribing and summarizing calls, detecting policy violations or spam.

Prediction: forecasting demand, detecting fraud risk, predicting churn or conversion likelihood, estimating delivery times or incident probability.

Personalization: recommendations and ranking, search relevance, tailored onboarding flows, content prioritization and notifications.

The key word is lever, not magic. ML creates value when it's paired with a product system that can act on predictions safely and measurably.

Failure modes worth knowing

Data quality is a gating factor. If your labels are noisy, inconsistent, or sparse, the system learns the wrong thing. Many ML projects fail not because the modeling was wrong but because the data pipeline was.

Correlation is not causation. Models detect patterns. They don't understand why patterns exist. A model might learn that ice cream sales correlate with drowning incidents, not because one causes the other, but because both rise in summer. For product decisions this matters: predicting something is not the same as knowing how to change it. Accurate prediction does not automatically tell you how to intervene.

Context is brittle. Models can struggle with sarcasm, humor, shifting language, changing user intent, and novel situations. Even strong models fail in ways that feel obvious to a human.

Explanations are limited. Some model families are interpretable; many are not. Even when you can provide feature importance scores, it's often a partial story. If your product requires a defensible "why" for every decision, model choice matters.

The world changes. User behavior shifts, markets evolve, policy changes. A model trained on last year's data can degrade silently. ML features are living systems that require monitoring and maintenance, not one-time builds.

Questions to be able to answer before starting

Before committing to an ML-powered feature, these are the questions that surface misalignment early:

  • What exactly are you predicting or deciding? "Churn" is not a target until it's measurable: "user has not completed an action within 30 days" or "subscription canceled within 14 days."

  • Do you have historical labels? Supervised learning needs past outcomes. If you don't have them, you may need to create them, infer them, or start with an unsupervised approach.

  • Where does your core signal live? If it's in unstructured text, audio, or video, do you have the tools and access rights to process it?

  • Is the data time-sensitive? If behavior changes over time, you can't randomly split the data for training and evaluation. You risk training on the future without noticing.

  • Is this a classification or regression problem? These imply different success metrics and different modeling choices. Getting this wrong creates confusion in evaluation and stakeholder expectations.
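The first and fourth questions above can be made concrete in a few lines. Everything here is invented (the user records, the dates, the 30-day definition); the point is that "churn" becomes a computable label, and the train/test split respects time instead of shuffling randomly.

```python
from datetime import date, timedelta

today = date(2024, 6, 30)
users = [
    {"id": "a", "last_active": date(2024, 6, 20), "signup": date(2024, 1, 5)},
    {"id": "b", "last_active": date(2024, 5, 1),  "signup": date(2024, 2, 10)},
    {"id": "c", "last_active": date(2024, 6, 28), "signup": date(2024, 3, 15)},
]

# Measurable label: "no activity within the last 30 days" counts as churned.
for u in users:
    u["churned"] = (today - u["last_active"]) > timedelta(days=30)

# Chronological split by signup date, not a random shuffle: the model is
# evaluated on users it could not have seen during training.
cutoff = date(2024, 3, 1)
train = [u for u in users if u["signup"] < cutoff]
test = [u for u in users if u["signup"] >= cutoff]

print([u["id"] for u in train], [u["id"] for u in test])
```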

Product Intelligence Atlas

Applied thinking on product and AI, from someone doing the work.

I started the Atlas as a place to put things I didn't want to lose. Notes from courses, prompts that actually worked, observations from client work that felt worth writing down. It grew from there. Now it's where I think through AI and product management in public: what I'm learning, what I'm building, what I think is worth paying attention to.


Let's talk product

Maxime John · AI-fluent PM · Based in Germany, relocating to Portland, OR

Open to PM roles at US companies, remote now and on-site in Portland, OR from Q4 2026.

Job conversations, project ideas, and good product discussions all welcome.

