What I Learned in My Machine Learning Class at Wake Forest University
I am 57 years old. My last university experience was a forgettable GPA at Ohio State. When I enrolled in AIN 720 at Wake Forest University's School of Business, I was nervous enough that I was embarrassed to submit my transcripts. I finished with a 99.7%.
Here is what happened in between.
The Big Idea
Machine learning sounds like something out of a science fiction movie. It is not. At its core, it is a structured way to use data to find patterns, make predictions, and answer business questions that most organizations are currently just guessing at.
Think of it this way. Every time Netflix recommends a show you end up loving, that is machine learning. Every time your credit card company flags a suspicious charge before you even notice it, that is machine learning. Every time Spotify builds a playlist that feels like it was made specifically for you, that is machine learning.
The computer is not psychic. It is just very good at finding patterns in data that humans would never have the time or patience to find on their own.
The course covered the full process from start to finish: how to build a model, how to evaluate it, how to tune it, how to explain it to a room full of non-technical people, and how to keep it running in the real world long after the initial excitement wears off.
The most important thing I learned was this: a model is only useful if it solves a real problem, can be explained, can be tested, and can be trusted. A model that nobody can understand is a model that nobody will use.
The Project: A Fictional Consulting Engagement
The class used a real dataset but framed it as a consulting engagement for H-E-B, the Texas grocery chain. To be clear, H-E-B was not actually involved. The company name was used to make the project feel like a real business problem rather than an abstract homework assignment.
The question we were trying to answer was: which customers are most likely to buy organic products?
The dataset contained 22,223 loyalty cardholder records with 19 pieces of information about each customer: things like age, spending habits, gender, and how long they had been a member. The target variable — the thing we were trying to predict — was simple: did this customer buy organic products or not?
All of the modeling and coding was done in Google Colab using Google Gemini AI as a coding assistant. Every model, every chart, and every result came from actual code running on actual data. The analysis, interpretation, business framing, and presentation were my own work throughout.
Milestone 1: Logistic Regression
The first model was called logistic regression. Do not let the name scare you.
Here is the plain English version. Logistic regression is the tool you use when you are trying to predict a yes or no answer. Will this customer buy? Will this loan default? Will this patient respond to treatment? The model looks at all the available information and says: based on what I know about this person, I think there is a 73% chance they will buy. You then decide what to do with that probability.
It is like asking a very experienced colleague who has seen thousands of similar situations to give you their honest read on whether a deal is going to close.
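To make the "73% chance" idea concrete, here is a minimal sketch of how a logistic regression turns customer information into a probability. It is not the code from the course; the function name, the weights, and the intercept are all made up for illustration. A real model learns its weights from the data.

```python
import math

def predict_probability(features, weights, intercept):
    """Return the logistic-regression probability of a 'yes' outcome."""
    # Weighted sum of the inputs (often called the log-odds score).
    score = intercept + sum(weights[name] * value for name, value in features.items())
    # The logistic (sigmoid) function squeezes any score into the range 0 to 1.
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical weights, invented for this example (not the course model's values).
weights = {"age": -0.05, "affluence": 0.25, "gender_unknown": -1.5}
intercept = 0.2

customer = {"age": 35, "affluence": 12, "gender_unknown": 0}
probability = predict_probability(customer, weights, intercept)
print(f"Estimated chance of buying organic: {probability:.0%}")
```

The output is a probability, not a decision. Where you draw the line (50%? 30%?) is a business choice about how many false alarms you are willing to tolerate.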
The model achieved 81.1% accuracy. To put that in perspective, if you simply answered "no" for every one of the 22,223 customers, you would be right about 75% of the time, because 75% of customers never bought organic at all. That shortcut never identifies a single buyer. Getting to 81.1% with a model that actually finds the real buyers is meaningful.
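The baseline comparison is easy to check for yourself. The sketch below uses 100 synthetic labels that mirror the roughly 75/25 split described above; they are stand-ins, not the course data.

```python
def accuracy(predictions, actuals):
    """Share of predictions that match the actual outcomes."""
    correct = sum(1 for p, a in zip(predictions, actuals) if p == a)
    return correct / len(actuals)

# Synthetic labels: 1 = bought organic, 0 = did not (illustrative only).
actuals = [0] * 75 + [1] * 25

# The "always say no" baseline predicts 0 for every customer.
baseline_predictions = [0] * len(actuals)
baseline_accuracy = accuracy(baseline_predictions, actuals)

# Count how many real buyers the baseline actually identifies.
buyers_found = sum(
    1 for p, a in zip(baseline_predictions, actuals) if p == 1 and a == 1
)

print(f"Baseline accuracy: {baseline_accuracy:.0%}, buyers found: {buyers_found}")
```

This is why accuracy alone can mislead when one outcome is much more common than the other. A model has to beat the lazy baseline and find the buyers to earn its keep.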
But the accuracy number was not even the most interesting finding.
The single strongest predictor of organic buying behavior was Gender Unknown. Not how much the customer spent. Not how long they had been a loyalty member. Gender. Customers whose gender was not on file were dramatically less likely to buy organic.
And here is the finding that should make every marketer uncomfortable: loyalty tier and total spending had almost zero predictive value. The customers spending the most money in the store were not the ones buying organic. The biggest spenders were essentially irrelevant to the organic buying question.
How this applies to sales leadership:
In almost every sales organization I have ever worked in, we ranked prospects by company size or contract value. Biggest potential deal gets the most attention. This model taught me that biggest is not always most likely. Logistic regression applied to your CRM data — that is your customer relationship management system, the database where all your sales activity lives — could tell you which prospects are actually most likely to close. The answer might be completely different from what your gut is telling you.
Milestone 2: Classification Tree
The second model was a classification tree. This one is actually the easiest to understand because it works exactly like a flowchart.
Imagine you are trying to figure out who in your school is most likely to come to a Friday night concert. You would not just randomly invite everyone. You would ask a series of questions:
Do they like this type of music? If yes, keep going. If no, probably not coming.
Have they come to something like this before? If yes, very likely. If no, keep asking.
Do they live nearby? And so on.
A classification tree does exactly that with data. It finds the best questions to ask and the best order to ask them in, automatically, based on what actually predicts the outcome.
The very first question my model asked was about age: is this customer under 45? That single question split the entire dataset into two meaningfully different groups. Younger customers were dramatically more likely to buy organic. Everything else flowed from there.
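How does a tree decide that an age cutoff is the best first question? One common approach is to try every candidate cutoff and keep the one that leaves the two resulting groups as "pure" as possible, measured by Gini impurity. The sketch below assumes that approach and uses ten made-up customers; it is not the course code, and the course tool may have used a different criterion.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 = pure, 0.5 = evenly mixed)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def split_impurity(rows, threshold):
    """Weighted impurity after asking 'is age under threshold?'."""
    left = [label for age, label in rows if age < threshold]
    right = [label for age, label in rows if age >= threshold]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_threshold(rows):
    """Try the midpoints between neighboring ages; return the cleanest split."""
    ages = sorted({age for age, _ in rows})
    candidates = [(a + b) / 2 for a, b in zip(ages, ages[1:])]
    return min(candidates, key=lambda t: split_impurity(rows, t))

# Made-up (age, bought_organic) pairs for illustration.
rows = [(25, 1), (31, 1), (38, 0), (42, 1), (44, 1),
        (47, 0), (53, 0), (60, 1), (66, 0), (71, 0)]

threshold = best_threshold(rows)
print(f"Best first question: is age under {threshold}?")
```

A full tree repeats the same search inside each resulting group, across every variable, until the groups stop improving.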
The model achieved 81.5% accuracy. But the bigger win was what happened when I started trimming the tree.
The original dataset had 19 variables. When I stripped the model down to just 4, accuracy did not drop. It held. Four variables explained almost everything the model needed to know. The technical term for this is parsimony, which just means: use the simplest explanation that works. Do not add complexity unless it actually helps.
The four variables that mattered were Age, Affluence score, Gender Unknown, and Gender Male. Fifteen other variables contributed almost nothing. They were noise.
How this applies to sales leadership:
A classification tree built on your pipeline data could show you the exact decision path that leads to a closed deal. Is it the number of stakeholders involved in the first meeting? Whether a champion was identified in the first two weeks? Whether a technical evaluation happened before legal got involved? The model finds the path. You build the playbook around it and stop wasting time on activities that do not actually move the needle.
Milestone 3: Random Forest and Gradient Boosting
By Milestone 3 we were in the advanced section of the menu.
Random Forest is exactly what it sounds like. Instead of building one decision tree, you build 200 of them. Each tree gets trained on a random slice of the data and a random selection of the variables. Then all 200 trees vote on the answer and you go with the majority.
Why does this work? Because any one tree might make a mistake. But 200 trees making independent decisions and voting together are much harder to fool. It is the same reason a jury of twelve is more reliable than a single judge having a bad day.
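The voting idea can be sketched in plain Python. This is a deliberately tiny illustration, not the course code: each "tree" here asks only one question (a stump), the data is made up, and I use 25 trees instead of 200 to keep it quick. All function names are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, feature):
    """One-question tree: pick the cutoff on `feature` with the fewest training errors."""
    best = None
    for threshold in sorted({row[feature] for row, _ in rows}):
        for below_label in (0, 1):
            errors = sum(
                1 for row, label in rows
                if (below_label if row[feature] < threshold else 1 - below_label) != label
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below_label)
    _, threshold, below_label = best
    return lambda row: below_label if row[feature] < threshold else 1 - below_label

def train_forest(rows, n_trees, rng):
    """Each tree sees a random resample of the rows and one randomly chosen variable."""
    features = list(rows[0][0].keys())
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # sampling with replacement
        forest.append(train_stump(sample, rng.choice(features)))
    return forest

def forest_predict(forest, row):
    """Majority vote across all trees."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]

# Made-up customers: (features, bought_organic).
rows = [
    ({"age": 25, "affluence": 14}, 1), ({"age": 31, "affluence": 12}, 1),
    ({"age": 38, "affluence": 15}, 1), ({"age": 42, "affluence": 11}, 1),
    ({"age": 44, "affluence": 13}, 1), ({"age": 47, "affluence": 6}, 0),
    ({"age": 53, "affluence": 8}, 0), ({"age": 60, "affluence": 5}, 0),
    ({"age": 66, "affluence": 9}, 0), ({"age": 71, "affluence": 4}, 0),
]

forest = train_forest(rows, n_trees=25, rng=random.Random(42))
young_affluent = forest_predict(forest, {"age": 20, "affluence": 20})
older_low = forest_predict(forest, {"age": 80, "affluence": 1})
print(young_affluent, older_low)
```

Any single stump trained on an unlucky resample can be wrong. The vote is what makes the ensemble steady.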
My Random Forest confirmed something remarkable. All 200 trees, trained on different slices of data, independently ranked the same four variables at the top of their lists. Age. Affluence. Gender Unknown. Gender Male. Every single time. That is not luck. That is signal.
Gradient Boosting works differently. Instead of building 200 trees at once and having them vote, it builds trees one at a time. The first tree makes predictions. The second tree looks at where the first tree made mistakes and tries to fix them. The third tree goes after whatever errors remain after the first two. And so on. Each tree in the chain learns from the errors left by the ones before it.
Think of it like a relay race where each runner studies the mistakes of the previous runner before they take the baton.
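Here is a stripped-down sketch of that "fix the previous mistakes" loop. It is a simplification under stated assumptions: it uses one variable (age), one-question trees, and squared error, whereas a real gradient boosting classifier typically optimizes a different loss. The data and function names are invented for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single cutoff on x that best predicts the current errors."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean

def boost(xs, ys, n_rounds, learning_rate):
    """Build stumps one at a time, each trained on the errors left so far."""
    baseline = sum(ys) / len(ys)
    predictions = [baseline] * len(ys)
    stumps, history = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
        # Track the mean squared error after each round.
        history.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))
    return baseline, stumps, history

# Made-up ages and outcomes (1 = bought organic).
ages = [25, 31, 38, 42, 44, 47, 53, 60, 66, 71]
bought = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

baseline, stumps, history = boost(ages, bought, n_rounds=10, learning_rate=0.5)
print([round(err, 3) for err in history])
```

The printed error shrinks round after round. The learning rate controls how big a correction each new tree is allowed to make, which is one of the knobs you adjust during tuning.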
The final tuned Gradient Boosting model achieved 82.1% accuracy, with 99.2% of its predictive power coming from those same four variables. The other 15 variables contributed less than one percent combined.
I also evaluated neural networks, which are the most famous and glamorous type of machine learning model. You have probably heard of them. They power ChatGPT, image recognition, and self-driving cars. They are genuinely powerful.
I rejected them anyway.
They offered no meaningful accuracy improvement over Gradient Boosting on this dataset, and their results were essentially impossible to explain to a non-technical business audience. A neural network does not readily tell you why it made a prediction. It just makes it. And if you cannot explain why a customer was flagged, a marketing team will never trust the model enough to act on it. Gradient Boosting won because it could be explained in plain language to anyone in the room.
How this applies to sales leadership:
Churn prediction is the most obvious application here. In a SaaS sales organization, a Gradient Boosting model trained on renewal data could identify at-risk customers weeks or months before they send the cancellation email. More importantly it tells you which variables are driving the risk. Is it low product usage? A recent support escalation? A change in the economic buyer? That is the difference between knowing a customer is unhappy and knowing exactly what to do about it.
Milestone 4: Customer Segmentation
The fourth milestone used a completely different type of machine learning called clustering, specifically a technique called k-means.
Here is the key difference between what happened in Milestones 1 through 3 and what happened in Milestone 4.
In the first three milestones, the model already knew what it was looking for. We told it: predict whether a customer buys organic or not. That is called supervised learning, because the model is being supervised with the right answer already known.
In Milestone 4, the model was given data and told: find the natural groups. No right answer provided. No categories pre-defined. Just find whatever structure is hiding in the data. That is called unsupervised learning, because nobody is supervising it with the correct labels.
Think of it like dumping a giant pile of mixed candy on a table and telling someone to sort it however makes sense to them. Nobody tells them the categories. They just start grouping: chocolate here, sour there, gummies in the corner, hard candy by the wall. The groups emerge from the candy itself, not from instructions.
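The sorting process itself is short enough to sketch. What follows is a bare-bones k-means in plain Python on six made-up customers described by (age, affluence). It is an illustration, not the course code, and it skips a step real projects usually need: putting the variables on a comparable scale first, since k-means groups by distance.

```python
import math
import random

def kmeans(points, k, rng, n_iterations=20):
    """Plain k-means: assign each point to its nearest center, then move each center."""
    centers = rng.sample(points, k)  # start from k randomly chosen points
    assignments = [0] * len(points)
    for _ in range(n_iterations):
        # Step 1: each point joins the group of its closest center.
        assignments = [
            min(range(k), key=lambda c: math.dist(point, centers[c]))
            for point in points
        ]
        # Step 2: each center moves to the average of its group.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old center if a group ends up empty
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centers, assignments

# Made-up customers as (age, affluence) pairs.
points = [(25, 14), (28, 15), (31, 13), (62, 5), (66, 4), (70, 6)]

centers, assignments = kmeans(points, k=2, rng=random.Random(0))
print(assignments)
```

Notice that no labels are passed in anywhere. The algorithm is told how many groups to find (k) and nothing about what the groups should mean. Naming and interpreting them is the human's job.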
I ran the clustering twice, and the difference between the two runs is one of the most important lessons in the entire project.
The first time I used all 19 variables. The result was five customer groups that were basically identical. Buy rates between 21% and 29% across the board. No meaningful differences. The algorithm was picking up on promotional spending history and loyalty tenure, not on organic buying behavior. Technically a result. Completely useless for marketing purposes.
So I went back to the four variables that every previous model had already confirmed mattered: Age, Affluence, Gender Unknown, and Gender Male. I re-ran the clustering on those four variables alone.
The result was completely different. Five genuinely distinct customer segments emerged, with buy rates ranging from 7% all the way to 65%. I gave each segment a name and built a persona around them so a marketing team could actually use the findings.
How this applies to sales leadership:
Every sales organization has its own version of these five segments sitting inside the CRM right now. You have your Wendys: customers who are already all in and just need to be protected and grown. You have your Briannas: prospects who are interested but need one good reason to commit. You have your Susans: accounts that have been around forever and are just waiting for someone to show up with the right offer. You have your Mikes: accounts where there is some potential but you have to be selective about where you invest. And you have your Walters and Bettys: accounts that are consuming sales resources without any realistic path to growth. Clustering tells you which bucket every account belongs in so you stop treating all of them the same way.
Milestone 5: The Final Presentation
The final milestone pulled everything together into an executive presentation for a non-technical business audience. No data science jargon. No confusion matrices. Just the business story the data was telling and what to do about it.
The central insight was about growth strategy. 75.2% of loyalty cardholders had never bought a single organic product. The instinct most marketing teams would have is to market harder to the people already buying. But the math told a different story.
Converting just 5% of non-buyers per quarter, while simultaneously running basket-deepening campaigns among existing buyers in Wendy and Brianna's segments, could move the organic buy rate from 24.8% to nearly 40% within four quarters. The model did not just identify who to target. It showed what consistent targeting was worth over time, quarter by quarter, compounding.
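The quarter-by-quarter arithmetic can be checked in a few lines. This sketch rests on assumptions of mine, not details from the presentation: no existing buyer lapses, and "5% of non-buyers" is read two ways, either 5% of whoever is still a non-buyer each quarter or 5% of the original non-buyer pool each quarter. Under both readings the buy rate lands just under 40% after four quarters.

```python
def project_buy_rate(start_rate, conversion_rate, quarters, of_remaining=True):
    """Project the organic buy rate, returning one value per quarter."""
    non_buyers = 1.0 - start_rate
    original_non_buyers = non_buyers
    rates = []
    for _ in range(quarters):
        # Convert 5% of either the remaining or the original non-buyer pool.
        base = non_buyers if of_remaining else original_non_buyers
        non_buyers -= conversion_rate * base
        rates.append(1.0 - non_buyers)
    return rates

remaining_path = project_buy_rate(0.248, 0.05, 4, of_remaining=True)
original_path = project_buy_rate(0.248, 0.05, 4, of_remaining=False)

print("5% of remaining non-buyers:", [f"{r:.1%}" for r in remaining_path])
print("5% of original non-buyers: ", [f"{r:.1%}" for r in original_path])
```

Basket-deepening among existing buyers raises how much they spend rather than how many customers buy, so it is left out of this projection.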
The presentation also included a full MLOps plan. MLOps stands for Machine Learning Operations, and it is the answer to the question nobody asks until it is too late: the model works great today. What happens in six months when the data has changed?
A model built on last year's customer behavior will drift. The segments will shift. The predictions will get less accurate. And if nobody is monitoring it, campaigns will keep running against a model that no longer reflects reality. The plan included a quarterly retrain cycle, clear ownership across data science, engineering, and marketing teams, an annual checkpoint to challenge whether the model was still asking the right questions, and ethical guardrails ensuring the segments were built on behavioral signals rather than anything that could unfairly profile customers.
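A drift check does not have to be elaborate. The sketch below shows one simple version: measure accuracy on each quarter's fresh data and flag any quarter that falls too far below the accuracy at launch. The quarterly numbers, the two-point tolerance, and the names are all hypothetical; only the 82.1% starting figure comes from the project.

```python
from dataclasses import dataclass

@dataclass
class QuarterlyCheck:
    quarter: str
    accuracy: float  # accuracy measured on that quarter's fresh, labeled data

def flag_drift(checks, baseline_accuracy, tolerance=0.02):
    """Return the quarters where accuracy fell more than `tolerance` below baseline."""
    return [c.quarter for c in checks if baseline_accuracy - c.accuracy > tolerance]

# Launch accuracy from the project; the quarterly figures are invented.
baseline_accuracy = 0.821
checks = [
    QuarterlyCheck("Q1", 0.818),
    QuarterlyCheck("Q2", 0.809),
    QuarterlyCheck("Q3", 0.795),
    QuarterlyCheck("Q4", 0.781),
]

flagged = flag_drift(checks, baseline_accuracy)
print("Quarters needing a retrain review:", flagged)
```

The value is less in the code than in the habit. Someone owns the check, it runs on a schedule, and a flag triggers a retrain rather than a shrug.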
How this applies to sales leadership:
The MLOps mindset is exactly what is missing from most sales forecasting processes. Sales teams build a forecast model — or more often just use gut instinct dressed up as a forecast — and then run it unchanged until someone notices it is badly wrong. A disciplined retrain cycle applied to pipeline data means your predictions get better over time instead of quietly getting worse. And the quarterly review checkpoint is just good management: stop and ask whether the model is still answering the right question, because the business changes and the model has to change with it.
What Sales Leadership Looks Like With a Machine Learning Mindset
I want to spend a moment on this specifically, because it is the reason I enrolled in this program.
I have spent my career in sales and sales leadership. I have sat in hundreds of forecast calls, pipeline reviews, and board presentations. And in almost every one of them, the same dynamic plays out: everyone in the room looks at the Salesforce dashboard, nods confidently at the numbers, and quietly pretends to understand what is actually driving performance.
Nobody is looking under the hood.
The questions that haunt every CRO and CFO are almost always the same. Why is our renewal rate declining? Why are so many opportunities stalling at the final stage? Why does one sales rep consistently outperform the rest with the same territory and the same product? Which accounts are most likely to expand and which ones are quietly at risk?
These are not mysteries. They are machine learning problems. The data to answer every one of those questions already exists in Salesforce. It has been sitting there for years. It just has not been modeled.
Here is what a machine learning approach to sales leadership actually looks like in practice.
What I Actually Learned
Machine learning is not a technical skill reserved for people who are naturally good at math or who spent their twenties in a computer science program. It is a business discipline. The model is one piece. The real value comes from knowing what question to ask, which tool fits the problem, how to evaluate whether the answer is trustworthy, and how to explain the result to a room full of skeptical executives who have heard a lot of promises from a lot of technology vendors.
I went into this course nervous and genuinely embarrassed about my academic history. I came out with a 99.7% and a fundamentally different way of looking at every business problem I will ever face.
The data is already there. In your CRM. In your renewal history. In your support tickets. In your usage logs. It has been sitting there for years, patiently waiting for someone to stop pretending to understand it and start actually modeling it.
That someone can be you. It does not require being a coder. It requires curiosity, a willingness to challenge your assumptions, and the humility to let the data tell you something you did not expect to hear.
That is the whole lesson.
All modeling and analysis was performed in Google Colab using Google Gemini AI as a coding assistant. The H-E-B name was used as a fictional framing device for educational purposes only. The dataset and all findings are entirely independent of any actual H-E-B business operations or data.
The five segments from the Milestone 4 clustering, each with its organic buy rate and recommended strategy:
- Whole Paycheck Wendy has a 65% organic buy rate. She is younger, highly affluent, and almost exclusively female. She is not shopping at H-E-B because it is the cheapest option. She expects a certain standard. Strategy: protect her loyalty and deepen her basket. She does not need a coupon. She needs to feel like the organic section was built for her.
- Budget Conscious Brianna has a 32% buy rate. She is almost identical to Wendy in age and lifestyle. She cares about what goes into her cart. But her affluence score is significantly lower, and every dollar matters. She is one good digital coupon away from becoming a regular organic buyer. Strategy: convert her with a targeted offer and measure whether she comes back.
- Steady Susan has a 20% buy rate. She has shopped the same store every Tuesday since before some of the employees were born. She is not opposed to organics. She just needs someone to make it easy and obvious. Strategy: register coupons, Sunday newspaper inserts, and in-store signage in the aisles she already walks. Low cost. Low friction.
- Meat and Potatoes Mike has a 15% buy rate. He came for the ribs and a six pack. Organic was not on the list. But 15% of Mikes are already buying organic without thinking of themselves as organic shoppers. They grabbed the organic ground beef because it was on sale. They picked up the organic dog treats because they were right there at eye level. Strategy: smart product placement in the categories he already shops. Do not try to convert Mike wholesale. Just put the right product in front of him at the right moment.
- Walter and Betty have a 7% buy rate. The loyalty card is in the glove compartment under the owner's manual behind a coupon for a restaurant that closed in 2019. Strategy: do not spend the organic marketing budget here. Redirect it to where the data says it will actually work.
How this applies to sales leadership:
Every sales organization has its own version of these five segments sitting inside the CRM right now. You have your Wendys: customers who are already all in and just need to be protected and grown. You have your Briannas: prospects who are interested but need one good reason to commit. You have your Susans: accounts that have been around forever and are just waiting for someone to show up with the right offer. You have your Mikes: accounts where there is some potential but you have to be selective about where you invest. And you have your Walters and Bettys: accounts that are consuming sales resources without any realistic path to growth. Clustering tells you which bucket every account belongs in so you stop treating all of them the same way.
Milestone 5: The Final Presentation
The final milestone pulled everything together into an executive presentation for a non-technical business audience. No data science jargon. No confusion matrices. Just the business story the data was telling and what to do about it.
The central insight was about growth strategy. 75.2% of loyalty cardholders had never bought a single organic product. The instinct most marketing teams would have is to market harder to the people already buying. But the math told a different story.
Converting just 5% of non-buyers per quarter, while simultaneously running basket-deepening campaigns among existing buyers in Wendy and Brianna's segments, could move the organic buy rate from 24.8% to roughly 40% within four quarters. The model did not just identify who to target. It showed what consistent targeting was worth over time, quarter by quarter, compounding.
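A back-of-envelope version of that projection can be sketched in a few lines of Python. The assumptions here are mine, not the presentation's: the 5% applies to whoever is still a non-buyer at the start of each quarter, and nobody who converts ever lapses. Under those assumptions, conversion alone takes the buy rate to just under 39% after four quarters, so any distance beyond that would have to come from parts of the plan this sketch does not model.

```python
def project_buy_rate(start_rate, quarterly_conversion, quarters):
    """Buy rate after each quarter, converting a fixed share of the remaining non-buyers."""
    rates = []
    rate = start_rate
    for _ in range(quarters):
        # Non-buyers are (1 - rate); a fixed share of them converts this quarter.
        rate += (1.0 - rate) * quarterly_conversion
        rates.append(rate)
    return rates

# 24.8% starting buy rate and 5% quarterly conversion, as stated in the text.
rates = project_buy_rate(start_rate=0.248, quarterly_conversion=0.05, quarters=4)
for quarter, rate in enumerate(rates, start=1):
    print(f"Q{quarter}: {rate:.1%}")
# Q1: 28.6%
# Q2: 32.1%
# Q3: 35.5%
# Q4: 38.7%
```

Each quarter's gain is slightly smaller than the last because the pool of non-buyers shrinks, which is why the per-quarter conversion rate has to be sustained rather than treated as a one-time campaign.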
The presentation also included a full MLOps plan. MLOps stands for Machine Learning Operations, and it is the answer to the question nobody asks until it is too late: the model works great today, but what happens in six months when the data has changed?
A model built on last year's customer behavior will drift. The segments will shift. The predictions will get less accurate. And if nobody is monitoring it, campaigns will keep running against a model that no longer reflects reality. The plan included a quarterly retrain cycle, clear ownership across data science, engineering, and marketing teams, an annual checkpoint to challenge whether the model was still asking the right questions, and ethical guardrails ensuring the segments were built on behavioral signals rather than anything that could unfairly profile customers.
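One common way to put a number on "the segments will shift" is the population stability index (PSI), which compares the mix of customers at training time against the mix today. The sketch below is an assumption about how such a check could look, not the monitoring described in the presentation. The segment shares are invented for illustration, and the 0.25 threshold is a widely used rule of thumb, not a figure from the course.

```python
import math

def population_stability_index(expected, actual, floor=1e-6):
    """PSI between two share distributions defined over the same buckets."""
    psi = 0.0
    for e, a in zip(expected, actual):
        # Floor the shares so an empty bucket does not cause log(0) or division by zero.
        e, a = max(e, floor), max(a, floor)
        psi += (a - e) * math.log(a / e)
    return psi

def needs_retrain(psi, threshold=0.25):
    """Rule-of-thumb trigger (assumed): PSI above 0.25 signals a major shift."""
    return psi > threshold

# Hypothetical share of customers in each of five segments.
baseline = [0.15, 0.20, 0.25, 0.25, 0.15]         # at training time
current_stable = [0.15, 0.21, 0.24, 0.25, 0.15]   # barely moved
current_shifted = [0.05, 0.10, 0.25, 0.30, 0.30]  # mix has changed a lot

psi_stable = population_stability_index(baseline, current_stable)
psi_shifted = population_stability_index(baseline, current_shifted)

print(f"stable:  PSI={psi_stable:.3f}, retrain={needs_retrain(psi_stable)}")
print(f"shifted: PSI={psi_shifted:.3f}, retrain={needs_retrain(psi_shifted)}")
```

A check like this would sit alongside the quarterly retrain cycle rather than replace it: the schedule guarantees a regular refresh, and the drift metric flags the cases where waiting for the next scheduled retrain is too slow.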
How this applies to sales leadership:
The MLOps mindset is exactly what is missing from most sales forecasting processes. Sales teams build a forecast model, or more often just use gut instinct dressed up as a forecast, and then run it unchanged until someone notices it is badly wrong. A disciplined retrain cycle applied to pipeline data means your predictions get better over time instead of quietly getting worse. And the quarterly review checkpoint is just good management: stop and ask whether the model is still answering the right question, because the business changes and the model has to change with it.
What Sales Leadership Looks Like With a Machine Learning Mindset
I want to spend a moment on this specifically, because it is the reason I enrolled in this program.
I have spent my career in sales and sales leadership. I have sat in hundreds of forecast calls, pipeline reviews, and board presentations. And in almost every one of them, the same dynamic plays out: everyone in the room looks at the Salesforce dashboard, nods confidently at the numbers, and quietly pretends to understand what is actually driving performance.
Nobody is looking under the hood.
The questions that haunt every CRO and CFO are almost always the same. Why is our renewal rate declining? Why are so many opportunities stalling at the final stage? Why does one sales rep consistently outperform the rest with the same territory and the same product? Which accounts are most likely to expand and which ones are quietly at risk?
These are not mysteries. They are machine learning problems. The data to answer every one of those questions already exists in Salesforce. It has been sitting there for years. It just has not been modeled.
Here is what a machine learning approach to sales leadership actually looks like in practice.
- Churn prediction means building a classification model on your renewal data. Feed in every signal you have: product usage, support ticket volume, stakeholder changes, payment history, NPS scores, engagement with the customer success team. The model tells you which accounts are most likely to churn and which variables are driving the risk. You do not wait for the cancellation email. You intervene months earlier when there is still time to do something about it.
- Pipeline conversion modeling means building a model on your closed-won and closed-lost data to find out which early signals actually predict a successful close. Most sales organizations think they know what makes a deal close. A model built on historical data will often tell you something completely different. Maybe the number of stakeholders in the first meeting matters more than deal size. Maybe the timing of the technical evaluation is the strongest signal. The model finds the pattern. You build the playbook.
- Rep performance analysis means using clustering to segment your sales team the same way I segmented grocery customers. Some reps are Wendys: high performers who need to be developed and protected. Some are Briannas: strong potential who need one specific thing to unlock the next level. Some are Mikes: solid contributors who should not be managed the same way as your enterprise hunters. Treating all of them with the same management approach is the equivalent of running the same marketing campaign at Wendy and Walter and Betty simultaneously and wondering why the results are disappointing.
- Territory and account prioritization means applying a model to your total addressable market to score accounts by likelihood to buy before your reps ever pick up the phone. Instead of having reps self-select which accounts to pursue based on gut feel, you give them a ranked list based on the variables that have actually predicted closed deals historically. You stop leaving prioritization to individual hunches and start making it a data-driven process.
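The scoring idea that runs through the list above (learn from past outcomes, then rank what is still open) can be sketched with a small logistic regression written in plain Python. Everything specific here is a hypothetical stand-in: the two features, the six historical deals, and the three open accounts are invented, and a real project would more likely use a library model trained on far more data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this deal: predicted probability minus outcome.
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += error
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias

def score(x, weights, bias):
    """Predicted probability of a win for one account."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical history: (stakeholder_breadth, product_usage), scaled 0-1; 1 = closed-won.
history = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.6), (0.3, 0.4), (0.2, 0.1), (0.1, 0.3)]
outcomes = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic(history, outcomes)

# Hypothetical open accounts to prioritize.
open_accounts = {
    "Account A": (0.85, 0.70),
    "Account B": (0.50, 0.50),
    "Account C": (0.15, 0.20),
}
ranked = sorted(
    ((name, score(x, weights, bias)) for name, x in open_accounts.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, probability in ranked:
    print(f"{name}: {probability:.2f}")
```

The same pattern covers churn prediction if the outcome column is "churned" instead of "closed-won." The learned weights give a first, rough read on which variables are pushing the score up or down.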
What I Actually Learned
Machine learning is not a technical skill reserved for people who are naturally good at math or who spent their twenties in a computer science program. It is a business discipline. The model is one piece. The real value comes from knowing what question to ask, which tool fits the problem, how to evaluate whether the answer is trustworthy, and how to explain the result to a room full of skeptical executives who have heard a lot of promises from a lot of technology vendors.
I went into this course nervous and genuinely embarrassed about my academic history. I came out with a 99.7% and a fundamentally different way of looking at every business problem I will ever face.
The data is already there. In your CRM. In your renewal history. In your support tickets. In your usage logs. It has been sitting there for years, patiently waiting for someone to stop pretending to understand it and start actually modeling it.
That someone can be you. It does not require being a coder. It requires curiosity, a willingness to challenge your assumptions, and the humility to let the data tell you something you did not expect to hear.
That is the whole lesson.
All modeling and analysis was performed in Google Colab using Google Gemini AI as a coding assistant. The H-E-B name was used as a fictional framing device for educational purposes only. The dataset and all findings are entirely independent of any actual H-E-B business operations or data.