Most marketing professionals are familiar with various sources of customer data that promise insights for success. There are extensive sources of data, from customer surveys to digital marketing data. Moreover, there is an increasing variety of tools and techniques to shape data, from small to big data. However, having the right knowledge and understanding the context of how to use data and tools is crucial.
In this book, you’ll learn how to give context to your data and turn it into useful information. You’ll understand how and where to use a tool or dataset for a specific question, exploring the "what and why questions" to provide real value to your stakeholders. Using Python, this book will delve into the basics of analytics and causal inference. Then, you’ll focus on visualization and presentation, followed by understanding guidelines on how to present and condense large amounts of information into KPIs. After learning how to plan ahead and forecast, you’ll delve into customer analytics and insights. Finally, you’ll measure the effectiveness of your marketing efforts and derive insights for data-driven decision-making.
By the end of this book, you’ll understand the tools you need to use on specific datasets to provide context and shape your data, as well as to gain information to boost your marketing efforts.
Page count: 580
Year of publication: 2024
Data Analytics for Marketing
A practical guide to analyzing marketing data using Python
Guilherme Diaz-Bérrio
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Kaustubh Manglurkar
Publishing Product Manager: Heramb Bhavsar
Book Project Managers: Farheen Fathima and Shambhavi Mishra
Senior Editor: Rohit Singh
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Proofreaders: Rohit Singh and Safis Editing
Indexer: Subalakshmi Govindhan
Production Designer: Joshua Misquitta
DevRel Marketing Executive: Nivedita Singh
First published: May 2024
Production reference: 1120424
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80324-160-9
www.packtpub.com
To Andreia Lopes, my partner. You are my safe harbor; this book would not have become a reality without your support during the endless hours I thought about giving up. You make me want to be better, and I would not be where I am without you. Love you!
– Guilherme Diaz-Bérrio
Guilherme Diaz-Bérrio is the Head of Marketing Analytics at Kindred Group, one of the 10 largest gambling operators. He helps improve marketing efforts across various platforms. His career started in finance at a hedge fund and moved through the automotive industry at BMW Group and BMW Financial Services, before coming to Kindred Group. He graduated with a degree in economics from ISEG, University of Lisbon, and has additional training in data science and econometrics. He is also the co-founder of Pinemarsh, a data analytics and digital marketing consulting firm.
I want to thank my editor, Rohit Singh, for his help and patience in reviewing my drafts. I also want to thank Birjees Patel and Deepesh Patel, for taking a chance and inviting me to write this book. I would like to thank the reviewers, Devanshu Tayal, Shubham Gupta, and Michael Van Den Reym. Their feedback was incredibly helpful in getting to the final drafts. Finally, I would like to thank Krishna Bhaskaran, for giving me the opportunity to work in marketing analytics and giving me the foundations of what I know today.
Devanshu Tayal is a highly accomplished data scientist with a master’s degree from BITS, Pilani, India. His extensive expertise in data science is evidenced by his contributions to a wide range of industries. Devanshu is deeply committed to mentoring and guiding aspiring data scientists and is an avid researcher of emerging technologies in the field. He is a strong advocate for diversity and inclusion and has shared his insights through various publications. Devanshu is frequently invited to deliver guest lectures at universities throughout India, and his contributions as a technical reviewer have been acknowledged in multiple books. His comprehensive knowledge and experience in the field make him an asset to any team or project.
Shubham Gupta, an accomplished technology leader and a staunch advocate for data-driven decision-making, possesses a vast wealth of expertise spanning analytics, business intelligence, strategic planning, and cutting-edge innovative solutions. His deep comprehension of both technological intricacies and business dynamics equips him to guide business stakeholders from a wide array of industries toward making well-informed, data-backed decisions, thus streamlining operations and fostering substantial growth.
Furthermore, Shubham’s significant involvement as a judge for numerous prestigious tech awards highlights his unwavering dedication to promoting excellence and driving innovation within the tech industry.
Michael Van Den Reym is a seasoned professional in the field of digital analytics and search engine optimization, with a rich background in enhancing online visibility for diverse businesses. Currently, he’s working at iO, the biggest full-service digital agency in Belgium. Michael has been a speaker on data-driven marketing at industry conferences such as MeasureCamp, MeasureFest, and BrightonSEO. Michael’s work primarily revolves around leveraging data-driven insights using Python and data visualization tools to create strategies that significantly improve digital marketing outcomes.
Currently, Michael is working on his first book, Fundamentals of SEO for Business, which revolves around search engine optimization for marketing professionals.
In this part, we will go through the fundamentals of analytics, introducing marketing analytics as a discipline. We will be focusing on data extraction, ingestion, and exploratory data analysis, followed by techniques for effectively presenting results and building dashboards for non-technical audiences. The subsequent discussion shifts toward econometrics and causal inference, providing a foundational understanding of statistics and equipping you with the skills to construct, test, and evaluate statistical models, emphasizing their significance and application in marketing.
This part contains the following chapters:
Chapter 1, What is Marketing Analytics?
Chapter 2, Extracting and Exploring Data with Singer and pandas
Chapter 3, Design Principles and Presenting Results with Streamlit
Chapter 4, Econometrics and Causal Inference with Statsmodels and PyMC
Half the money I spend on advertising is wasted; the trouble is I don’t know which half.
– John Wanamaker, the forefather of marketing
In this chapter, we will attempt to cover the fundamentals of marketing analytics as a role and discipline. As a marketing analyst, you are faced with common questions during your day-to-day activities. For example, “How did this campaign perform?” or “How can you optimize your budget to achieve a result?”.
In this chapter, we will break down the types of analytics (from descriptive to prescriptive), the value they add to a business, and the questions each of them answers.
You will learn about the following topics:
What is analytics?
An overview of marketing analytics
Exploring different types of analytics
Beyond simple pivot tables
Why Python?
Modern challenges in the world of privacy-centric marketing
The importance of data engineering and tracking
By the end of this chapter, you will understand what marketing analytics is and what it is supposed to measure. You will have a firm grasp of the different types of analytics and why simply using a spreadsheet, while tempting, is sometimes not enough. You will also have an understanding of the importance of data engineering and web tracking.
But before we delve into the tools and techniques you need to achieve your results, we first need to unpack what we mean by analytics in general and marketing analytics in particular.
Like any buzzword, analytics can often be overused and hard to define from an exact source. According to the Oxford Dictionary, the textbook definition of analytics is “the systematic computational analysis of data or statistics, in order to describe, predict, and improve business performance”. Gartner defines it more broadly as “statistical and mathematical data analysis that clusters, segments, scores, and predicts what scenarios are most likely to happen.”
Analytics is commonly known to branch out into four pillars or areas: descriptive analytics, diagnostic analytics, predictive analytics, and prescriptive analytics.
In essence, analytics is the act of extracting meaningful and actionable insights from data by using a set of techniques and tools paired with domain knowledge. Raw data, however large it may be, will not be a silver bullet in your quest for insights in marketing. Neither will advanced techniques applied without domain knowledge of how the field you are analyzing operates. It is only by joining these three aspects (domain knowledge, data, and techniques) that you will be able to do your job.
An important caveat about analytics is that it is neither reporting nor data science. Your primary job is not to produce a stable, automatically updated dashboard or a machine learning pipeline. It helps to have the right mindset and to aim for reproducible code for follow-up analysis, or even a stable pipeline. But as an analyst, that is not your primary goal; it is a nice-to-have on the way to your end result. Your primary goal is speed and accuracy; that is, you need to produce meaningful insights that teams can rely on, in a reasonable amount of time and with reasonable accuracy.
While it may seem controversial, this distinction bears some thought. Often, analytics will be folded into one of the two extremes. Either it is viewed as simple reporting and/or BI work, meaning you will lose the ability to generate actionable insights due to the rigid nature of datasets and dashboard architecture required. Or, it is viewed as data science, which means you will often use complex models that require a lot of data to learn and are lacking in interpretability. Analytics stands in the middle, although with blurry frontiers:
Figure 1.1 – Business analytics as the intersection of several skills
Now let's gain an overview of what marketing analytics is.
The quote at the beginning of this chapter illustrates one of the fundamental questions marketing managers face in their day-to-day activities: how best to decide where to spend and where to target their efforts in order to achieve their ultimate goal, which is to obtain new customers or retain current ones.
Marketing analytics is nothing more than the application of analytical methods to said goal, bringing a quantifiable way of guiding investment or consumer targeting decisions. As with any new and growing domain, it is hard to pin an exact definition of it, but we can define it as a “technology-enabled and model-supported approach to harness customer and market data to enhance marketing decision making”. Being a domain in the larger field of data analytics, it looks to use mathematics and statistics together with computational tools and techniques to find meaningful patterns and knowledge in data. In this book, we will strive to focus on only the most relevant techniques and models to solve fundamental questions in marketing analytics.
Standard techniques frequently used in marketing include media mix modeling, pricing and promotion analyses, sales force optimization, and customer analytics such as segmentation or lifetime value estimation. The optimization of websites and online campaigns now frequently works hand in hand with the more traditional marketing analysis techniques, coupled with attribution modeling and media mix modeling, to understand channel interactions and optimal budget allocation.
These tools and techniques will allow you to support both strategic marketing decisions—such as how much to spend on marketing and how to allocate budgets across a portfolio of brands and the marketing mix—and more tactical campaign support in terms of targeting the best potential customer with the optimal message in the most cost-effective medium at the ideal time.
The past decades have seen an explosion of data in a digital format, with some estimates pointing to a jump in the share of data held in digital form from 6 percent to around 90 percent. That, together with massive improvements in computational tools such as faster databases, inference algorithms, and easier programming languages for statistics, has meant a dramatic improvement and evolution of marketing analytics in recent times. But one might wonder why we should be concerned with marketing analytics, or why it should be regarded as an independent sub-field of analytics at large.
Any business that employs analytics, of any kind, expects that it will improve the performance of said business. Marketing analytics is no different. Evidence supports the claim that marketing analytics improves business performance, be it in the form of increased sales, profits, or market share.
One study states that for a one-unit increase in marketing analytics deployment (measured on a scale of 1–7), there is an increase of 8 percent in return on assets (ROA) for the business, accounting for $180 million in net income. Businesses in highly competitive industries gain even more; an increase of 21 percent on ROA.
Let’s see a simple example of what marketing analytics can do with a mail coupon campaign. Kroger, an American retailer, conducts regular direct mail coupon campaigns. These campaigns to customers delivered a redemption rate of 70 percent within six weeks of mailing compared to an average of 7.93 percent for other companies. How? According to the analytics company working with Kroger, “Demographics can tell you nothing about it. Just because I am the same age as you, live next door, and have 2.2 children does not mean we have the same preferences.” What they do is study each customer to see what drives their behaviors individually. Do they have kids, do they skew toward healthy or fun, do they prefer organic or convenience foods, and where are they price sensitive? Is this across all products or only some? “We tell our retailing customers there is no silver bullet. Take data from customers and look at the decisions the business is making and look at their impact on the consumer.”
Having discussed the what and why of marketing analytics, we need to take a small detour to explain the different types of analytics in the analytical maturity model to better understand what to apply in each step.
As we have seen earlier, analytics is a broad term covering four different pillars in the modern analytics ladder. Each plays a role in how your business can better understand what your data reveals and how you can use those insights to drive business objectives.
The following diagram will help you visualize how the pillars relate to one another:
Figure 1.2 – The analytics maturity model
The first step in the process is to always understand the fundamental questions you are trying to answer. All analytical questions can be boiled down into the following categories:
What happened and when did it happen?
Why and how did it happen?
What will happen in the future?
How can I make something happen?
These categories will define the different areas of analytics involved, which will inform our decision about what tooling and techniques to apply.
Analytics can be split into four areas or pillars: descriptive, diagnostic, predictive, and prescriptive analytics.
You can think of the pillars by remembering what questions they try to answer:
Area of analytics | Question answered
Descriptive | What have we done in the past?
Diagnostic | Why have we seen past results?
Predictive | What will happen in the future?
Prescriptive | How should we act in order to achieve a future result?
Table 1.1 – How questions map to areas of analytics
We will look into each of the pillars in detail, starting our journey with descriptive analytics.
The first stage of analytics is descriptive analytics, which is the most common of all analytics activities today. Most management reporting—such as sales, marketing, operations, and finance—uses descriptive analytical tools and techniques. It tries to answer the “What happened?”, “How did it happen?”, and “When did it occur?” type of questions. It is most commonly associated with reporting or business intelligence work.
If you are in this category, you are attempting to describe groups, categories, and relationships. You are attempting to describe your data and frame it in the context of the domain knowledge. By domain knowledge, we mean a specific understanding of the context to which the data relates. Data does not exist in a vacuum. To understand the financial metrics of a company, you need to understand, to some degree, accounting and how the business operates. Likewise, to understand marketing data, you need to understand its context and operation. How does a digital acquisition campaign work, or how does a user convert through the acquisition funnel?
The descriptive nature of it makes it quite easy to start an analysis. You can essentially start your analysis or insight with an Excel file and the trustworthy PivotTable. For some tasks, that might be enough if the aim is simply an exploratory data analysis (EDA). As we will see further in this chapter, most likely, the EDA is just the beginning of the analysis, not the final insight you are being asked to give.
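The PivotTable style of analysis described above has a direct counterpart in pandas. The following is a minimal sketch, assuming a small, made-up campaign dataset; the column names (`month`, `channel`, `spend`, `conversions`) and all figures are hypothetical and only serve to illustrate the grouping step:

```python
import pandas as pd

# Hypothetical campaign data; column names and figures are made up for illustration.
campaigns = pd.DataFrame(
    {
        "month": ["2024-01", "2024-01", "2024-02", "2024-02"],
        "channel": ["paid_search", "social", "paid_search", "social"],
        "spend": [1000.0, 500.0, 1200.0, 400.0],
        "conversions": [50, 10, 40, 20],
    }
)

# Rough equivalent of an Excel PivotTable: months as rows, channels as columns.
pivot = campaigns.pivot_table(
    index="month",
    columns="channel",
    values=["spend", "conversions"],
    aggfunc="sum",
)

# Cost per acquisition (CPA) for each month and channel.
cpa = pivot["spend"] / pivot["conversions"]
print(cpa)
```

The result answers a "what happened and when" question; it says nothing yet about why the figures moved.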
Descriptive analytics is, by its nature, the easiest type of analytics to implement and learn for the following reasons:
First, most likely, you already have the data in such a format that easily allows you, with some work and automation, to correctly group results in the desired dimensions or timeframes, making it an obvious place to start your analytical journey.
Second, it has the lowest barrier to entry when it comes to skillset. To become competent at this stage of analytics, you need to know the core fundamentals of data extraction and aggregation, visualization and dashboarding best practices, and basic best practices in data engineering, such as the star schema.
Third, it is intuitive for your stakeholders to get an understanding of what happened and when in a table or visual. Most of your stakeholders already make use of the same techniques to get an understanding of their area.
Finally, it is very easily automatable and standardizable, which makes the process quick with the correct practice.
Unfortunately, while it is the initial pillar that an analyst starts their journey with, it is also commonly where most stop in the analytics maturity model. The reasons are varied, but it can be argued that it boils down to company culture and the skillset of the teams running the analytical work:
In smaller companies, this type of analytical work is done by business intelligence teams whose members are experts at aggregating data in SQL and quickly providing dashboards for visualization. To move up the ladder, the skillset shifts more toward people with knowledge of statistics, econometrics, and some form of programming, usually Python or R.
Even on teams who wish to move up the ladder, there is a maintenance cost to all the existing reports and visualizations that, without careful planning, can end up consuming all the time the team has at its disposal.
A common shortcoming of this type of analysis is that, while it is essential to understand past trends, you will find a general lack of calls to action or actionable insights and inferences. Usually, when someone asks you what happened, what they are really asking is why it happened, which leads us to the next step, diagnostic analytics.
If in descriptive analytics you are answering what happened based on historical data and trends, in diagnostic analytics, you are attempting to get to the bottom of why. Be it an occurrence, a trend, or an anomaly, you are interested in finding out the key drivers or characteristics of the fact you are analyzing. In the same vein as descriptive analytics, you will make use of historical data.
Diagnostic analytics is where you, as an analyst, usually answer the “What went wrong?” or “Why did this happen?” type of questions. Here, you will frequently be asked, “Why did our cost per acquisition go up in the previous quarter?” or “Why are we seeing an increase in customer churn?”. Diagnostic analytics is the home of anomaly detection, outlier analysis, key driver analysis, or causal inference.
Curiously, this stage of analytics is often skipped. Either practitioners stop at descriptive analytics or they jump straight to predictive analytics. A subset will simply fold diagnostic analytics into predictive analytics. This bears some thought; you cannot, and should not, jump to predicting the future without understanding why past patterns or trends happened. If you cannot infer why your cost per acquisition jumped by 25 percent in the previous quarter, it will be a stretch to attempt to forecast what will happen to your cost per acquisition in the next two quarters.
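A first diagnostic step for the cost per acquisition question above is often to break the blended figure down by a dimension such as channel. The sketch below assumes hypothetical quarterly data (channel names, spend, and acquisition counts are all invented) in which the blended CPA rises by 25 percent. It only locates where the increase comes from; it does not by itself establish the cause:

```python
import pandas as pd

# Hypothetical quarterly figures; channel names and numbers are illustrative only.
data = pd.DataFrame(
    {
        "quarter": ["Q1", "Q1", "Q2", "Q2"],
        "channel": ["paid_search", "social", "paid_search", "social"],
        "spend": [10000.0, 10000.0, 10000.0, 15000.0],
        "acquisitions": [500, 500, 500, 500],
    }
)

# Blended CPA per quarter: total spend divided by total acquisitions.
totals = data.groupby("quarter")[["spend", "acquisitions"]].sum()
blended_cpa = totals["spend"] / totals["acquisitions"]
blended_change = (blended_cpa["Q2"] / blended_cpa["Q1"] - 1) * 100

# CPA per channel and quarter, to see which channel drives the change.
by_channel = data.set_index(["channel", "quarter"])
channel_cpa = (by_channel["spend"] / by_channel["acquisitions"]).unstack("quarter")
channel_cpa["pct_change"] = (channel_cpa["Q2"] / channel_cpa["Q1"] - 1) * 100

print(f"Blended CPA change: {blended_change:.1f}%")
print(channel_cpa)
```

In this made-up example, the whole increase sits in one channel, which narrows down where to look next (bids, creative, audience, or tracking changes).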
The issue with this type of analytics is that most famous tools and techniques commonly associated with data science focus on forecasting accuracy and not on causal inference. And this is an important distinction; if you are trying to predict, as does Google, whether someone will click an ad, you don’t need to understand the causal mechanisms of the action. You just need to find the dimensions that highly correlate with that action, and if you have enough data, you will get good accuracy. We should emphasize here that when we are talking about enough data in this context, we are usually referring to data in the GBs or TBs range. Unless you, as an analyst, are dealing with clickstream data from digital marketing—that is, impression levels showing all interactions with your digital property by your users—you will not be in this range.
Unfortunately, as a marketing analyst, you most often will not have the luxury of big data. You will live in the world of small data, where you will have two years’ worth of daily data in the best scenario. It is also the field where you need a good amount of domain knowledge to understand what relationships you should model and what you should avoid.
Another common problem is the often-forgotten aphorism in statistics: correlation does not imply causation.
We will go through this point in greater detail in Chapter 4, Econometrics and Causal Inference with Statsmodels and PyMC, but suffice it to say that you will need to understand the causal mechanisms and the data-generating process to provide meaningful and accurate insights.
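To make the aphorism concrete, here is a small simulation, offered as a sketch rather than a recipe. It assumes a hypothetical data-generating process in which a seasonal factor drives both ad spend and sales, while ad spend has no effect on sales at all. The variable names and coefficients are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000

# Assumed data-generating process: seasonality drives BOTH ad spend and sales.
# Ad spend has no causal effect on sales in this simulation.
season = rng.normal(size=n)
ad_spend = 2.0 * season + rng.normal(size=n)
sales = 3.0 * season + rng.normal(size=n)

# The raw correlation between spend and sales is strong anyway.
raw_corr = np.corrcoef(ad_spend, sales)[0, 1]

# Least squares of sales on an intercept, ad spend, and the confounder.
X = np.column_stack([np.ones(n), ad_spend, season])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)

print(f"correlation(ad_spend, sales) = {raw_corr:.2f}")
print(f"ad_spend coefficient, controlling for season = {coef[1]:.3f}")
print(f"season coefficient = {coef[2]:.3f}")
```

Once the confounder is included, the estimated effect of ad spend is close to zero, even though the raw correlation is high. In real data, the confounders are rarely handed to you, which is why knowledge of the data-generating process matters.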
There are, however, tools and techniques to help you in this area, concentrated around the field of econometrics. Economists have spent the last 70 years developing an entire field devoted to asking the “why” of small data, and marketing analytics derives a lot of its techniques from this field. Economists, like marketing analysts, do not have high-frequency data. Also, like marketing analysts, economists are often tasked with using small data and deriving causal conclusions and policy recommendations with theoretically sound statistical techniques.
The next area is predictive analytics, which attempts to determine what is likely to happen. The aim is to provide forecasts or identify the likelihood of future outcomes.
Predictive analytics attempts to answer the question: “What will happen in the future?”. This is an entirely different type of question since you are predicting the future. You will combine the historical data and outputs from descriptive and diagnostic analytics, that is, the “what” and the “why,” to predict future events.
Common questions you will be asked are “Can we predict customer churn by customer satisfaction?” and “What will my cost per click be in paid searches over the next six months?”.
It is in this type of analytics that we need to be mindful of the common, and wrong, assumption that the past will repeat itself. As a saying often attributed to Mark Twain goes, “History doesn’t repeat itself, but it often rhymes.” You need good modeling practices and a correct workflow, and you should always test your hypothesis. For instance, trying to forecast how much you will spend on marketing and how much return you are going to get is a time-series problem that requires a very specific set of techniques and has a lot of pitfalls.
A common pitfall is focusing solely on the modeling side. While it is important to know which model to use and how to diagnose a bad model, data preparation is essential. A simple outlier in your dataset will throw all your results and efforts into the garbage bin.
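The effect of a single outlier can be illustrated with a toy regression. The sketch below assumes ten hypothetical observations with an exact relationship (sales equal to twice the spend) and then corrupts one value, as a data-entry error might:

```python
import numpy as np

# Hypothetical data with an exact linear relationship: sales = 2 * spend.
spend = np.arange(1.0, 11.0)  # 1, 2, ..., 10
sales = 2.0 * spend


def ols_slope(x, y):
    """Slope of the least-squares line y = intercept + slope * x."""
    slope, _intercept = np.polyfit(x, y, deg=1)
    return slope


clean_slope = ols_slope(spend, sales)

# Corrupt a single observation, for example a data-entry error on the first day.
sales_bad = sales.copy()
sales_bad[0] = 200.0
bad_slope = ols_slope(spend, sales_bad)

print(f"slope on clean data:   {clean_slope:.2f}")
print(f"slope with 1 outlier:  {bad_slope:.2f}")
print(f"mean / median with outlier: {sales_bad.mean():.1f} / {np.median(sales_bad):.1f}")
```

With one bad point out of ten, the estimated slope changes sign (from 2.0 to roughly -8.8 in this example), while the median of the series moves far less than the mean. Checking quantiles and medians before modeling is a cheap safeguard.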
Up to now, you have answered “what,” “why,” and “when” questions. Going one step further, you can ask the following question: what can I do now to achieve a specified result in the future?
This is where prescriptive analytics comes in.
Considered the final frontier of analytic capabilities, prescriptive analytics extends beyond predictive and diagnostic analytics by specifying both the actions necessary to achieve predicted outcomes and the interrelated effects of each decision. Here, you, as an analyst, are extending beyond predicting outcomes to suggesting actions and showing the implications of such actions.
For instance, in building a media mix model, you started with descriptive analytics by understanding what occurred, then you modeled the relationships between channels to understand how and why channels interact, and you fitted the model to forecast future sales. Now, armed with such a model, you can recommend a budget mix of marketing channels and activities that will result in the desired number of sales.
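As a toy illustration of that last step, the sketch below assumes a media mix model has already been fitted and yields square-root response curves with diminishing returns. The functional form, the coefficients, and the channel names are all hypothetical; a real model would supply its own curves. Given those assumptions, a simple grid search finds the budget split with the highest predicted sales:

```python
import numpy as np

# Assumed (hypothetical) fitted response curves: sales = coef * sqrt(spend).
RESPONSE_COEF = {"paid_search": 20.0, "social": 10.0}
TOTAL_BUDGET = 100  # in arbitrary currency units


def predicted_sales(search_spend, social_spend):
    """Predicted sales for a given split, under the assumed response curves."""
    return (
        RESPONSE_COEF["paid_search"] * np.sqrt(search_spend)
        + RESPONSE_COEF["social"] * np.sqrt(social_spend)
    )


# Try every whole-unit split of the budget between the two channels.
search_grid = np.arange(0, TOTAL_BUDGET + 1)
sales_grid = predicted_sales(search_grid, TOTAL_BUDGET - search_grid)

best = int(np.argmax(sales_grid))
best_split = {"paid_search": best, "social": TOTAL_BUDGET - best}
print(best_split, round(float(sales_grid[best]), 1))
```

Grid search is used here only because it is easy to read; with more channels or constraints, a numerical optimizer would be the more practical choice. The recommendation is only as good as the response curves behind it, which is why the earlier pillars cannot be skipped.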
Prescriptive analytics not only anticipates what will happen and when it will happen but also why it will happen. Further, prescriptive analytics suggests options on how to take advantage of a future opportunity or mitigate a future risk and shows the implication of each decision. Prescriptive analytics can continually take in new data to re-predict and re-prescribe, thus automatically improving prediction accuracy and prescribing better decision options. Prescriptive analytics ingests hybrid data, a combination of structured (numbers, categories) and unstructured data (videos, images, sounds, texts), and business rules to predict what lies ahead and to advise how to take advantage of this predicted future without compromising other priorities.
In essence, it attempts to formalize and quantify educated guesses and domain knowledge in a systematic and repeatable way. There is, however, one large caveat in this field of analytics: you should not skip the ladder. You need the outputs from the previous pillars of descriptive, diagnostic, and predictive analytics.
From the theoretical aspects of analytics and its maturity model to the practical aspects of how to do an analysis, there is a gap. Fitting tools and techniques to questions is a fundamental skill that you, as an analyst, need to master. However, it is easy to get lost in the maze.
It is easy to be overwhelmed at this stage with all the tools, analysis, and techniques for evaluating business objectives and deriving insights. At this stage, we should always keep some first principles in the back of our minds as analysts.
First, you need to understand customer heterogeneity. Second, you need to understand the customer dynamics. Third, you need to understand that, in business, there are always trade-offs to be made with resources.
The principles map to a set of techniques, as seen in the following table:
Applicable techniques | Description
Cluster analysis for segmentation | Identifies groups of similar consumers
Discriminant analysis for targeting and classification | Identifies target customers using data easily available
Preference mapping for competitive positioning | Visual map of consumers’ preferences
Recency, frequency, and monetary analysis | Quantitatively separates and ranks groups of customers
Logistic regression models for customer selection | Estimates the effect of one or more predictor variables on a binary outcome
Customer lifetime value analysis | Calculates the value of individual customers or groups of customers
Survey design to derive customer insights | Using factor analysis to identify common factors and dimensions in survey data
Conjoint analysis for product and pricing decisions | Determines the value of different product attributes
Forecasting sales of new products | Predicts new sales and product acceptance
Media mix models to optimize marketing mix | Helps estimate outcome variables based on a mix of product, price, promotion, and medium
Marketing experiments to optimize marketing mix | Determines cause and effect versus correlation and change of output
Topic models to glean customer insights | Provides insight into unstructured text about desires, satisfaction, and emerging product needs
Table 1.2 – Some techniques and their uses
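To give a feel for one row of the table, here is a minimal sketch of a recency, frequency, and monetary (RFM) analysis in pandas. The transaction log, the snapshot date, and the rank-sum score are assumptions made for illustration; production RFM scoring typically uses quantile bins rather than raw ranks:

```python
import pandas as pd

# Hypothetical transaction log; customer IDs, dates, and amounts are made up.
transactions = pd.DataFrame(
    {
        "customer_id": ["a", "a", "b", "c", "c", "c"],
        "order_date": pd.to_datetime(
            [
                "2024-01-05",
                "2024-03-01",
                "2023-11-20",
                "2024-02-10",
                "2024-02-25",
                "2024-03-10",
            ]
        ),
        "amount": [50.0, 70.0, 20.0, 30.0, 30.0, 40.0],
    }
)
snapshot_date = pd.Timestamp("2024-03-31")  # assumed analysis date

# One row per customer: days since last order, number of orders, total spend.
rfm = transactions.groupby("customer_id").agg(
    recency_days=("order_date", lambda d: (snapshot_date - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)

# Simple illustrative score: sum of ranks, where a higher score is better.
# A more recent purchase (smaller recency_days) earns a higher rank.
rfm["score"] = (
    rfm["recency_days"].rank(ascending=False)
    + rfm["frequency"].rank()
    + rfm["monetary"].rank()
)
print(rfm.sort_values("score", ascending=False))
```

Even this crude score separates the recently active, frequent buyer from the lapsed one, which is the kind of ranking the table entry refers to.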
Don’t think of the preceding table as all of marketing analytics. There are many more techniques and tools that can be applied. Think of it as a starting anchor to help you navigate the maze of what techniques you should use and when. This book aims to build your internal library of techniques such that, when faced with a question, you can think, “This problem is of that category, and one of the techniques I can apply is the following.” If this book succeeds at something, let it be to build your intuition and library of what to apply when and what to look for in your quest for insights. Paralysis by analysis is a reality.
Let us now move on to a large topic of debate, namely, why use Python instead of doing everything described so far in a convenient Excel workbook.
You might wonder why we need a book on Marketing Analytics Using Python. Surely you can do the same thing using the trustworthy combination of Excel, some VLOOKUPs, and some PivotTables. This is a widespread misunderstanding, and the problem stems from not realizing what the entire analytical process should look like and why. The following diagram shows the process in a simplified way:
Figure 1.3 – Analytical process
As an analyst, your workflow will generally go through the following tasks:
1. You should, first and foremost, scope out the question. You need to understand what is being asked of you clearly. Remember that your stakeholders have immense business knowledge and a problem they need to solve, but more often than not, the question might not be clearly defined.
2. You must extract the correct datasets to explore the problem space. This might be as easy as extracting a CSV from Google Ads or as tricky as joining four different spend tables with a customer table in your database. Do not underestimate the time or effort required at this step. You will spend a large part of your time at this stage, but it will save you time in the next steps.
3. After extracting the data, you need to clean the data and sense-check it for obvious problems. Live business data is messy, and you will be faced with non-obvious issues, such as a numerical column that contains a string such as N/A, bad Unicode strings in a survey dataset, or odd DateTime formats. Always remember the aphorism “garbage in, garbage out,” no matter how fancy your tech stack is.
4. You have cleaned your data and sense-checked it, and now you must start understanding your data. This is where EDA comes in. At this stage, you will begin to group your data around categories or timeframes of interest. You are exploring the shape of the data, and it is at this step that you should be mindful of outliers. They will wreak havoc on most statistical and analytical models. You need to check quantiles, means and medians, and standard deviations at this stage. It is also at this stage that you start to see whether you have a suitable dataset for the question scoped out in step 1.
5. You executed all of the previous steps, and we now get to the interesting part of analytics that gets all the credit: the model. If your question is just a matter of describing a dataset, such as the number of conversions on a specific acquisition channel, then EDA is all you need. But, most likely, there are underlying questions beneath, such as “How do the two campaigns compare?”, “What is the difference between these two segments of users?”, or “What characteristics do my best-converting users have so I can better target them?”. It is here that you will choose the proverbial “right tool for the job” and produce the insights you need.
6. Finally, we reach the last stage, the delivery of the insights. You might think this is not a step in the process, but nothing will be actioned if your stakeholders don’t understand your results and insights. You should always keep in mind that your stakeholders are business experts first, not analysts. You need to translate the model outputs and the process you took to arrive at them in clear and concise language. Only then will they trust your insights and act on them.
Excel is a very powerful tool that can do almost anything with enough knowledge and experience. Yes, with more or less difficulty, you can execute all of the preceding steps for all types of analytics in Excel. Then again, you can also do a linear regression in SQL directly in the database. The question is, should you?
Remember, your job as an analyst involves the production of meaningful insights that teams can quickly act upon in a reasonable amount of time with good precision. The operative phrase here is “a reasonable amount of time.” You need to choose the right tool for the job. Excel is a great tool for EDA and simple aggregations.
If you are dealing with a lot of data, you are going to start having issues with memory consumption. If your goal is more than simple statistical models or EDA, then Excel is not only cumbersome but also error-prone and hard to maintain and debug. We’ve all had at least one experience of spending an hour going through a gigantic Excel workbook, trying to spot the hidden cell that has the wrong input that is causing a weird calculation.
Finally, Excel has a terrible user experience when it comes to version control. We all know about the miracle of multiplying Excel files. As for running statistical modeling in Excel, it is true that Microsoft did great work in improving the accuracy of the algorithms it uses. It is, however, still lacking when compared with languages such as Python or R. But why Python and not another statistical programming language? Let us delve a bit deeper into that.
Python offers a marketing analyst many benefits. First, it is an easy but powerful programming language with a great ecosystem of tools and libraries for data analysis and statistics. Second, as a programming language, it is easily testable, and the code can be written so that it is generalizable and reusable. Do not underestimate this second point. Reusability is a great asset to have. You can reuse the same functions and scripts for other datasets or testing purposes, which will massively increase your productivity in the medium to long term. Third, it handles massive amounts of data with modern libraries such as pandas and NumPy. The limit is essentially the physical memory in your machine.
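To make the reusability point concrete, here is a minimal sketch of the cleaning and EDA steps from the workflow described earlier, written as small pandas functions. The dataset, the column names, and the helper names `clean_spend` and `flag_outliers` are hypothetical and exist only for illustration. The 1.5 × IQR outlier rule is one common convention, not the only reasonable choice.

```python
import pandas as pd

# Hypothetical raw export: spend arrives as text with an "N/A" placeholder,
# and dates arrive as day/month/year strings. Values are made up.
raw = pd.DataFrame(
    {
        "date": ["01/03/2024", "02/03/2024", "03/03/2024", "04/03/2024", "05/03/2024"],
        "channel": ["search", "social", "search", "social", "search"],
        "spend": ["120.5", "N/A", "98.0", "110.0", "5000.0"],
    }
)


def clean_spend(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with typed columns; unparseable values become NaN/NaT."""
    out = df.copy()
    out["spend"] = pd.to_numeric(out["spend"], errors="coerce")
    out["date"] = pd.to_datetime(out["date"], format="%d/%m/%Y", errors="coerce")
    return out


def flag_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values that fall more than k * IQR outside the quartiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


clean = clean_spend(raw)
clean["is_outlier"] = flag_outliers(clean["spend"])

# Count, mean, standard deviation, quartiles, and median in one call.
summary = clean["spend"].describe()
print(summary)
print(clean)
```

Because the logic lives in functions rather than in cells, the same two helpers could be pointed at next month's export or exercised in a unit test without copying anything by hand.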
Some of you might wonder, “Why not R?”. It is a matter of personal preference. Most marketing analytics was derived from the field of applied econometrics. R is one of the prime tools in econometrics and statistics. As a language, it was built for statisticians who did not want to learn how to program. It has an extensive and deep ecosystem of libraries and support tools. But Python has caught up in the last decade. Libraries such as pandas, statsmodels, scikit-learn, and many others make for an equally pleasant experience in Python, without any of the trade-offs vis-à-vis R. Also, in my personal experience (and your mileage may vary), Python has a nicer user experience than R when dealing with library management and code maintenance.
Although R and Python have most of the market share in the analytics space, there are other languages worth mentioning. First, you have the commercial ones, such as SAS, SPSS, TSP, or MATLAB. Although this is my personal opinion, I tend to shy away from commercial programming languages since they are niche and companies are moving away from some of them, such as SAS. SPSS and TSP are good econometric software tools, which I personally was taught about in college, but there isn’t anything you can do with them that you can’t do with Python or R. SAS and MATLAB are also curious cases. Although powerful, they tend to be expensive, so companies shy away from them unless they need them. Then comes Mojo, which is a curious case. Although it is a superset of Python, it is commercial in nature. It attempts to remove some of the speed bottlenecks of Python while maintaining the syntax structure. Currently, it is aimed at AI development, where the data requirements are huge and avoiding speed bottlenecks is critical. Finally, there’s Julia. Julia is as fast as C or C++ but syntactically as easy as Python. I believe it is a great addition to your toolbelt of languages as an analyst, although the libraries are less mature since Python and R carry a larger open source market share at the moment.
In this book, we will choose some libraries for each specific task or modeling effort. This does not mean they are the only options available. They are, in my opinion, an excellent place to start and are tried and tested. But for each suggestion of a library, there are always alternatives.
Aside from tools, libraries, and languages, as a marketing analyst, you will be concerned with getting good data to analyze. In a world of increased regulations on tracking and privacy concerns, you will have to navigate some challenges on the data collection side.
Marketers and marketing analysts have had the chance to swim in a world of data in the last 20 years. In fact, that was one of the main drivers of the spread of marketing analytics in the field, especially in the area of digital marketing, which accounts for almost two-thirds of all marketing spending worldwide. However, new trends in the attitude toward privacy and tracking online are making it harder for us, as analysts, to quickly derive insights from available data.
As we can see from Figure 1.4, for years the trend was clear. The largest proportion of marketing budgets went to online and digital marketing:
Figure 1.4 – Evolution of digital marketing spend as a percentage of global marketing spend
At our disposal, we had highly granular and vast datasets on our users, encompassing their behaviors, which ads they saw, when they saw them, and how they reacted to them. Clickstream data was a blessing, with its impression-level details. You could tie a user easily to a channel or activity.
The availability of such highly granular datasets made our jobs simpler. Getting insights was mostly a matter of aggregating individual-level data. Then, we could simply upload all the first- and third-party data we had to the media platform and run lookalike ads to find similar potential customers. The main challenges were on the tracking front, making sure our users were correctly tagged. For years, we trusted that as long as our UTM parameters were correctly set and the cookies correctly dropped, we would get the visibility we needed.
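As an illustration of what “correctly set” UTM parameters look like in practice, the following sketch pulls the standard `utm_*` fields out of a landing-page URL using only the Python standard library. The URL and the helper name `extract_utm` are hypothetical, and treating source, medium, and campaign as the required fields is an assumption for this example.

```python
from urllib.parse import parse_qs, urlparse

UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content")


def extract_utm(url: str) -> dict:
    """Return the UTM parameters found in a URL's query string."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; keep the first one.
    return {key: query[key][0] for key in UTM_KEYS if key in query}


# Hypothetical landing-page URL, for illustration only.
landing_url = (
    "https://shop.example.com/landing"
    "?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale&ref=abc"
)

utm = extract_utm(landing_url)

# Assumed rule for this sketch: these three fields must be present.
missing = [key for key in ("utm_source", "utm_medium", "utm_campaign") if key not in utm]

print(utm)
print(missing)
```

A check like this can be run over a list of campaign URLs before launch to catch untagged links.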
Consumers are now more privacy-aware. Regulators are rolling out new rules to protect user data, such as the General Data Protection Regulation (GDPR) in Europe and cookie consent requirements. Platform vendors, such as smartphone and browser manufacturers (notably Apple), are rolling out new restrictions on third-party cookies and limiting the availability of user-level data. The drop of third-party cookies by Apple, followed by first-party cookies, in what is deemed the “Cookiepocalypse” and is set to happen in 2024, as well as the rollout of the restrictions on iOS 14.5, sent shockwaves through the world of digital marketing. This trend will not stop. Yes, some vendors will attempt more advanced tracking and fingerprinting techniques. Still, in the long run, it will be a game of cat and mouse where the consumer will have the final word. Marketing teams are losing the visibility they have become accustomed to. Some feel they are starting to fly blind.
This is where econometrics in general, and marketing analytics in particular, come back to the fore. The link between econometrics and marketing analytics is an old and acknowledged one. Sir Martin Sorrell, the founder of the British advertising group WPP, described it as the holy grail back in 2005. Some of the techniques you will learn in this book go back 50 to 70 years, to the mid-20th century.
Having less data, we need to be smarter about how we analyze it. Brute forcing our way with massive datasets using black-box machine learning techniques is no longer possible. Taking the iOS 14.5 update as an example, you won’t know for sure which users came from paid media running on Apple devices, but with a media mix modeling approach, you can attempt to estimate the probable range of the effect of your efforts. And with some knowledge of econometrics, you can infer and attribute back, with some probability, which campaigns are providing the best value.
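The idea of estimating a “probable range” from aggregate data can be sketched with an ordinary least squares regression of weekly sales on weekly channel spend. This is a deliberately simplified stand-in for media mix modeling: it ignores carry-over, saturation, and seasonality, which a realistic model would need to address. All numbers below are simulated, and the channel names and the “true” effects used to generate them are assumptions made purely for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n_weeks = 104

# Simulated weekly spend per channel (arbitrary units); not real data.
search = rng.uniform(10, 50, n_weeks)
social = rng.uniform(5, 30, n_weeks)

# Assumed "true" effects, used only to generate the toy sales series.
sales = 200 + 3.0 * search + 1.5 * social + rng.normal(0, 10, n_weeks)

# Design matrix with an intercept column for baseline sales.
X = np.column_stack([np.ones(n_weeks), search, social])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)

# Classical OLS standard errors from the residual variance.
residuals = sales - X @ beta
dof = n_weeks - X.shape[1]
sigma2 = residuals @ residuals / dof
std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Approximate 95% ranges using the normal critical value 1.96.
lower, upper = beta - 1.96 * std_err, beta + 1.96 * std_err

for name, b, lo, hi in zip(["baseline", "search", "social"], beta, lower, upper):
    print(f"{name}: {b:.2f} (approx. 95% range {lo:.2f} to {hi:.2f})")
```

No user-level tracking is involved: the inputs are weekly totals, and the output is a range per channel rather than a single certain number.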
This does not mean we, as analysts, no longer have granular data. We will still have so-called “first-party data,” that is, data generated by the direct interactions of your customers with your business. Most of this data will be generated by digital touchpoints, which you can use to enrich your analysis efforts. However, the greater share of data available and the new technologies for a more privacy-friendly environment are posing their own challenges to analysts in areas such as data engineering and web tracking.
When moving past toy examples, data wrangling and transformation is neither easy nor something to be taken lightly. As described, since most digital marketing spending and interactions are of a digital nature, you are essentially swimming in a sea of data. Your job as an analyst is, as described