This is part 2 of my 3 part exploration of the following question: are Agile engineering practices applicable to analytics, or is the nature of research prohibitively distinct from the nature of engineering? For the agile fans, in part 1 I gave an intro to agile and talked through what I like about the scrum development process for analytics. For the agile nay-sayers, in this post I explore the elements of agile that do not work particularly well with Analytics (issues range from annoyance to downright incompatibility).
The tools and techniques of data science and advanced analytics can be used to solve many problems. In some cases – self-driving cars, face recognition, machine translation – those technologies make tasks possible that previously were impossible to automate. That is an amazing, transformative accomplishment. But I want to sing a paean to a mundane but important aspect of data science – the ability to intelligently put lists of things in a better order. For many organizations, once you have found some insights, and are into the realm of putting data products into production, the most substantial value can be found by identifying inefficient processes and making them efficient. Twenty or thirty years ago, that efficiency-gain might have been addressed by converting a paper-based process to a computer-based process. But now, prioritization – putting things in the right order – can be what it takes to make an impact.
A lot of Data Science literature focuses on the mathematical and algorithmic aspects of model building. This is after all what much of academia spends time grappling with. However the Data practitioner understands that applying Data Science and Machine Learning models to solve real-world problems involves much more than coding a statistical formula. Today’s post will be about keeping your models fresh and up to date, and your team informed as your data world evolves. We will discuss some implications of a changing data distribution on your model, practical technical considerations when building a model that is integrated with the product application, and how presenting to your team can be a great checkpoint on your model building progress.
Yes, if you want to build a truly data-driven organization your data warehouse needs a Service Level Agreement (SLA). At the core of any data driven organization is trust - your stakeholders must trust that when they need data, it will be there and it will be accurate. Without trust in the data warehouse, your organization will be less likely to use data to drive decisions big and small. In my previous post Reporting is a Gateway Drug I explored reporting as a tool to build a trusting stakeholder relationship. In this post I explore trust through the concept of a data warehouse SLA. In part two I explore the people, process and tools you need to successfully implement the SLA.
Agile software engineering practices have become the standard work management tool for modern software development teams. Are these techniques applicable to analytics, or is the nature of research prohibitively distinct from the nature of engineering? In this post I am going to explore some of the pros of using a scrum-like work management process in analytics.
Many Data Scientists come from a hard science background - statistics, math, physics. Hard sciences have a bias towards empirical and objective truths: a correct answer exists and we can find it by employing the scientific method, usually manifested by a formulaic approach to solving the problem at hand. While not a controversial statement in itself, many years of studying and application of such a paradigm can collide with the practical realities of the business world. In that world, it becomes increasingly difficult to perfectly apply the theory. As a result, the practitioner should understand how to adjust their model and their approach accordingly.
Data warehouses are not just for business intelligence (BI) anymore. You can maximize the value of your data engineering, data science, and analytics work by investing in building out a multi-use data-platform that serves business users, Analysts, Statisticians, and intelligent applications. In my last post, data-dies-in-darkness, I described how you can improve your organization’s data quality by exposing more data to more people. You can stretch this idea even farther by expanding the stakeholders of your data warehouse to include intelligent applications.
I love doing reporting. Well I don’t actually love doing reporting, but I love what it can do for an Analytics team. If executed well, reporting can be the gateway drug, resulting in an organization that is completely addicted to its Analytics team. If executed poorly, the Analytics team can turn into a team of reporting monkeys - we all know what that is like. Here is some advice on how to use reporting as a means to create strong stakeholder relationships in your organization.
The fastest way to doom an Analytics team (and any hope of building a data-driven organization) is to present data and analyses that are often flawed or inconsistent. When people don’t believe they can trust the data, they will stop using them (and, if you are an analytics leader, you might be soon looking for a new job).
Attribution is a tough challenge that is top of mind for every Marketing and Analytics leader. While marketing strategies and technologies may have evolved, the most important question has not changed - Is our marketing working? The right attribution solution should help you answer that question. But how do you find the right solution? Unfortunately, there is no turnkey attribution solution that perfectly solves all of your measurement challenges. Each business has unique attribution challenges and there are a seemingly infinite number of vendors and methodologies. As a result, I created a framework to navigate the increasingly complex multi-touch attribution market, understand the trade offs between solutions and identify the optimal attribution solution.