The Strong Data Science Audit: How Does Your Organization's Data Strategy Stack Up?

Dec 13th, 2016 by Brock

Data science is a new and growing field. As such, most organizations are still figuring out how to use these new tools and strategies to unlock new opportunities. We want to help! Audit your current data science strategy below to find weaknesses and opportunities for you to pursue and, in turn, leverage the power of data science in your organization.

  • Analytics & Dashboards

    • Do you track users' activity on your website, app, or other products using a tool such as Google Analytics, Segment, MixPanel, or Heap?
      Tracking basic user activity is critical to decision making and to optimizing users' experience on your website. You can choose to use a basic, free tool like Google Analytics (though see our upcoming blog post on how to perform essential data science upgrades) or, if you have the budget, purchase a tool like Segment or Heap. We can help you set up an existing solution or even a custom solution tailored to your company.
    • Are you tracking explicitly-defined user events, or just pageviews?
      Tracking pageviews is a quick way to see what people are doing, but it does not scale. If you ever reconfigure your website or app, future data projects are going to be hampered by trying to transform pageview contexts into clean, well-defined events. We suggest defining and monitoring critical customer events (e.g., registration, purchasing, onboarding milestones) to get the most out of your analytics.
    • Are you able to view an individual user's event history, or just aggregate metrics?
      Aggregate statistics provide a "bird's eye view" of your data and are sufficient for broad assessment of marketing channels. Nevertheless, the most fruitful data strategies are going to require access to individual user data — tracing a particular user through time. This is key for developing predictive customer models and optimization.
    • Are you able to track users across each of your websites or apps?
      If you have multiple websites, products, or apps, it can often be helpful to track an individual customer across all of your products (even if they are just browsing, logged-out). This opens up the possibility to optimize all of your properties holistically, and not view each in a silo.
    • Have you defined Key Performance Indicators (KPIs)?
      Transforming your data into KPIs (clear, meaningful metrics that relate to your company's strategic goals) is essential to making data "actionable." Many companies take pride in generating and storing lots of data, then dumping it on anyone who asks. But the best companies from a data science perspective know how to simplify their data into things like KPIs.
    • Are you able to track your KPIs easily every day, through a tool such as a dashboard or automated email?
      What's the use of having all these data if you cannot access them? Data science is all about communication. Storing your data in a database under lock and key, or simply spitting it out into tables, is not good communication. A good dashboard will transform the way you monitor and manage your organization. Here there are lots of possibilities, including open-source solutions like Metabase or Superset and premium solutions like Periscope or Chartio.
    • Can you add new graphs or metrics to your dashboard using SQL?
      SQL is the language of data science — a universal way for data scientists to access your data. The best dashboards can "scale up" with your company by allowing data scientists to create new visualizations using SQL. That way, when you launch a new feature, you can tailor your data visualizations nearly immediately without adding on new tech overhead or expense.
    • Can managers and other key decision makers customize their dashboards and other data views?
      For small companies, sharing one dashboard is often okay. But as you grow, it rapidly becomes clear that different stakeholders need different views into your data, whether it's because they need to make different kinds of decisions or simply have different preferences. In any case, it's important that the data work for everyone, and flexible, customizable dashboards go a long way here.
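To make the points above about explicit events, individual user histories, and SQL-defined KPIs more concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The `events` table, its column names, the event names, and the "conversion rate" KPI are all hypothetical choices made for illustration; a real setup would more likely use events collected by one of the tools mentioned above and stored in a proper warehouse.

```python
import sqlite3

# Hypothetical schema: one row per explicitly defined user event.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id TEXT, event_name TEXT, occurred_at TEXT)"
)


def track(user_id, event_name, occurred_at):
    """Record one named event (rather than a raw pageview) for one user."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?)",
        (user_id, event_name, occurred_at),
    )


def user_history(user_id):
    """Return one user's events in time order (the individual-level view)."""
    rows = conn.execute(
        "SELECT event_name, occurred_at FROM events "
        "WHERE user_id = ? ORDER BY occurred_at",
        (user_id,),
    )
    return rows.fetchall()


def daily_conversion_rate():
    """Example KPI: share of each day's registrants who have ever purchased."""
    rows = conn.execute(
        """
        SELECT date(r.occurred_at) AS day,
               COUNT(DISTINCT p.user_id) * 1.0 / COUNT(DISTINCT r.user_id) AS rate
        FROM events r
        LEFT JOIN events p
          ON p.user_id = r.user_id AND p.event_name = 'purchase'
        WHERE r.event_name = 'registration'
        GROUP BY day
        ORDER BY day
        """
    )
    return rows.fetchall()


# Made-up sample data, for illustration only.
track("u1", "registration", "2016-12-01T09:00:00")
track("u1", "purchase", "2016-12-01T09:30:00")
track("u2", "registration", "2016-12-01T10:00:00")
track("u3", "registration", "2016-12-02T11:00:00")
```

Because the KPI is just a SQL query over well-defined events, a new metric for a new feature is mostly a matter of writing another query, which is the kind of flexibility a SQL-backed dashboard is meant to offer.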
  • Data Pipeline & Warehousing

    • Could someone query your data using SQL?
      Some very popular tools for "data-driven insights" often hoard your data, offering you only fancy visualizations and summaries. These can be fun and are surely better than nothing at all. But you'll need direct access to the underlying data if you are going to build models and reap the rewards of integrating machine learning into your organization.
    • Are all your data stored together (i.e., in a warehouse), or in separate locations/services?
      Often the most powerful insights come from integrating your data across all of its sources. Integrating data like this is much easier to do when the data are stored together in a common location (e.g., an AWS S3 bucket or Redshift cluster), rather than spread out across your various service providers. Storing your data together also keeps you from having to worry about what happens if one of your providers closes up shop.
    • Do you store all user data (even if you aren't using it today)? Or is your data storing process "lossy"?
      When you are just starting out, it often feels like overkill to worry about storing all your data, so you focus only on the data you view as critical today. But who knows what you'll want tomorrow? While we argue that small datasets are more useful than people often think, there's no denying that more data is better — and you cannot get back what you never stored. That's why we wrote as part of our guide for new startups that they need to store as much as possible right from the start.
    • Do you have a reliable backup and recovery process for your data?
      Beyond simply backing up your data, a proper data strategy must ensure that it's easy to restore your data, and that the whole system has been proven reliable.
    • Is your data warehouse built on a scalable, cloud architecture (e.g., AWS)?
      Data storage needs within growing organizations increase exponentially and, if you aren't prepared, are sure to cause serious pains at some point. The problem isn't that you simply need to upgrade your storage, but that your entire data pipeline (from product to warehouse to analysis) might need to be modified. Do not let your data strategy create more "technical debt" for your company: build your data pipeline and warehouse on a scalable cloud architecture like AWS. We've helped numerous clients deploy and integrate AWS services to meet their organization's needs.
    • Do you have someone who "owns" and monitors the health of your data warehouse and data pipeline?
      If you do not have a data expert on your team who monitors the health of your data warehouse, you can always bring on an external team to monitor it for you.
    • Would you be automatically notified if there was a failure in pushing new data to your warehouse?
      Data warehouses are often "passive" in the sense that people throw data at them and hope it sticks. We build data warehouses that actively self-monitor and alert you if they have a problem; otherwise, your data strategy may be silently failing without you even knowing.
    • Could your data warehousing strategy survive with 100x more customers than you have today?
      This is a simple though sometimes scary question for our clients, but it gets to an important point: do not build for your organization's needs today; build for the organization you will (or want to!) become. Just as we laugh at the fact that the earliest computers had less storage than today's smart watches, you'll someday laugh (or cry?) at your data strategy if you do not plan ahead for growth.
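As a rough illustration of the "active" monitoring described above, the sketch below checks how long it has been since data last arrived and produces an alert message if the gap is too large. The function name, the one-hour default threshold, and the idea of reading a "last loaded" timestamp from the warehouse are assumptions for this example, not a description of any particular product.

```python
from datetime import datetime, timedelta


def check_freshness(last_loaded_at, now, max_lag=timedelta(hours=1)):
    """Return an alert message if the newest load is too old, else None.

    last_loaded_at: datetime of the most recent successful load, or None
                    if nothing has ever been loaded (hypothetical input,
                    e.g. the maximum load timestamp found in the warehouse).
    now:            current datetime, passed in so the check is testable.
    max_lag:        longest acceptable gap between loads (assumed value).
    """
    if last_loaded_at is None:
        return "ALERT: no data has ever been loaded"
    lag = now - last_loaded_at
    if lag > max_lag:
        return f"ALERT: last load was {lag} ago (limit {max_lag})"
    return None


# Example: the last load finished three hours before the check ran.
message = check_freshness(
    last_loaded_at=datetime(2016, 12, 13, 9, 0),
    now=datetime(2016, 12, 13, 12, 0),
)
if message is not None:
    print(message)  # in practice, send this to email or chat instead
```

A check like this would typically run on a schedule, so that a stalled pipeline is noticed within one interval rather than weeks later.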
  • Machine Learning, Prediction, & AI

    • Are you using your data to learn about the factors (e.g., of users' demographics or their experience) that predict key outcomes (e.g., purchasing, donating, churning)?
      Your data should unlock new, actionable insights into your organization and its customers through statistical modeling. Broadly speaking, this means learning what causes what. Depending on your organization, this could mean asking how a user's first day on your app relates to their likelihood of purchasing, how their demographics increase/decrease churn rate, or what marketing techniques lead to optimal conversion. The models that you choose to implement should reflect your priorities; for example, you may begin with a simpler model when you need some quick insights and decisions, while more mature companies will favor more complex models that offer finer-grained analysis better suited to long-term strategizing. Here at Strong, we are all PhD-trained scientists and experts in using statistical modeling to unearth clear, actionable insights.
    • Are you using machine learning in your products or management tools?
      Implementing "machine learning" and "artificial intelligence" can sound like the exclusive domain of Fortune 500 companies and other behemoths. However, recent advances in these technologies have made it possible (and practical) for companies of all sizes to optimize their UX, marketing, and management using machine learning and AI. Here at Strong, we often work with microcap and smallcap businesses to strategize, build, and deploy machine learning-powered technologies. Not only do we consistently see great evidence that these technologies can work for smaller organizations, we think they can be essential to turning smaller organizations into much bigger ones!
    • Are you predicting users' behaviors or other outcomes with live, predictive models?
      Building predictive models and integrating them into your products and management tools opens a world of opportunity for any organization. A number of existing tools, such as Domino and yhat, can be used for integrating live predictive models. We've worked with a number of organizations to consult about these existing solutions or, in some cases, to build a custom solution tailored exactly to their needs.
    • Are you tailoring users' experiences on your website based on outcomes of predictive models?
      Optimize your customers' experiences in real time by using live, predictive models and machine learning. When done right, your customers are happy because they get what they want, and you're happy because your customers stay engaged longer, purchase more, and think more highly of your brand.
    • If you run an online store or community, are you providing personalized recommendations of things that are likely to keep users engaged?
      Providing personalized recommendations can engage your users and help them discover further products and resources that you offer. We can help you implement custom solutions or off-the-shelf tools to provide recommendations based on your users' behavior.
  • Experimentation & Optimization

    • Are you running experiments?
      Gut "instincts" only get you so far as a decision-maker. Indeed, the clearest way to learn anything about your company is to run experiments. Want to know how to optimize your landing page? Run an experiment. Want to optimize your product onboarding process? Run an experiment. You'll be able to make swifter, more confident decisions and have the data to back them up. Experimentation is a hallmark of being a modern, data-driven organization.
    • Are you optimizing your marketing and communications with experiments?
      Building effective marketing and communication campaigns can be a time drain, and it's difficult to know that you've made the right decisions. Standard A/B tests are static, and do not address the true complexity of dynamic customer interactions. We've spent a lot of time helping our customers with this, which led us to build Optimail - a platform for email marketing that uses artificial intelligence to automatically learn what and when to message your customers to drive them towards your business goals.
    • Are experiments evaluated in terms of well-defined KPIs?
      Before running an experiment, it's critical that you decide how it will be evaluated. Too often we find that organizations test a new strategy or concept with the idea that they want to know if it's "better." The problem with this is that "better" isn't always clear: metrics sometimes do not tell the same story and there are often trade-offs between them. We suggest that our clients pick KPIs of interest before any experimenting. This not only makes experiments easier to interpret, but also keeps experimenters from simply finding the metrics that tell the story they want to be true.
    • Are experimental manipulations compared against randomized controls?
      Without randomized controls, your experiment simply isn't an experiment; it's just a change that you measured and correlated with some outcome and, as they say, correlation isn't causation. It's therefore imperative for interpreting experiments as clear, causal evidence that you maintain a control condition populated randomly with control subjects. If you aren't sure how to implement this yourself, reach out and talk to us about research experimentation and design.
    • Do you have a way to automatically deploy and monitor ongoing experiments?
      Creating a culture of experimentation and data-driven decision making can only truly happen when experiments aren't seen as technical burdens, but as easy, effective ways to address your most important questions. Several services, such as Optimizely, have sprung up to help organizations deploy and monitor experiments.
    • Do you evaluate experimental outcomes using appropriate statistical tests (as opposed to eyeballing quantitative outcomes)?
      Using appropriate statistical tests to evaluate the credibility of a given result (i.e., how likely it is to reflect the true state of the world) is critical once you enter the world of experimentation. Not only do you want to know how likely it is that you are right/wrong in accepting a given outcome as truth, you want to know the potential upside and downside (in real dollar costs) of taking a given action based on a decision. Sometimes results that seem strong based on pure metrics are much weaker than they appear, but using appropriate, valid statistical methods (which we can help you with) can mitigate this risk.
    • Do you consult with trained statisticians or experienced data scientists before beginning an experiment, to assure that you are protecting its statistical and causal integrity?
      A typical mistake that we see people (especially talented developers) make is thinking that, because they know how to run an experiment and gather statistics from a technical perspective, they also know how to do so from a scientific perspective. The truth is that designing and interpreting experiments isn't easy. We've worked with billion-dollar companies that consistently produce biased results with inflated error rates, leading them to make incorrect and costly decisions on a regular basis. If you're going to take the time to run experiments, we suggest that you work with a data science company with scientific training and expertise (like us at Strong Analytics) to ensure that these experiments produce the most valuable and actionable results possible.
    • Do you have a platform for storing and sharing knowledge from research and experiments?
      One common problem with data science is that it can get a little messy. Experiments and research are often done in a haphazard manner that can be good for creativity and insights, but far from ideal when it comes to communication and archival of results. All good research is worth documenting and storing, and we recommend Airbnb's open source knowledge repo for companies who need a solution today. For others, often a custom solution that works for data scientists and their business partners is the way to go.
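Two of the questions above, randomized controls and appropriate statistical tests, can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: assignment here uses a hash of a hypothetical experiment name and user ID (a deterministic stand-in for random assignment), and the evaluation is a plain two-sided two-proportion z-test, which is one common choice for comparing conversion rates and not necessarily the right test for every experiment.

```python
import hashlib
import math


def assign_condition(user_id, experiment):
    """Assign a user to 'control' or 'treatment' by hashing their ID.

    Hashing (experiment, user_id) gives a stable, effectively random
    50/50 split: the same user always lands in the same condition.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 else "control"


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a, conv_b: number of conversions in control and treatment.
    n_a, n_b:       number of users in control and treatment.
    Returns (z, p_value).
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("rates are all 0 or all 1; the test is undefined")
    z = (rate_b - rate_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Made-up counts, for illustration only.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The KPI being compared (here, a conversion rate) and the sample size should be decided before the experiment starts; checking the p-value repeatedly and stopping as soon as it looks good is one of the ways error rates end up inflated.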


Want help with your data science strategy? We work with companies of all sizes. Contact us and we'd be happy to discuss your options.



That's it! We hope this audit has given you a better sense of the possibilities that modern data science tools and strategies create for your organization. If you are interested in exploring these possibilities further, please contact us and we would be happy to set up a call.