Background Just after the middle of last year, Target expanded beyond its on-prem infrastructure and began deploying portions of target.com to the cloud. The deployment platform was homegrown (codename Houston), and was backed wholly by our public cloud provider. While in some respects that platform was on par with other prominent continuous deployment offerings, the actual method of deploying code was cumbersome and did not adhere to cloud best practices. These shortcomings led to a brief internal evaluation of various CI/CD platforms, which in turn led us to Spinnaker. We chose Spinnaker because it integrates with CI tools we already use...
Distributed Troubleshooting
Target’s open source big data platform contains a vast array of clustered technologies or ecosystems working together. Troubleshooting an issue within a single ecosystem is a difficult task, let alone an issue that spans several ecosystems. It is impractical for a single human to investigate ecosystems one at a time for potential problems. Without quick access to aggregated system metrics and logs, the house will burn to the ground long before an engineer can find the cause of an issue and resolve it. The Solution How to identify, troubleshoot and resolve a distributed issue? Fight fire with fire of...
Win the cloud with Winnaker!
I am happy to announce that we, at Target, decided to open source a tool called Winnaker. This tool allows the user to audit Spinnaker from an end user's point of view. But first, what is Spinnaker? The first time I heard the word Spinnaker, my reaction was, “wait, what does that even mean in English?” Shortly after, I found myself implementing a demo of Spinnaker as a potential replacement for our internal cloud deployment tool. Spinnaker is a cloud-agnostic continuous delivery tool, which means we can push our code to any cloud...
How Target Performance Tunes Machine Learning Applications
At Target we aim to make shopping more fun and relevant for our guests through extensive use of data – and believe me, we have lots of data! Tens of millions of guests and hundreds of thousands of items lead to billions of transactions and interactions. We regularly employ a number of different machine learning techniques on such large datasets for dozens of algorithms. We are constantly looking for ways to improve the speed and relevance of our algorithms, and one such quest brought us to carefully evaluate matrix multiplications at scale – since they form the bedrock for most algorithms....
(Data) Science or Witchcraft?
On my first encounter with it, around the early 2010s, I was mystified. It sounded like witchcraft and I imagined the practitioners to be a coven of witches and wizards, all holding Ph.D.s in the dark art of “Data Science” and being respectfully addressed as “Data Scientists”. It was believed they would magically transform haystacks into gold and then ask for your first-born in return as a reward for their service (a la Rumpelstiltskin). There is no denying the fact that the title “Data Scientist” is the most coveted one these days and has a nice ring to it. It’s also...
Bare Metal Big Data Builds
When you first think about scaling an on-premises Hadoop cluster, your mind jumps to the process and the teams involved in building the servers, the time needed for configuring them, and the stability required while getting them into the cluster. Here at Target that process used to be measured in months. The story below outlines our journey around scaling our Hadoop cluster, taking that process from months to hours and adding hundreds of servers in a couple of weeks.