bol.com, the Netherlands' largest online retailer, has been a client of Sourcelabs for many years. Recently we had the opportunity to co-organize a tech day for one of the product areas within bol.com.
The primary goal of the day was to have fun and give the teams within our product area some time to learn or experiment with new technologies. This is somewhat different from a ‘hackathon’, as there was no requirement to deliver functioning software by the end of the day. We just wanted to have some fun with fellow colleagues and possibly learn something while doing so.
What Wikipedia has to say about a hackathon:
A hackathon (also known as a hack day, hackfest or codefest; a portmanteau of hacking marathon) is a design sprint-like event; often, in which computer programmers and others involved in software development, including graphic designers, interface designers, project managers, domain experts, and others collaborate intensively on software projects.
The goal of a hackathon is to create functioning software or hardware by the end of the event. Hackathons tend to have a specific focus, which can include the programming language used, the operating system, an application, an API, or the subject and the demographic group of the programmers. In other cases, there is no restriction on the type of software being created.
We noticed that, especially with the COVID-19 situation, most of the teams were really focused on delivering features, so we felt this would be a nice (and much needed) opportunity to do something completely different and get out of our comfort zone while experimenting with interesting new technology. Because of COVID-19, we had to do all of this remotely.
In preparation for the tech day we compiled a shortlist of interesting topics and technologies to work on. All attendees then chose the topic they wanted to participate in.
I decided to participate in a Prometheus workshop, hosted by bol.com colleague and Prometheus expert Mickaël Carl, and set a personal goal of having our application's metrics exported to Prometheus by the end of the day.
Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Prometheus’s main features include a multi-dimensional data model with time series data identified by metric name and key/value pairs and PromQL, a flexible query language to leverage this dimensionality. There is no reliance on distributed storage; single server nodes are autonomous. Time series collection happens via a pull model over HTTP.
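To make the data model a bit more concrete, here is a minimal sketch of how a single sample (a metric name plus key/value labels and a value) can be written in the Prometheus text exposition format. The class name `SampleFormatter` and the example metric and labels are made up for illustration; this is not code from the workshop.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class SampleFormatter {

    /** Renders one sample as: name{key="value",...} value */
    static String format(String metricName, Map<String, String> labels, double value) {
        // Sort the labels by key so the output is deterministic.
        String labelPart = new TreeMap<>(labels).entrySet().stream()
                .map(e -> e.getKey() + "=\"" + escape(e.getValue()) + "\"")
                .collect(Collectors.joining(","));
        String series = labelPart.isEmpty() ? metricName : metricName + "{" + labelPart + "}";
        return series + " " + value;
    }

    /** Label values escape backslash, double quote and line feed. */
    static String escape(String labelValue) {
        return labelValue.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        System.out.println(
                format("http_requests_total", Map.of("method", "GET", "status", "200"), 1027));
    }
}
```

Running it prints `http_requests_total{method="GET",status="200"} 1027.0`. Every distinct combination of metric name and labels is its own time series, which is the dimensionality that a PromQL selector such as `http_requests_total{status="200"}` filters on.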
The pull-based model in particular has been really valuable, since our StatsD + Graphite based metrics system did not scale very well: clients overwhelmed the servers with metrics.
After setting up our Spring Boot application with an endpoint exposing all the Prometheus metric data, we could get started setting up our Prometheus cluster in GCP (Google Cloud Platform). Most of our time was actually spent troubleshooting firewall connectivity issues, but we did manage to get everything working that day.
Right after our tech day we fine-tuned our Prometheus setup and created a few very nice Kibana dashboards based on these new metrics. The next step for us would be to define alerting rules for the Prometheus Alertmanager, configure Iris and set up OnCall.
All in all it was a very productive day where we got to play around with setting up monitoring and alerting infrastructure for our application. We actually managed to deliver something, even though that was not the goal for the day. More importantly, I learned a lot of new things, met a few new (very knowledgeable) people and had a lot of fun while doing so!
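The post does not show how our endpoint was built. In a Spring Boot application this is typically handled by Spring Boot Actuator together with Micrometer's Prometheus registry rather than written by hand. Purely to illustrate the pull model, the sketch below uses only the JDK's built-in HTTP server: the application exposes its current counter values on `/metrics` and a scraper fetches them with a plain HTTP GET. The class name `MetricsEndpoint` and the metric `demo_orders_processed_total` are assumptions made up for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a metrics endpoint, for illustration only.
final class MetricsEndpoint {

    // Made-up counter; a real application would increment it in business code.
    static final AtomicLong ORDERS_PROCESSED = new AtomicLong();

    /** Renders the current state in the Prometheus text exposition format. */
    static String render() {
        return "# HELP demo_orders_processed_total Orders processed since startup.\n"
                + "# TYPE demo_orders_processed_total counter\n"
                + "demo_orders_processed_total " + ORDERS_PROCESSED.get() + "\n";
    }

    /** Starts an HTTP server that answers GET /metrics with the rendered metrics. */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = render().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders()
                    .set("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        ORDERS_PROCESSED.addAndGet(3);
        HttpServer server = start(0); // port 0 lets the OS pick a free port
        int port = server.getAddress().getPort();
        // Stand-in for a single Prometheus scrape: one HTTP GET against /metrics.
        URI target = URI.create("http://127.0.0.1:" + port + "/metrics");
        try (InputStream in = target.toURL().openStream()) {
            System.out.print(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } finally {
            server.stop(0);
        }
    }
}
```

The application never pushes anything: it only answers when asked, so the scraper decides how often and from how many targets it collects. That is why the pull model avoids the situation where clients flood the metrics servers.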