How Ninja Van uses a dedicated micro-service to facilitate retrying of Kafka messages at scale.
Failures when consuming Kafka messages are inevitable. Fortunately, Kafka's fault-tolerant architecture handles most causes, such as network failures and uncommitted messages.
Unfortunately, errors can also happen on the application side: a violated database constraint or dirty data can cause message consumption to fail. These errors are terminal, and we should not bother retrying them, as they are doomed to fail again.
Some errors are “fixable”. For example, a temporary outage of a dependency, or concurrency issues, can cause an error. …
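The split between terminal and fixable errors can be sketched in a few lines. This is only an illustration of the idea, not Ninja Van's retry service: the names `TerminalError`, `RetryableError` and `consume_with_retry` are hypothetical, and the retry here happens in-process rather than through a dedicated micro-service.

```python
class TerminalError(Exception):
    """Hypothetical: retrying cannot help (constraint violation, dirty data)."""


class RetryableError(Exception):
    """Hypothetical: may succeed later (dependency outage, concurrency conflict)."""


def consume_with_retry(message, handler, max_attempts=3):
    """Run handler(message), retrying only errors that might be fixable.

    Returns "ok" on success, "dead-letter" for a terminal error, and
    "retry-exhausted" when every attempt hit a retryable error.
    """
    for _ in range(max_attempts):
        try:
            handler(message)
            return "ok"
        except TerminalError:
            # Doomed to fail again, so give up immediately.
            return "dead-letter"
        except RetryableError:
            # Possibly transient, so try again on the next iteration.
            continue
    return "retry-exhausted"
```

A real consumer would normally wait between attempts and record the failed message somewhere durable. Both are left out here to keep the sketch short.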
Building a Bamboo plan to restore archived logs on demand
At Ninja Van, we use the Elasticsearch/Fluentd/Kibana (EFK) stack for logging. If you want to know more about how we implemented it, you can read about it here.
Delivering more than 2 million parcels per day at peak, we see immense traffic and log volumes across our micro-services. At the moment, we generate more than 5 TB of logs per day. …
“Want to join a competition and make a game? It’s a $1 million prize pool!” asked Timothi when we met up during his visit to Singapore over his term break in March 2019.
Without giving it much thought, I agreed. What I did not expect was a wild ride over the subsequent seven months, during which we had the privilege of being among the few people in the world to develop a game using Niantic’s proprietary technologies.
Within a week, we had roped three more people into the group, bringing our…