Great questions.

Jan 12, 2022

1. Why don't you directly store the failing message to the database? So no need to use retry_topic. It seems that the retry_topic is only being used as a "bridge" because of separate microservice?

(I’m assuming that the client service and the retry microservice don’t share the same database.)

Yes, it’s just a way to send this data over to the dedicated retry microservice. If you are implementing this retry mechanism as part of your service itself, then yes, storing the failing message to the database would work.

But keep in mind that you’ll have to do the exact same thing (new database table, periodic querying of the table, etc.) if you want to implement the same mechanism in another service, which leads to duplicate code in your code base. This is better avoided by using the dedicated microservice and piping the failed messages over to it via the retry_topic “bridge”.

2. Why do you need to use several retry_topic? Such as retry_5m_topic, etc. Why not using only one retry_topic?

We designed it this way to avoid additional fields in the retry_wrapper proto. By publishing to the dedicated retry_Xm_topic, we know to delay the retry by X minutes. Also, because every message in a given topic has the same delay, messages are served in first-come-first-served order more consistently.
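As a rough illustration of how the delay can be read from the topic name alone, here is a minimal Python sketch. It assumes the topics follow a retry_<N>m_topic naming pattern with N in minutes; the function names `delay_for_topic` and `retry_at_for` are made up for this example and are not from our actual service.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed naming convention: retry_<N>m_topic, where N is the delay in minutes.
_TOPIC_PATTERN = re.compile(r"^retry_(\d+)m_topic$")


def delay_for_topic(topic: str) -> timedelta:
    """Derive the retry delay from the topic name (hypothetical helper)."""
    match = _TOPIC_PATTERN.match(topic)
    if match is None:
        raise ValueError(f"not a delay-encoded retry topic: {topic!r}")
    return timedelta(minutes=int(match.group(1)))


def retry_at_for(topic: str, received_at: datetime) -> datetime:
    """Time of the next attempt: arrival time plus the topic's fixed delay."""
    return received_at + delay_for_topic(topic)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(retry_at_for("retry_5m_topic", now))
```

Since the delay is fixed per topic, messages that arrive in order also become due in order, so no extra field is needed in the wrapper.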

If we don’t do this, there are alternatives that work with a single retry_topic:

a. Include a “delay_duration” field in the retry_wrapper proto, so we can just add that duration to the current time to know when to attempt the next retry

b. Client service calculates and includes a “retry_at” field in the retry_wrapper proto, so we know when to attempt the retry

Using either a or b would mean a less predictable sequence of retries.
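To make the two alternatives concrete, here is a small Python sketch. The `RetryWrapper` dataclass is only a stand-in for the retry_wrapper proto, and `payload` and `resolve_retry_at` are hypothetical names for illustration; only “delay_duration” and “retry_at” come from the options above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class RetryWrapper:
    """Stand-in for the retry_wrapper proto (illustrative only)."""
    payload: bytes
    delay_duration: Optional[timedelta] = None  # alternative (a)
    retry_at: Optional[datetime] = None         # alternative (b)


def resolve_retry_at(msg: RetryWrapper, now: datetime) -> datetime:
    """Work out when the retry service should attempt this message."""
    if msg.retry_at is not None:
        # (b) The client already computed the absolute retry time.
        return msg.retry_at
    if msg.delay_duration is not None:
        # (a) The retry service adds the delay to its own current time.
        return now + msg.delay_duration
    raise ValueError("message carries neither delay_duration nor retry_at")


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(resolve_retry_at(RetryWrapper(b"order-1", delay_duration=timedelta(minutes=5)), now))
```

In both cases the retry time varies per message, so arrival order in the topic no longer implies the order in which messages become due.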

Imagine this circumstance:

- two retry messages (X and Y) are published concurrently, with X having an earlier retry_at.

- However, for some reason, Y reaches the retry_topic earlier than X.

Although X has an earlier retry_at time, Y is ahead of it when querying the database. This might result in Y getting retried before X.

Of course, you can add an ORDER BY retry_at clause when querying the table, but this adds a small performance cost to the query.
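The X and Y scenario can be reproduced with a small Python sketch using an in-memory SQLite database. The table name `retry_messages`, its columns, and the timestamps are assumptions for illustration, not our actual schema. The index on retry_at is one common way to reduce the sorting cost, though whether it helps depends on your database and data volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE retry_messages ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # reflects arrival order
    " name TEXT NOT NULL,"
    " retry_at INTEGER NOT NULL)"             # Unix seconds
)
# Optional index so ordering by retry_at need not sort the whole table.
conn.execute("CREATE INDEX idx_retry_at ON retry_messages (retry_at)")

# Y arrives first even though X is due earlier.
conn.execute("INSERT INTO retry_messages (name, retry_at) VALUES (?, ?)", ("Y", 1641945900))
conn.execute("INSERT INTO retry_messages (name, retry_at) VALUES (?, ?)", ("X", 1641945600))

now = 1641946000  # both messages are due by this time

# Processing in arrival order retries Y before X.
arrival_order = [row[0] for row in conn.execute(
    "SELECT name FROM retry_messages WHERE retry_at <= ? ORDER BY id", (now,))]

# Ordering by retry_at restores the intended sequence.
due_order = [row[0] for row in conn.execute(
    "SELECT name FROM retry_messages WHERE retry_at <= ? ORDER BY retry_at, id", (now,))]

print(arrival_order, due_order)
```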

All the approaches come with their caveats. You might want to choose the approach that suits your project best!

Written by Baron Chan
