Deployment and validation woes
Had made a minor addition to the existing API to prevent duplicate payments, a point of trouble that came up a month back. Since we bear the cost of duplicate payments, we decided to add a check that compares bills on markers like date, biller, and such.
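A rough sketch of what such a check can look like, not the actual code from our API. The `Bill` fields, the `amount_minor` marker, and the `DuplicateGuard` class are all made up for illustration; the real markers are just "date, biller, and such".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Bill:
    # Hypothetical shape; the real bill record is not shown in this post.
    biller: str
    bill_date: date
    amount_minor: int  # assumed extra marker: amount in minor currency units


def fingerprint(bill: Bill) -> tuple:
    """Build a comparison key from the markers that identify a bill."""
    # Normalise the biller name so casing and stray spaces don't hide a duplicate.
    return (bill.biller.strip().lower(), bill.bill_date, bill.amount_minor)


class DuplicateGuard:
    """In-memory stand-in for a duplicate-payment check."""

    def __init__(self) -> None:
        self._seen: set[tuple] = set()

    def try_register(self, bill: Bill) -> bool:
        """Return True if the bill is new, False if its markers were already seen."""
        key = fingerprint(bill)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

An in-memory set like this is only good for showing the idea. With several containers serving the same API, the lookup would have to go against shared storage such as the database.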
After a couple of iterations, made a final change in the morning. Working with JSON-cast data does remind me of the older days of working with embedded objects in MongoDB.
Had set up an alert in Metabase to track any failed cases where the fix didn't work as intended. Three such entries is low considering the 1500-odd bills paid today, but it's a painful rabbit hole nonetheless.
Since the queries gave the intended results when run, was flummoxed. Thankfully, have smart people to help; teamwork does win. Sat with a colleague to track things down, which led to a few good query changes, new repository tests, and replacing the select with a select exists.
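For the curious, the select-exists idea looks roughly like this. It is a minimal sketch using Python's built-in `sqlite3` so that it runs anywhere; our actual database, table, and column names are different, and `payments`, `biller`, `bill_date`, and both function names are invented here.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory table with one paid bill (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments (id INTEGER PRIMARY KEY, biller TEXT, bill_date TEXT)"
    )
    conn.execute(
        "INSERT INTO payments (biller, bill_date) VALUES (?, ?)",
        ("acme-power", "2024-01-05"),
    )
    return conn


def payment_exists(conn: sqlite3.Connection, biller: str, bill_date: str) -> bool:
    """Ask only whether a matching payment exists, without fetching the rows."""
    row = conn.execute(
        "SELECT EXISTS("
        "SELECT 1 FROM payments WHERE biller = ? AND bill_date = ?"
        ")",
        (biller, bill_date),
    ).fetchone()
    # EXISTS always yields exactly one row holding 0 or 1.
    return bool(row[0])
```

The appeal is that the query always comes back with a single yes/no value, so the repository code has no row list to interpret and no "empty result" case to get wrong.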
And then the bulb started glowing when they asked when the deployment happened. We're on ECS and had a ~3x container count for about an hour after the CI's release workflow. It appears there was a mix of old and new deployments running during the timeframe when the 3 failures happened.
Does raise a few questions in my mind about how this works, and about the need to take a bit more time to sit with it and understand these things beyond what I do now.