Every project starts small, ours did too. Before we knew it, it became a hit. I am talking about the Product Kits app of Shopify. You can read more about how we scaled to 100 to 100,000 hits in our Shopify App. However, there was a great learning experience and we realized how trivial things led to a big mess, but luckily everything was caught almost on first incident as we had proper logging.
Unfamiliar to Race Conditions
If you have a lot of traffic, the state of your system is unknown to two simultaneous operations unless you code that. This might cause either duplicate data and the fall-out of duplicate data or simply, operations are ignored. For example, Laravel’s Eloquent has firstOrCreate – it finds if there is a record with a specific condition, if not, it will create it and Shopify was sending the same webhook multiple times. Imagine the agony we had to face when we had duplicate data? If you have ‘group by’ in the query and then using ‘sort’ – ASC will have first record of duplicate, DESC will have last record of duplicate. Hence, an operation might be differ on each duplicate data – leading to a mess. This was happening because in between SELECT and INSERT, the SELECT of other webhook runs and finds no result. To avoid it, we used ‘locking’ SELECT – FOR UPDATE.
If you are upgrading your app, you probably need migrations. Sometimes these migrations requiring you to operate on already present data in your database or – values are derived from other data in the table. For example, we were changing 1:N relations to N:M relations. Hence, a new table to be created to hold the relations and so on. In local environment, everything runs in less than a second right – you don’t have a 200GB data in local (usually) and in staging, you don’t have to keep the exact same data right? Now imagine what really an unoptimized code can do to 200GB of data? For starters, it will chock the RAM or already return an error if you are taking all the data in one go. If you are using iterations, it might take hours. We wrote a procedure to do it inside MySQL without the needing to take data and use PHP and write data back.
Choosing the Hammer
Just like not all languages work the same way, not all the frameworks are same. YII comes with the fantastic inbuilt cache with invalidation rules, but most used framework Laravel is missing it. PHP doesn’t have threads and bigint, Ruby, Golang does. So just don’t pick the Hammer, have a toolbox instead. Use docker for scalability combining all of your tools.
Unfortunately, a lot of things we take as granted when we write the first line of code. Worst thing that can happen to you is you become successful and can’t handle it well. I hope that these mistakes will not be made in your next project. I wish you good luck!