3 Coding Mistakes to Unscalability

Every project starts small, ours did too. Before we knew it, it became a hit. I am talking about the Product Kits app of Shopify. You can read more about how we scaled to 100 to 100,000 hits in our Shopify App. However, there was a great learning experience and we realized how trivial things led to a big mess, but luckily everything was caught almost on first incident as we had proper logging.

Unfamiliar to Race Conditions

If you have a lot of traffic, the state of your system is unknown to two simultaneous operations unless you code that. This might cause either duplicate data and the fall-out of duplicate data or simply, operations are ignored. For example, Laravel’s Eloquent has firstOrCreate – it finds if there is a record with a specific condition, if not, it will create it and Shopify was sending the same webhook multiple times. Imagine the agony we had to face when we had duplicate data? If you have ‘group by’ in the query and then using ‘sort’ – ASC will have first record of duplicate, DESC will have last record of duplicate. Hence, an operation might be differ on each duplicate data – leading to a mess. This was happening because in between SELECT and INSERT, the SELECT of other webhook runs and finds no result. To avoid it, we used ‘locking’ SELECT – FOR UPDATE.

Unoptimized Migrations

If you are upgrading your app, you probably need migrations. Sometimes these migrations requiring you to operate on already present data in your database or – values are derived from other data in the table. For example, we were changing 1:N relations to N:M relations. Hence, a new table to be created to hold the relations and so on. In local environment, everything runs in less than a second right – you don’t have a 200GB data in local (usually) and in staging, you don’t have to keep the exact same data right? Now imagine what really an unoptimized code can do to 200GB of data? For starters, it will chock the RAM or already return an error if you are taking all the data in one go. If you are using iterations, it might take hours. We wrote a procedure to do it inside MySQL without the needing to take data and use PHP and write data back.

Choosing the Hammer

Just like not all languages work the same way, not all the frameworks are same. YII comes with the fantastic inbuilt cache with invalidation rules, but most used framework Laravel is missing it. PHP doesn’t have threads and bigint, Ruby, Golang does. So just don’t pick the Hammer, have a toolbox instead. Use docker for scalability combining all of your tools.

Conclusion

Unfortunately, a lot of things we take as granted when we write the first line of code. Worst thing that can happen to you is you become successful and can’t handle it well. I hope that these mistakes will not be made in your next project. I wish you good luck!

Writing the ‘straight’ codes

The most prominent mistake which is made during coding is calling with nested functions. For instance look at the following codes:

//bad codes
function getBooks(){
 return getAuhtors(books);
}
function getAuthors(books){
  books.authors = SomeQuery;
}
function main(){
 let booksWithAuthor = getBooks();
}

Problem with the above code is, they are nested and when you call getBooks() – you are not aware that it will bring authors as well. Let us try again by renaming the function.

//bad codes
function getBooksWithAuthors(){
 return getAuthors(books);
}
function getAuthors(books = []){
  books.authors = SomeQuery;
  return books
}
function main(){
 let booksWithAuthor = getBooksWithAuthors();
}

After the change, there is not any longer function which only takes the Books i.e. Books without authors and it is still sort of nested right? Lets further modify the codes and turn them into sequential

// Good Codes
function getBooks(){
 return books;
}
function getAuthors(books = []){
 return authors;
}
function getBooksWithAuthors(){
 books = getBooks();
 books.authors = getAuthors(books);
}

The function calls in getBooksWithAuthors now do not have nested function calls but a sequential call which combines both of the data i.e. books and authors

Advantages

  1. Codes are more readable – you know that getBooksWithAuthors will get you both.
  2. Codes will lead to isolated functions – getBooks and getAuthors are isolated functions and can be called by any other function.
  3. Codes don’t have side-effects – Here sequential call ensures that it is called where it is intended. For instance, if you want to get only books, you would call getBooks()
  4. Improve Unit Test coverage – There will be more coverage of tests as they are all isolated functions.

How to write good “If” statements?

Writing a good code is an art and it can be useful in so trivial things that most developers often ignore. Usually, amateurs are programmed to write an ‘if’ statement which is always accompanied by an ‘else’ . However, it is important to realize that code quality can significantly be improved if else is omitted. For instance, consider this code.


function getUser($id){
  if($id < 1){
   $user = [];
  }else{
   //some big codes
  }
  return $user
}

Here some big codes can be so big that it will return statement will be lost in it. Better would be:


function getUser($id){
  if($id < 1){
   return [];
  }
  //some big codes
  return $user
}