For one of LakeDrops' enterprise customers we've developed a fairly complex set of intranets with Drupal 9, where one of the sites is a content hub with tens of thousands of nodes and taxonomy terms that get maintained, reviewed and published on that hub. All other sites require those entities in read-only mode as well, and we've chosen the great Entity Share module to easily synchronize all entities from the hub to all other sites.
While this works really well, we hit a big wall when rolling it out into production. The synchronization process kicked off successfully, only to fail badly after a couple of seconds.
Entity Share uses Drupal's JSON:API to exchange all the information between the content hub and the clients that want to receive the content entities. It therefore uses HTTP requests to fetch entity lists and afterwards the entities themselves, of course only those that haven't been synchronized yet or have been updated since the last synchronization. For such a large content hub, this results in a huge number of HTTP requests from a single source to always the same destination. After a few hundred 200 responses, as expected, we suddenly received only 503 error codes, and no entities any longer.
The Drupal error log of the content hub contained no indication of what was going on. Those requests for which we received 503 responses didn't even get to that Drupal site at all. They must have been rejected somewhere upstream in the IT stack of that international enterprise, which is completely unknown to us, the Drupal development team.
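A pattern like this can be checked independently of Entity Share by firing a series of requests at a JSON:API endpoint and tallying the status codes that come back. The following is only a sketch, not something taken from the project: the function names tally_status and http_status are made up for illustration, hub.example.com and the resource path are placeholders, and it assumes curl is installed.

```shell
#!/bin/sh

# Tally HTTP status codes read from stdin, one code per line.
# Prints one "code: count" line per distinct code.
tally_status() {
  sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Print only the HTTP status code for a single URL.
# Requires curl and network access to the target.
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' "$1"
}

# Example usage (placeholder URL, adjust to the real hub):
#   for i in $(seq 1 500); do
#     http_status "https://hub.example.com/jsonapi/node/article"
#   done | tally_status
```

If the tally shows 503 responses that never appear in the Drupal log of the hub, that supports the assumption that something upstream is rejecting the requests.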
After opening issues in Jira, a number of phone calls, plenty of emails back and forth, and several hours, we had more unanswered questions than at the beginning of that process. Yet the explanation seemed obvious: there is either a request limit or a WAF (web application firewall) involved, either of which got triggered by the request pattern that we submitted. Just finding out what exactly it is, who is in charge and which form we should fill out for the change request caused more headache than hope for a resolution. To make things worse, several departments in various countries were waiting for the result, as they had booked their own time for the review of our setup. A quick fix was needed.
As it happened, all involved Drupal instances - the content hub and all clients reading from it - are located in the same LAN and can communicate with each other. So, why not request the content internally, i.e. from a client Drupal site to the content hub, without using the official domain and routing through the full IT stack? Because that requires the content hub to run its own web server to receive HTTP requests and respond to them. Asking IT to set one up for us would have been equally difficult to explain, if not impossible.
That's where Drupal's CLI tool Drush came to help. For whatever reason, Drush comes with a runserver command that starts PHP's built-in web server, and PHP of course was available in the existing infrastructure. So, we decided to give it a try, still being sceptical whether it could handle the huge load from all the clients for all those entities:
$ > drush runserver 8080
[notice] HTTP server listening on 127.0.0.1, port 8080 (see http://127.0.0.1:8080/), serving site, sites/default
[Fri Jul 15 14:22:52 2022] PHP 7.4.30 Development Server (http://127.0.0.1:8080) started
That's all, no further configuration or setup required. This uses the existing Drupal site settings, accesses the database just like with a LAMP stack and, best of all, it doesn't interfere with the existing infrastructure, which is still served in parallel while Drush's built-in web server responds to the internal requests from the Entity Share clients. They only needed to be re-configured to talk to the internal IP address of the content hub instead of its public domain, and that was it.
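One detail worth spelling out: the log above shows the server listening on 127.0.0.1, which only accepts connections from the same machine. For clients on other hosts in the LAN, the server presumably has to be bound to an address they can reach. The runserver command accepts an address:port argument for that. The sketch below is an assumption about how this could be wrapped, not the exact command used in the project: start_hub_server is a made-up helper name and 10.0.0.28 is a placeholder for the hub's internal address.

```shell
#!/bin/sh

# Drush executable to use; override DRUSH when it lives elsewhere,
# e.g. DRUSH=vendor/bin/drush.
DRUSH="${DRUSH:-drush}"

# Start Drush's built-in web server on the given address and port.
# Usage: start_hub_server <address> [port]   (port defaults to 8080)
start_hub_server() {
  "$DRUSH" runserver "$1:${2:-8080}"
}

# Example usage (placeholder address, adjust to the hub's internal IP):
#   start_hub_server 10.0.0.28 8080
```

Binding to a specific internal address rather than to all interfaces keeps the server reachable only on the network it is meant for.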
But wait, this crashed after 4 hours! Why? Because the web server contained a hard-coded timeout of 14400 seconds, which is exactly 4 hours. Well, no problem: we created a pull request in Drush's GitHub repository to remove that timeout, and only 5 hours later Moshe Weitzman accepted it and merged the change into Drush.
Problem solved, customer is happy, deadlines haven't been missed, and Drupal's strength has been demonstrated again. Thanks!