
Is PHP a good fit for an API server?

Image: Pixabay
Calling PHP a double-claw hammer is a bit of an in-joke in the PHP community. A lot of people bemoan PHP as a language: it's fashionable to do so and it seems to be a way to look clever. The joke came from a blog post in which somebody listed all of the problems with PHP (here's a rebuttal: https://blog.codinghorror.com/the-php-singularity/).

Anyway, PHP is a warty language that sucks in academic circles, but it doesn't matter: it's really good at web stuff, lots of people know it (so it's cheap to hire), and there are lots of libraries and frameworks (so it's cheap and fast to develop in). The commercial world is willing to overlook the academic warts.

I'm busy helping to improve the performance of an API server. As part of that effort I'm profiling calls to the endpoints. I'm using Blackfire to generate call graphs and logging the SQL queries that the ORM produces so that I can check the indexes and joins.

Here's a call graph for a call to an endpoint that runs a paginated SQL query. We're not applying any business logic or causing any side effects: all we're trying to do is query the database and return a JSON string to the frontend.


Blackfire call graph
That's a pretty substantial call graph for what sounds like a simple task, right? All I want to do is route the request to a controller, query the database, and send the results back.

Blackfire tells me that 172 different MySQL queries are being run.  The PHP code responsible is using the ORM to build up the joins and so on.  I suspect that the problem is that there is pagination being applied and the ORM is not able to optimize the queries it needs to do in order to paginate efficiently.

Okay, so what questions do I have?

Why are we not querying the database more directly? I appreciate that developer productivity is a good reason to use an ORM, but is it a good reason in this case? 172 queries is an awful lot, especially when many of them only query the schema so that the ORM can run.

Why on earth does PHP have to spend so much time in disk I/O reading all of those source files when really what we need is request routing, a database query, and a response handler?  

Blackfire reports that 304 kB of data was transmitted across the wire for this. That seems like a lot of data for the five or six records that I'm returning to the frontend.

The call graph is frustrating – I'm lumbered with a whole lot of black box code and I have no control over the SQL that is being run.  How can I improve the performance of this transaction?
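One way to take back control of the SQL is to bypass the ORM for this endpoint and issue a single joined, paginated query through PDO. The sketch below is only an illustration under stated assumptions: the `customers` and `orders` tables and the `fetchOrderPage` function are hypothetical (the real schema isn't shown here), and an in-memory SQLite database (which needs the `pdo_sqlite` extension) stands in for MySQL so that the example runs on its own. Whether this would actually bring the 172 queries down depends on where they come from.

```php
<?php
// Hypothetical schema: the real tables are not shown in this post.
// SQLite in-memory is used only so the sketch runs without a server;
// the same query shape applies to MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, total INTEGER NOT NULL)');

// Seed 20 customers with one order each.
$insertCustomer = $pdo->prepare('INSERT INTO customers (id, name) VALUES (?, ?)');
$insertOrder = $pdo->prepare('INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)');
for ($i = 1; $i <= 20; $i++) {
    $insertCustomer->execute([$i, "customer-$i"]);
    $insertOrder->execute([$i, $i, $i * 10]);
}

/**
 * Fetch one page of orders together with the customer name in a single
 * query, instead of one query for the page plus one lookup per row.
 */
function fetchOrderPage(PDO $pdo, int $page, int $perPage): array
{
    $stmt = $pdo->prepare(
        'SELECT o.id, o.total, c.name AS customer
           FROM orders o
           JOIN customers c ON c.id = o.customer_id
          ORDER BY o.id
          LIMIT :limit OFFSET :offset'
    );
    // Bind as integers so LIMIT/OFFSET are not sent as quoted strings.
    $stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
    $stmt->bindValue(':offset', ($page - 1) * $perPage, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Second page, five rows per page, returned as the JSON the frontend wants.
$rows = fetchOrderPage($pdo, 2, 5);
echo json_encode($rows), PHP_EOL;
```

The point of the sketch is that the page and its related data arrive in one round trip, with SQL I can read and check against the indexes.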

So is PHP the best tool for this job?

I have previously had intractable problems with PHP when it comes to memory management. It's pretty complicated and it differs depending on the way that PHP is run, but I do not have 100% confidence in PHP's garbage collection.

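Circular object references (which I encountered while using an ORM where a model referenced itself as a parent to form a hierarchy) are not freed by PHP's reference counting on its own. PHP 5.3 and later include a cycle collector, but it only runs once its root buffer fills up, so until then the memory stays allocated. In practice PHP ends up relying on the process ending, or on the container it runs in, to reclaim this memory.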
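For what it's worth, PHP since 5.3 ships a cycle collector that can be triggered by hand with `gc_collect_cycles()`. The sketch below is a minimal way to check whether self-referencing objects get reclaimed; the `Node` class is a hypothetical stand-in for an ORM model with a parent link, not the model I was actually using.

```php
<?php
// Hypothetical stand-in for an ORM model that can point at itself as parent.
final class Node
{
    public $parent = null;
}

gc_enable();
gc_collect_cycles(); // start from a clean slate

for ($i = 0; $i < 100; $i++) {
    $node = new Node();
    $node->parent = $node; // the object now references itself
    unset($node);          // refcount is still 1, so refcounting alone cannot free it
}

// Force a collection pass; the return value reports how much was collected.
$collected = gc_collect_cycles();
echo "collected: $collected\n";
```

A non-zero result means the collector did find and free the cycles when asked. In a long-running worker one option is to call `gc_collect_cycles()` at a known point, such as between jobs, rather than waiting for the automatic trigger.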


PHP is not built for being a long-running program.  It was never designed for this and it should never be used for this.  It was built to handle a request for a page and then terminate. 

The application is bootstrapped for every request. How much overhead does this add? Well, that's a question Blackfire is raising for me. Take a look at the timeline for the transaction from before:
Blackfire call timeline
The timeline shows when a PHP function was called in relation to the time taken to generate the response. 

My controller function starts at around 750ms into the transaction. The actual time is irrelevant as a benchmark; what is relevant is that the first time *my* code runs is halfway into the transaction.
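The same ratio can be measured without a profiler by comparing three timestamps. This is a rough sketch: `bootstrapShare` is a hypothetical helper, and the 1.5 second total is an illustrative figure chosen to match "about 750ms in, roughly halfway", not a number read from the Blackfire timeline.

```php
<?php
/**
 * Share of the total request time that passed before the controller started.
 * All three arguments are timestamps in seconds, as returned by microtime(true).
 */
function bootstrapShare(float $requestStart, float $controllerStart, float $responseEnd): float
{
    $total = $responseEnd - $requestStart;
    if ($total <= 0.0) {
        return 0.0; // avoid dividing by zero on degenerate input
    }

    return ($controllerStart - $requestStart) / $total;
}

// In a real request, $requestStart would be $_SERVER['REQUEST_TIME_FLOAT'],
// with microtime(true) recorded at the top of the controller and again just
// before the response is sent. The values below are illustrative only.
$share = bootstrapShare(0.0, 0.750, 1.500);
printf("bootstrap share: %.0f%%\n", $share * 100);
```

Logging this number per endpoint gives a cheap way to see whether changes to the bootstrap actually move it.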

Until halfway into the transaction I've been waiting for PHP to bootstrap my application. You could argue that this is because of the PHP framework I'm using, but actually it is PHP's inability to keep state between requests that forces us to bootstrap the application every time.

Bootstrapping our application might involve disk I/O (depending on OPcache). It definitely involves network I/O because we have to connect to MySQL and wait for it to authenticate us. I know that there are ways to improve this, such as not using a framework and tuning OPcache to cut compile time.

I'm concerned about what will happen when the application has 50,000 concurrent users.  How much of a strain will it place on my database server to be constantly connecting (and authenticating)?

I think PHP is brilliant at web pages and not so good at being a long-running application that is capable of reusing resources.  I'm a huge PHP fan but as an architect I do not want it to be my only tool.  

I'm busy learning Elixir and the Phoenix framework (again with the frameworks!), where responses come back in microseconds (not milliseconds). I don't think we should be using PHP as the hammer we use for everything.
