30 March 2015
Ignoring duplicate inserts with Postgres when processing a batch
Users are expected to be able to rerun batches, and different datasets overlap. For example, the dataset for "last year" overlaps with the dataset for "all time". This means that we need an elegant way to handle duplicate inserts.
Checking whether a record exists (by primary key) before inserting it is fine until the row count in the table gets significant. At just over 2 million records, my development machine was taking 30 seconds to process 10,000 records, and that time increased steadily as the row count grew.
I had to find a better way to do this and came across the option of using a database rule to ignore duplicates. With the rule in place, performance improves markedly because the application no longer needs to query the database for each record before inserting it.
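As a minimal sketch of what such a rule can look like: the table `readings`, its columns, and the rule name below are hypothetical and chosen only for illustration, so this is not necessarily the exact rule I used. It relies on PostgreSQL's `CREATE RULE ... DO INSTEAD NOTHING` with a condition on the primary key.

```sql
-- Hypothetical table; the name and columns are assumptions for illustration.
CREATE TABLE readings (
    id    bigint PRIMARY KEY,
    value numeric NOT NULL
);

-- When a row with the same primary key already exists,
-- replace the INSERT with "do nothing" instead of raising an error.
CREATE RULE readings_ignore_duplicates AS
    ON INSERT TO readings
    WHERE EXISTS (SELECT 1 FROM readings WHERE id = NEW.id)
    DO INSTEAD NOTHING;

-- The first insert stores the row; the second is silently skipped.
INSERT INTO readings (id, value) VALUES (1, 10.5);
INSERT INTO readings (id, value) VALUES (1, 10.5);
```

The batch code can then issue plain inserts and let the database discard rows it already has. As far as I understand it, the rule only sees rows that existed before the statement started, so duplicates inside a single multi-row insert, or arriving from two sessions at the same moment, can still hit the primary key constraint.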