Adventures in IT

Posts

Showing posts from March, 2015

Ignoring duplicate inserts with Postgres when processing a batch

I'm busy on a project which involves importing fairly large datasets of about ~3.3GB at a time. I have to read a CSV file, process each line, and generate a number of database records from the results of that process. Users are expected to be able to rerun batches and there is overlap between different datasets. For example: the dataset of "last year" overlaps with the dataset of "all time". This means that we need an elegant way to handle duplicate updates. Searching if a record exists (by PK) is fine until the row count in the table gets significant. At just over 2 million records it was taking my development machine 30 seconds to process 10,000 records. This number steadily increased as the row count increased. I had to find a better way to do this and happened across the option of using a database rule to ignore duplicates. While using the rule there is a marked improvement in the performance as I no longer need to search the database fo...

Adding info to Laravel logs

I am coding a queue worker that is handling some pretty large (2gig+) datasets and so wanted some details in my logs that Vanilla laravel didn't offer. Reading the documentation at http://laravel.com/docs/4.2/errors wasn't much help until I twigged that I could manipulate the log object returned by Log :: getMonolog ( ) ; . Here is an example of adding memory usage to Laravel logs. In app/start/global.php make the following changes Log::useFiles(storage_path().'/logs/laravel.log'); $log = Log::getMonolog(); $log->pushProcessor(new Monolog\Processor\MemoryUsageProcessor); You'll find the Monolog documentation on the repo

Support for Postgres broken in HHVM 3.6.0

On my desktop machine I run my package upgrades every day. The other day my Hiphop version got updated to 3.6.0 and suddenly my Postgres support died. Running Hiphop gave a symbol not found error in the postgres.so file ( undefined symbol: _ZTIN4HPHP11PDOResourceE\n ) exactly like the issue reported on the driver repository ( here ). I tried to recompile the postgres driver against Hiphop 3.6.0 but hit a number of problems, mostly to do with hhvm-pgsql-master/pdo_pgsql_statement.cpp it seems. The fix for the incompatibility was unfortunately rolling back to my previous version of Hiphop. To do this on Mint/Ubuntu just do this: Run cat /etc/*-release to get your release information Download the appropriate package for your distro from http://dl.hhvm.com/ubuntu/pool/main/h/hhvm/ Remove your 3.6.0 installation of hhvm: sudo apt-get remove hhvm Install the package you downloaded : sudo dpkg -i <deb package> After that everything should be ...