Archive for the ‘PostgreSQL’ Category

Deferrable primary/unique keys constraints still might result in duplicate values due to updates (even in recent 12c)

Tuesday, April 19th, 2016

I’m still surprised that Oracle PSU 12.1.0.2.160119 (January 2016) is apparently still bugged by old zombie bug of the past .

BTW: On most recent PostgreSQL 9.4 series (9.4.6) it works OK (fails with error).

Benchmarking PostgreSQL’s INSERT/COPY data loading performance (in one TCP stream)

Tuesday, January 6th, 2009

As everyone would expect “COPY FROM” wins, but in real conditions performance degradation seems to come from network/IPv4-stack along with PostgreSQL overhead, not CPU or I/O. PostgreSQL 8.3.5 (client libraries were 8.2.x) running on Xen VirtualMachine with 2 VCPU and 1GB RAM allocated from Quad Core P4 (2.4GHz), no other Xen VM (except of course dom0) was running during those tests. VM was on software RAID0 on 3xSATA2 7.2k RPM. Pgdata and xlog both were on the same RAID array, also PostgreSQL had fsync disabled (for performance). FS: ext3 with noatime, standard journal mode. More important PostgreSQL parameteres are below:

shared_buffers = 256MB
temp_buffers = 16MB
max_prepared_transactions = 10
work_mem = 4MB
maintenance_work_mem = 32MB
max_fsm_pages = 153600
fsync = off
wal_buffers = 512kB
archive_mode = off

Real test #1: Benchmark machine (my workstation, 1.5GHz Sempron, running X11, mostly idle) separated by 100mbps network (this is intended!: to simulate the real enviorniment in MAN network), average bench.pl CPU usage is about 10-15% (reported by top). During “INSERT” tests benchmarked database was *IDLE* (reported by vmstat was mostly 99-100 all the time) but network communication generates ~2-3k packet per seconds and ~1.5-1.9 Mbps. Bench.pl parameters were $items = 20000 (random items to INSERT/COPY) and $loops = 10 (tests to conduct to avoid bad results).

postgresql_ins_bench_100mbps

Artificial test #2: Benchmark running on the same machine as database (loopback communication only). This one really stresses CPU and IO, and it’s also taking some CPU cycles for performing benchmarking (i’ve plans to include Gigabit Ethernet in my lab ;) ) Bench.pl settings were $loops = 10 and $items = 40000.

postgresql_ins_bench_localhost

To see results please click on the above images. Long story quick: if you intend to perform more than several INSERT to be done to the same table consider using COPY statement (especially over slow networks, but also on high-bandwidth links to reduce PostgreSQL, TCP and OS overheads).

Legend:

  • ins_adhoc – each time for every INSERT there is prepare+execute being done at the DBI level
  • ins_prep – for each record prepared INSERT statement is being used
  • ins_func – for every record there is prepared & executed function that performs real INSERT
  • ins_func_prep – the same as “ins_func” but statement for executing function is prepared once
  • copy – COPY statement is being executed once for all records