Bulk insert into database table using placeholders, Postgres and Python 3

I was refactoring and optimizing the code we use for writing into our database today. At some point I ran into problems bulk inserting into a table when some of the values were UUIDs. This is what I did to fix it:


The standard SQL syntax for inserting a row into a table looks like:

INSERT INTO services (service_id, service_name)
VALUES ('86a1bf1d-523f-4b64-a333-ce8e1b6d8c56', 'Deliver Food');

This inserts a new service into the services table in the db. Its unique id, '86a1bf1d-523f-4b64-a333-ce8e1b6d8c56', is written to the service_id column, and the service name, 'Deliver Food', is written to the service_name column.


To insert more than one service into the table, we call:

INSERT INTO services (service_id, service_name)
VALUES
    ('86a1bf1d-523f-4b64-a333-ce8e1b6d8c56', 'Deliver Food'),
    ('2b051dd2-70cc-4f06-a27f-767cb6309533', 'Deliver Drinks');

Psycopg2: how to handle UUIDs instead of converting them to strings

Simply call the method register_uuid() before you start generating and handling UUIDs:

import psycopg2.extras

psycopg2.extras.register_uuid()

This solved this error:

can't adapt type 'UUID'

Python method to bulk insert using placeholders

This Stack Overflow answer helped me discover a quick way to programmatically build a bulk insert, formatting the values with cur.mogrify (docs). However, mogrify produces a bytes string, and concatenating that into the command string yielded the following error:

sequence item 0: expected str instance, bytes found

To solve it, I needed to cast the bytes string to a normal one by adding decode('utf-8') after cur.mogrify (see the method below).
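The bytes-vs-str mismatch can be reproduced without a database; the literal values below are made up, standing in for what cur.mogrify returns:

```python
# cur.mogrify returns bytes; these stand in for its output.
chunks = [b"(1,'a')", b"(2,'b')"]

try:
    ','.join(chunks)  # str.join refuses bytes items
except TypeError as e:
    print(e)  # sequence item 0: expected str instance, bytes found

# Decoding each chunk first gives a plain str that joins cleanly.
args_str = ','.join(c.decode('utf-8') for c in chunks)
print(args_str)  # (1,'a'),(2,'b')
```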

And putting it all together:

def bulk_insert_into_table(cur, table_name, column_names, data):
    """Build a psql command string to bulk insert into a table.

    :param cur: the db cursor
    :type cur: cursor
    :param table_name: the name of the existing table in which the insertion happens
    :type table_name: str
    :param column_names: a list of strings -> the strings are the column names of the table
    :type column_names: list
    :param data: a list of tuples, each tuple holding the values to be inserted into the table
    :type data: list
    """
    column_names = ','.join(column_names)

    # decode is needed because mogrify returns bytes, which cannot be
    # concatenated into the str command below
    args_str = b','.join(cur.mogrify("(%s,%s)", x) for x in data).decode("utf-8")

    command = """INSERT INTO %s (%s) VALUES %s;""" % (table_name, column_names, args_str)

    return command
from uuid import uuid4

import psycopg2
import psycopg2.extras

psycopg2.extras.register_uuid()

conn = psycopg2.connect(database=psql_db, user=psql_user, password=psql_ps,
                        host=psql_url, port=psql_port)
cur = conn.cursor()

table_name = 'services'
column_names = ["service_id", "service_name"]
data = [(uuid4(), 'Deliver Food'),
        (uuid4(), 'Deliver Drinks')]

command = bulk_insert_into_table(cur, table_name, column_names, data)
cur.execute(command)

conn.commit() # <--- makes sure the change is persisted in the database