import sqlalchemy as sa
= sa.create_engine('postgresql://your_username:your_password@localhost:your_port/your_database') engine
How to write a pandas dataframe into PostgreSQL in Python
In this tutorial, you will learn how to write a pandas dataframe into your PostgreSQL database.
Prerequisites:
- You need to have PostgreSQL installed and running on your machine. Therefore, you need to actually start and initialize PostgreSQL after you have completed the installation.
Installation of Python modules
First of all, we use Anaconda and pip to install some Python modules. Open your terminal (macOS) or your Anaconda Command Prompt (Windows) and enter:
conda create -n postgres python=3.11 pip
Activate the environment:
conda activate postgres
Let’s install some packages. Psycopg is a PostgreSQL adapter for the Python programming language and SQLAlchemy is a Python SQL toolkit:
pip install psycopg2-binary sqlalchemy ipykernel jupyter pandas
Database connection
To connect to your database, you need to start a PostgreSQL instance. Therefore, open Postgres on your machine and confirm that your database is running.
Now we import sqalchemy
as sa
and create an postgresql engine to make a connection to our database.
In this engine we need to provide the following information:
Replace your_username
, your_password
, your_port
and your_database
with your own values.
Here are some standard values for a common PostgreSQL installation:
- your_username:
postgres
- your_port:
5432
or5433
or5438
, … (you can find your port number in the settings of your PostgreSQL database) - your_database:
postgres
To learn more about the engine configuration, visit the documentation.
Write pandas dataframe into PostgreSQL
Let’s create some data:
import pandas as pd
= pd.read_csv("https://raw.githubusercontent.com/kirenz/datasets/master/gapminder.csv") df
Now we use our engine
to write the pandas data into our PostgreSQL database and call the table gapminder
:
'gapminder', engine, if_exists='replace') df.to_sql(
Show a list of all tables in the database:
# Querying for the list of table names
with engine.connect() as connection:
= connection.execute(sa.text("SELECT table_name FROM information_schema.tables WHERE table_schema='public'"))
tables for table in tables:
print(table[0])
Code to drop the table from the databse:
= sa.text("DROP TABLE IF EXISTS gapminder")
sql
with engine.connect() as connection:
connection.execute(sql)