Intro
Ext. Links
Manuals
PostgreSQL internals (Hironobu SUZUKI)
DevOps 101
BºCore ENV.VARSº
PGPORT
PGUSER (alt. -U option)
BºShow DB infoº
mydb=˃ SELECT version();
mydb=˃ SELECT current_date;
psql commands:
\?             help with psql (backslash) commands
\h             help with SQL command syntax
\l list databases
\c db_name connect to ddbb
\q quit
\dt show all tables in ddbb
\dt *.*        show all tables globally (all schemas)
\d table show table schema
\d+ table
\du            list roles (users/groups) and their attributes
SELECT current_database();
SELECT rolname FROM pg_roles; ← List users
Instance "layout"
BºPostgreSQL instance entitiesº:
Server == DDBB Cluster 1 ←→ N GºCatalogº 1 ←→ N Schema
└──────────┬──────────┘ Gº(Database)º └──┬─┘
│ NAMESPACE FOR TABLES
│ and security boundary
┌─────────┘ └─────────┬─────────┘
├GºDatabasesº ┌────────────────┘
│ │ ┌───────┴───────┐
│ ├─ postgres ºSCHEMA COMMANDSº
│ │ │
│ │ ├─ Casts \dn ← list existing schemas
│ │ │ SELECT schema_name FROM information_schema.schemata;
│ │ ├─ Catalogsº*1º SELECT nspname FROM pg_catalog.pg_namespace;
│ │ │
│ │ ├─ Event Triggers @[http://www.postgresql.org/docs/current/static/sql-createschema.html]
│ │ │ ºCREATE SCHEMAºIF NOT EXISTS ˂schema_name˃;
│ │ ├─ Extensions
│ │ │ @[http://www.postgresql.org/docs/current/static/sql-dropschema.html]
│ │ ├─ Foreing ºDROP SCHEMAºIF EXISTS ˂schema_name˃ CASCADE;
│ │ │ Data Wrap └───┬────────────...
│ │ │ │
│ │ ├─ Languages │
│ │ │ │
│ │ └─ Schemas ←─────────────┘
│ │
│ ├─ myDDBB01
│ │ │
│
├─ Login/Group @[https://www.postgresql.org/docs/10/static/user-manag.html]
│
└─ Tablespaces
º*1º: Difference between catalog and schema
REF: @[https://stackoverflow.com/questions/7022755/whats-the-difference-between-a-catalog-and-a-schema-in-a-relational-database]
"... So in both Postgres and the SQL Standard we have this containment hierarchy ..."
- A computer may have one cluster or multiple.
- A database server is a cluster.
- A cluster has catalogs. ( Catalog = Database )
- Schemas have tables.
- Tables have rows.
- Rows have values, defined by columns.
- Those values are the business data your apps and users care about such as
person's name, invoice due date, product price, gamer’s high score.
The column defines the data type of the values (text, date, number, and so on).
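  A minimal SQL sketch of that containment hierarchy (schema and table names below are
  illustrative, not from the text): a schema lives inside the current database/catalog,
  and tables live inside the schema.
    CREATE SCHEMA IF NOT EXISTS sales;            -- schema inside the current database (catalog)
    CREATE TABLE  IF NOT EXISTS sales.invoice (   -- table inside schema 'sales'
        id      SERIAL PRIMARY KEY,
        due_on  DATE NOT NULL
    );
    SELECT current_database() AS catalog,         -- the catalog (database) we are connected to
           current_schema()   AS default_schema;  -- first schema in search_path
    SET search_path TO sales, public;             -- resolve unqualified names against 'sales' first
    SELECT * FROM invoice;                        -- now equivalent to sales.invoice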
BºBootstrap new Postgresql Server (Cluster)º
Bootstrap Cluster == Init storage area
(data directory in FS terms)
Server 1 → N Databases
$º$ sudo su postgres º ← Switch to postgres OS user
$º$ initdb -D /usr/local/pgsql/data º ← alt.1
$º$ pg_ctl -D /usr/local/pgsql/data initdbº ← alt.2
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
'postgres' and 'template1' ddbbs will be created automatically
BºStarting the Serverº
$º$ sudo su postgres º ← Switch to postgres OS user
$º$ postgres -D /usr/local/pgsql/data \ º ← Alt.1
$º$ 1˃/var/log/postgresql/logfile 2˃⅋1 ⅋ º
$º$ pg_ctl start -D /usr/local/pgsql/data \        º ← Alt.2. Using pg_ctl "easy" wrapper
$º$   -l /var/log/postgresql/logfile               º   (-D can be omitted if PGDATA is set)
$º$ sudo su root                                   º ← Alt.3. SystemD (recommended in SystemD-enabled OSs)
$º$ systemctl enable postgresql.service º ← Enable at OS reboot
$º$ systemctl start postgresql.service º ← Start now without rebooting
$º$ journalctl --unit postgresql.service º ← Check SystemD unit logs
BºCRUD Usersº
$º$ sudo su postgres º ← Switch to postgres OS user
$º$ psql º
# ºCREATE USERºmyUser01ºWITH PASSWORDº'my_user_password'; ← (CREATE USER/ROLE does not support IF NOT EXISTS)
# ºALTER USERºmyUser01ºWITH PASSWORDº'my_new_password';
# ºDROP USER IF EXISTSºmyUser01 ;
BºGranting privileges to usersº
#ºGRANT ALL PRIVILEGES ONºtableºTOºmyUser01; ← Grant all permissions to table
#ºGRANT ALL PRIVILEGES ON DATABASEºmyDB01ºTOºmyUser01; ← Grant all permissions in DB
(tables, connection, ...)
#ºGRANT CONNECT ON DATABASEºmyDB01ºTOºmyUser01; ← Grant connection permissions to DB
#ºGRANT USAGE ON SCHEMAºpublicºTOºmyUser01; ← Grant permissions on schema
#ºGRANT EXECUTE ON ALL FUNCTIONS IN SCHEMAºpublic ← Grant permissions to functions
ºTOºmyUser01;
#ºGRANT SELECT, UPDATE, INSERT ON ALL TABLES IN SCHEMAº← Grant permissions select/... on all tables
publicºTOºmyUser01;
#ºGRANT SELECT, INSERT ONºmyTable01ºTOºmyUser01; ← Grant permissions on a single table
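A common pattern combining the grants above: a read-only role that can connect,
use the public schema and SELECT from its current and future tables. (Role name
and password below are illustrative.)
    CREATE ROLE readonly01 WITH LOGIN PASSWORD 'changeme';
    GRANT CONNECT ON DATABASE myDB01             TO readonly01;
    GRANT USAGE   ON SCHEMA   public             TO readonly01;
    GRANT SELECT  ON ALL TABLES IN SCHEMA public TO readonly01;
    ALTER DEFAULT PRIVILEGES IN SCHEMA public    -- also cover tables created later
        GRANT SELECT ON TABLES TO readonly01;    -- (by the role running this statement)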
BºCreate/Remove new DDBBº
(PRE-SETUP: Create user account for the new DDBB, ex: department01, ...)
$º$ createdb mydbº ← create new
$º$ psql mydb º ← access it to check everything is OK
$º$ dropdb mydbº ← Remove. RºWARNº: can NOT be undone
└───────┬──────┘
               NOTE: Many client tools (psql, createdb, dropdb, ...) default the
               database name to the user name (-U flag or PGUSER env. var) when
               no database is given explicitly
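The same create/drop can also be done from psql with plain SQL (owner name below is
just an example; DROP DATABASE is equally irreversible):
    CREATE DATABASE mydb OWNER myUser01;   -- SQL equivalent of createdb
    DROP   DATABASE IF EXISTS mydb;        -- SQL equivalent of dropdb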
BºCreate tableº
-- DO $$
-- BEGIN
-- EXECUTE 'ALTER DATABASE ' || current_database() || ' SET TIMEZONE TO UTC';
-- END; $$;
CREATE TABLE IF NOT EXISTS myTable01 (
ID VARCHAR(40) NOT NULL
ºCONSTRAINTº
MYTABLE01_PKºPRIMARY KEYº,
NAME VARCHAR(255) NOT NULL,
CREATED_AT TIMESTAMP DEFAULT NOW() NOT NULL,
PUB_KEY VARCHAR(88) NOT NULL UNIQUE,
PARENT_ID VARCHAR(40) RºNULLº
CONSTRAINT
             myTable01_FK REFERENCES myTable01(ID)
);
-- CREATE NEW INDEX
ºCREATE INDEX IF NOT EXISTSº                    ← Create new index
MYTABLE01_PUBKEY_IDXºONº
myTable01 (PUB_KEY);
ALTER TABLE myTable01ºADD PRIMARY KEYº(ID); ← Alternative to create PRIMARY KEY
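A minimal usage sketch against the table above (values are made up). The second
INSERT references the first row through the self-referencing PARENT_ID foreign key:
    INSERT INTO myTable01 (ID, NAME, PUB_KEY)
    VALUES ('root-01', 'root node', 'pubkey-root-01');                 -- PARENT_ID defaults to NULL

    INSERT INTO myTable01 (ID, NAME, PUB_KEY, PARENT_ID)
    VALUES ('child-01', 'child node', 'pubkey-child-01', 'root-01');   -- must match an existing ID

    SELECT c.ID, p.NAME AS parent_name
    FROM   myTable01 c LEFT JOIN myTable01 p ON p.ID = c.PARENT_ID;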
Client Authentication
@[https://www.postgresql.org/docs/10/static/client-authentication.html]
pg_hba.conf
(AAA+Encryption)
@[https://www.postgresql.org/docs/devel/static/auth-pg-hba-conf.html]
- the default client authentication setup allows any local user
to connect to the database and even become the database superuser.
If you do not trust them:
      1 - use one of initdb's -W, --pwprompt or --pwfile options
to assign a password to the database superuser.
2 - also, specify -A md5 or -A password so that the
default trust authentication mode is not used;
          or modify the generated ºpg_hba.confº file after running initdb,
          but before you start the server for the first time.
  - TLS
allows both the client and server to provide SSL certificates to each other.
- Encryp. Options
- The pg_hba.conf file allows administrators to specify which hosts can use
non-encrypted connections (host) and which require SSL-encrypted
connections (hostssl). Also, clients can specify that they connect to
servers only via SSL.
- The pgcrypto module allows certain fields to be stored encrypted. This
is useful if only some of the data is sensitive. The client supplies the
decryption key and the data is decrypted on the server and then sent to the client.
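   A minimal pgcrypto sketch, assuming the extension is available; table name and
   symmetric key below are made up:
       CREATE EXTENSION IF NOT EXISTS pgcrypto;

       CREATE TABLE IF NOT EXISTS secrets (id SERIAL PRIMARY KEY, payload BYTEA);

       INSERT INTO secrets (payload)
       VALUES (pgp_sym_encrypt('card 1234-5678', 'client-supplied-key'));     -- stored encrypted

       SELECT pgp_sym_decrypt(payload, 'client-supplied-key') FROM secrets;   -- decrypted on demand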
Performance Optimization
- Enable autovacuum. The working memory for autovacuum should be no more than 2% of
the total available memory.
- Enable database caching with an effective cache size between 6% and 8% of the total
available memory.
 - To increase performance, the working memory should be at least 10% of the total
   available memory.
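   These settings live in postgresql.conf or can be applied from SQL; a sketch with
   purely illustrative values (size them against your own RAM using the percentages above):
       ALTER SYSTEM SET autovacuum           = on;        -- on by default in modern releases
       ALTER SYSTEM SET autovacuum_work_mem  = '256MB';   -- illustrative value
       ALTER SYSTEM SET effective_cache_size = '2GB';     -- illustrative value
       ALTER SYSTEM SET work_mem             = '64MB';    -- illustrative value
       SELECT pg_reload_conf();                           -- apply reloadable settings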
Set SERVER_ENCODING , LC_COLLATE and LC_CTYPE as :
server_encoding = UTF8
lc_collate = en_US.UTF-8
lc_ctype = en_US.UTF-8
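   To verify those settings on a running server (read-only checks; encoding and
   collation are fixed per database at creation time):
       SHOW server_encoding;
       SELECT datname,
              pg_encoding_to_char(encoding) AS encoding,
              datcollate, datctype
       FROM   pg_database;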
Tuning OS
(Shared Memory/Semaphores/...):
- Show all runtime parameters:
   º# SHOW ALL;º
PostgreSQL vs MS SQL Server comparison
(for data analytics)
PSQL: PostgreSQL's CSV support is very good with RFC4180 support
(which is the closest thing there is to an official CSV standard)
COPY TO
COPY FROM
  Helpful error messages on failure, with a fail-fast approach:
- Abort import on problem
(vs silently corrupt, misunderstand or alter data)
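  A COPY sketch (file path and table name are illustrative; server-side COPY needs
  appropriate file-access privileges, psql's \copy is the client-side variant):
      COPY my_table TO   '/tmp/my_table.csv' WITH (FORMAT csv, HEADER true);
      COPY my_table FROM '/tmp/my_table.csv' WITH (FORMAT csv, HEADER true);
      -- on a malformed row the whole COPY FROM aborts (fail-fast); nothing is silently altered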
PostgreSQL | MS SQL Server:
DROP TABLE IF EXISTS my_table; |
| IF OBJECT_ID (N'dbo.my_table', N'U') IS NOT NULL
| DROP TABLE dbo.my_table;
- PSQL supports DROP SCHEMA CASCADE:
º This is very, very important for a robust analytics delivery methodology,º
ºwhere tear-down-and-rebuild is the underlying principle of repeatable, º
ºauditable, collaborative analytics work. º
- drop schema and all the database objects inside it.
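  A tear-down-and-rebuild sketch of that workflow (the schema name is illustrative,
  and an existing 'orders' table is assumed for the rebuild step):
      DROP SCHEMA IF EXISTS analytics CASCADE;   -- removes the schema and every object inside it
      CREATE SCHEMA analytics;
      CREATE TABLE analytics.daily_sales AS      -- rebuild from scratch, repeatably
      SELECT order_date, SUM(amount) AS total
      FROM   orders
      GROUP  BY order_date;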
PostgreSQL | MS SQL Server
CREATE TABLE good_films AS | SELECT
SELECT | *
* | INTO
FROM | good_films
all_films | FROM
WHERE | all_films
imdb_rating ˃= 8; | WHERE
- In PostgreSQL, you can execute as many SQL statements as you like in one
batch; as long as you've ended each statement with a semicolon, you can
execute whatever combination of statements you like. For executing
automated batch processes or repeatable data builds or output tasks, this
is critically important functionality.
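  For example, a single batch sent to the server in one go ('staging_films' is a
  made-up name; 'all_films' is reused from the comparison above):
      DROP TABLE IF EXISTS staging_films;
      CREATE TABLE staging_films AS
          SELECT * FROM all_films WHERE imdb_rating >= 8;
      CREATE INDEX ON staging_films (imdb_rating);
      ANALYZE staging_films;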
- PostgreSQL supports the RETURNING clause, allowing UPDATE, INSERT and
DELETE statements to return values from affected rows. This is elegant and
useful. MS SQL Server has the OUTPUT clause, which requires a separate
table variable definition to function. This is clunky and inconvenient and
forces a programmer to create and maintain unnecessary boilerplate code.
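  A RETURNING sketch (table and columns are made up for the example):
      INSERT INTO accounts (owner) VALUES ('alice')
      RETURNING id, created_at;                   -- generated values back in one round trip

      UPDATE accounts SET balance = balance - 10
      WHERE  owner = 'alice'
      RETURNING balance;                          -- new balance of each affected row

      DELETE FROM accounts WHERE balance = 0
      RETURNING *;                                -- full copy of every deleted row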
- PostgreSQL supports $$ string quoting, like so:
SELECT $$Hello, World$$ AS greeting;
This is extremely useful for generating dynamic SQL because (a) it allows
the user to avoid tedious and unreliable manual quoting and escaping when
literal strings are nested and (b) since text editors and IDEs tend not to
  recognise $$ as a string delimiter, syntax highlighting remains functional
even in dynamic SQL code.
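  A dollar-quoting sketch (the 'accounts' table is assumed to exist); different tags
  keep nested levels apart, so no manual escaping is needed:
      SELECT $$O'Reilly$$ AS publisher;            -- inner single quote needs no escaping

      DO $outer$
      BEGIN
          EXECUTE $inner$
              COMMENT ON TABLE accounts IS 'rebuilt nightly'
          $inner$;
      END;
      $outer$;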
- PostgreSQL lets you use procedural languages simply by submitting code to
the database engine; you write procedural code in Python or Perl or R or
JavaScript or any of the other supported languages (see below) right next
to your SQL, in the same script. This is convenient, quick, maintainable,
easy to review, easy to reuse and so on.
- "Pure" declarative SQL is good at what it was designed for – relational
data manipulation and querying. You quickly reach its limits if you try to
use it for more involved analytical processes, such as complex interest
calculations, time series analysis and general algorithm design. SQL
database providers know this, so almost all SQL databases implement some
kind of procedural language. This allows a database user to write imperative
- style code for more complex or fiddly tasks.
- PostgreSQL's procedural language support is exceptional (see the sketch after this list):
- PL/PGSQL: this is PostgreSQL's native procedural language. It's like Oracle's
PL/SQL, but more modern and feature-complete.
- PL/V8: the V8 JavaScript engine from Google Chrome is available in PostgreSQL.
Even better, PL/V8 supports global (i.e. cross-function call) state,
allowing the user to selectively cache data in RAM for fast random access.
Suppose you need to use 100,000 rows of data from table A on each of 1,000,000
rows of data from table B. In traditional SQL, you either need to join
these tables (resulting in a 100bn row intermediate table, which will
kill any but the most immense server) or do something akin to a scalar
subquery (or, worse, cursor-based nested loops), resulting in crippling
I/O load if the query planner doesn't read your intentions properly.
º In PL/V8 you simply cache table A in memory and run a function on each º
ºof the rows of table B – in effect giving you RAM-quality access ( º
ºnegligible latency and random access penalty; no non-volatile I/O load) º
ºto the 100k-row table. I did this on a real piece of work recently – my º
ºPostgreSQL/PLV8 code was about 80 times faster than the MS T-SQL solution º
ºand the code was much smaller and more maintainable. Because it took about º
º23 seconds instead of half an hour to run, I was able to run 20 run-test-modifyº
ºcycles in an hour, resulting in feature-complete, properly tested, bug-free º
ºcode. º
(All those run-test-modify cycles were only possible because of DROP SCHEMA CASCADE
and freedom to execute CREATE FUNCTION statements in the middle of a statement
batch, as explained above. See how nicely it all fits together?)
- PL/Python: Fancy running a SVM from scikit-learn or some
arbitrary-precision arithmetic provided by gmpy2 in the middle of a SQL query?
No problem!
- PL/R
- C: doesn't quite belong in this list because you have to compile it
separately, but it's worth a mention. In PostgreSQL it is trivially easy
to create functions which execute compiled, optimised C (or C++ or assembler)
in the database backend.
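  A sketch to make the PL/PGSQL and PL/Python items above concrete. Function names
  and logic are made up; PL/Python needs the plpython3u extension (untrusted, so a
  superuser must create it), and 'all_films' is reused from the earlier example:
      CREATE OR REPLACE FUNCTION classify_rating(rating NUMERIC) RETURNS TEXT AS
      $$
      BEGIN
          IF    rating >= 8 THEN RETURN 'good';
          ELSIF rating >= 5 THEN RETURN 'ok';
          ELSE                   RETURN 'poor';
          END IF;
      END;
      $$ LANGUAGE plpgsql IMMUTABLE;

      CREATE EXTENSION IF NOT EXISTS plpython3u;
      CREATE OR REPLACE FUNCTION py_median(vals FLOAT[]) RETURNS FLOAT AS
      $$
          import statistics              # any available Python module can be used
          return statistics.median(vals)
      $$ LANGUAGE plpython3u;

      SELECT classify_rating(imdb_rating) FROM all_films;
      SELECT py_median(ARRAY[1.0, 7.5, 3.2]::FLOAT[]);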
- In PostgreSQL, custom aggregates are convenient and simple to use,
resulting in fast problem-solving and maintainable code:
CREATE FUNCTION interest_sfunc(state JSON, movement FLOAT, rate FLOAT, dt DATE) RETURNS JSON AS
$$
state.balance += movement; //payments into/withdrawals from account
  if (1 === dt.getUTCDate())  //compound interest on 1st of every month (getUTCDate() returns 1..31)
{
state.balance += state.accrual;
state.accrual = 0;
}
state.accrual += state.balance * rate;
return state;
$$ LANGUAGE plv8;
CREATE AGGREGATE interest(FLOAT, FLOAT, DATE)
(
SFUNC=interest_sfunc,
STYPE=JSON,
INITCOND='{"balance": 0, "accrual": 0}'
);
--assume accounts table has customer ID, date, interest rate and account movement for each day
CREATE TABLE cust_balances AS
SELECT
cust_id,
(interest(movement, rate, dt ORDER BY dt)-˃˃'balance')::FLOAT AS balance
FROM
accounts
GROUP BY
cust_id;
Elegant, eh? A custom aggregate is specified in terms of an internal state
and a way to modify that state when we push new values into the aggregate
function. In this case we start each customer off with zero balance and no
interest accrued, and on each day we accrue interest appropriately and
account for payments and withdrawals. We compound the interest on the 1st
of every month. Notice that the aggregate accepts an ORDER BY clause (since
, unlike SUM, MAX and MIN, this aggregate is order-dependent) and
PostgreSQL provides operators for extracting values from JSON objects. So,
in 28 lines of code we've created the framework for monthly compounding
interest on bank accounts and used it to calculate final balances. If
features are to be added to the methodology (e.g. interest rate
modifications depending on debit/credit balance, detection of exceptional
circumstances), it's all right there in the transition function and is
written in an appropriate language for implementing complex logic. (Tragic
side-note: I have seen large organisations spend tens of thousands of
pounds over weeks of work trying to achieve the same thing using poorer tools.)
- Date/Time
- PostgreSQL: you get DATE, TIME, TIMESTAMP and TIMESTAMP WITH TIME ZONE,
all of which do exactly what you would expect. They also have fantastic
range and precision, supporting microsecond resolution from the 5th
millennium BC to almost 300 millennia in the future. They accept input in a
wide variety of formats and the last one has full support for time zones
- They can be converted to and from Unix time, which is very important
for interoperability with other systems.
- They also support the INTERVAL type, which is so useful it has its own
section right after this one.
SELECT to_char('2001-02-03'::DATE, 'FMDay DD Mon YYYY'); ← "Saturday 03 Feb 2001"
SELECT to_timestamp('Saturday 03 Feb 2001', 'FMDay DD Mon YYYY'); ←TS 2001-02-03 00:00:00+00
- PostgreSQL: the INTERVAL type represents a period of time, such as "30
microseconds" or "50 years". It can also be negative, which may seem
counterintuitive until you remember that the word "ago" exists. PostgreSQL
also knows about "ago", in fact, and will accept strings like '1 day ago'
as interval values (this will be internally represented as an interval of -
1 days). Interval values let you do intuitive date arithmetic and store
time durations as first-class data values. They work exactly as you expect
  and can be freely cast and converted to and from anything which makes sense.
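  A few INTERVAL sketches:
      SELECT DATE '2001-02-03' + INTERVAL '1 month';                 -- date arithmetic
      SELECT now() - INTERVAL '90 minutes' AS deadline_was;
      SELECT INTERVAL '1 day ago';                                   -- stored as a negative interval
      SELECT age(TIMESTAMP '2001-02-03', TIMESTAMP '1999-12-31');    -- interval between two timestamps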
- PostgreSQL arrays are supported as a first-class data type
 - meaning fields in tables, variables in PL/PGSQL, parameters to functions
and so on can be arrays. Arrays can contain any data type you like,
including other arrays. This is very, very useful. Here are some of the
   things you can do with arrays (see the sketch after this list):
- Store the results of function calls with arbitrarily-many return values, such as regex matches
- Represent a string as integer word IDs, for use in fast text matching algorithms
- Aggregation of multiple data values across groups, for efficient cross-tabulation
- Perform row operations using multiple data values without the expense of a join
- Accurately and semantically represent array data from other applications in your tool stack
- Feed array data to other applications in your tool stack
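  A couple of array sketches (the 'orders' table and its columns are illustrative):
      SELECT ARRAY[1, 2, 3] || 4            AS appended,     -- append an element
             3 = ANY (ARRAY[1, 2, 3])       AS contains_3;   -- membership test

      SELECT cust_id,
             array_agg(DISTINCT product_id) AS products      -- aggregate values into an array
      FROM   orders GROUP BY cust_id;

      SELECT unnest(ARRAY['a', 'b', 'c'])   AS element;      -- array back to rows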
- PostgreSQL: full support for JSON, including a large set of utility functions for
transforming between JSON types and tables (in both directions)
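  A JSON/JSONB sketch:
      SELECT '{"name": "Ada", "langs": ["SQL", "JS"]}'::JSONB ->> 'name' AS name,     -- field as text
             jsonb_array_length('["SQL", "JS"]'::JSONB)                  AS n_langs;

      SELECT row_to_json(t) FROM (SELECT 1 AS id, 'Ada' AS name) AS t;   -- table row to JSON

      SELECT * FROM jsonb_each('{"a": 1, "b": 2}'::JSONB);               -- JSON to rows of key/value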
- PostgreSQL: HSTORE is a PostgreSQL extension which implements a fast key-value store as a data type.
Like arrays, this is very useful because virtually every high-level programming language has such
a concept (associative arrays, dicts, std::map ...)
There are also some fun unexpected uses of such a data type. A colleague
recently asked me if there was a good way to deduplicate a text array. Here's
what I came up with:
SELECT akeys(hstore(my_array, my_array)) FROM my_table;
i.e. put the array into both the keys and values of an HSTORE, forcing a
dedupe to take place (since key values are unique) then retrieve the keys
from the HSTORE. There's that PostgreSQL versatility again.
- PostgreSQL:range types.
Every database programmer has seen fields called start_date and end_date,
and most of them have had to implement logic to detect overlaps. Some have even found, the hard way,
that joins to ranges using BETWEEN can go horribly wrong, for a number of reasons.
PostgreSQL's approach is to treat time ranges as first-class data types. Not only can you put a
range of time (or INTs or NUMERICs or whatever) into a single data value, you can use a host of
built-in operators to manipulate and query ranges safely and quickly. You
can even apply specially-developed indices to them to massively accelerate
queries that use these operators.
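  A range-type sketch (the booking table is made up; btree_gist lets the plain INT
  column take part in the GiST exclusion constraint):
      CREATE EXTENSION IF NOT EXISTS btree_gist;

      CREATE TABLE IF NOT EXISTS booking (
          room   INT,
          during TSRANGE,
          EXCLUDE USING gist (room WITH =, during WITH &&)   -- reject overlapping bookings per room
      );

      INSERT INTO booking VALUES (1, '[2024-01-01 10:00, 2024-01-01 12:00)');

      SELECT '[2024-01-01 10:00, 2024-01-01 12:00)'::TSRANGE &&
             '[2024-01-01 11:00, 2024-01-01 13:00)'::TSRANGE AS overlaps;   -- true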
- PostgreSQL: NUMERIC (and DECIMAL - they're synonyms) is near-as-dammit arbitrary
precision: it supports 131,072 digits before the decimal point and 16,383 digits after the decimal point.
- PostgreSQL: XML/Xpath querying is supported
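  Quick sketches of the last two points:
      SELECT factorial(100);                                        -- exact NUMERIC result, no overflow
      SELECT xpath('/book/title/text()',
                   '<book><title>SQL Notes</title></book>'::XML);   -- array of matching XML nodes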
- PostgreSQL's logs, by default, are all in one place. By changing a couple of settings in a text file,
you can get it to log to CSV (and since we're talking about PostgreSQL, it's proper CSV, not broken CSV).
You can easily set the logging level anywhere from "don't bother logging
anything" to "full profiling and debugging output".
The documentation even contains DDL for a table into which the CSV-format
logs can be conveniently imported.
You can also log to stderr or the system log or to the Windows event log
(provided you're running PostgreSQL in Windows, of course).
The logs themselves are human-readable and machine-readable and contain data
likely to be of great value to a sysadmin.
Who logged in and out, at what times, and from where? Which queries are being
run and by whom?
How long are they taking? How many queries are submitted in each batch?
Because the data is well-formatted CSV,
it is trivially easy to visualise or analyse it in R or PostgreSQL itself or
Python's matplotlib or whatever you like.
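  A sketch of switching to CSV logging from SQL (requires superuser; the values are
  illustrative and logging_collector needs a server restart to take effect):
      ALTER SYSTEM SET logging_collector = on;
      ALTER SYSTEM SET log_destination   = 'csvlog';
      ALTER SYSTEM SET log_min_duration_statement = 250;   -- log statements slower than 250 ms
      SELECT pg_reload_conf();                             -- apply reloadable settings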
- PostgreSQL comes with a set of extensions called contrib modules. These are libraries of functions,
types and utilities for doing certain useful things which don't quite fall
into the core feature set of the server. There are libraries for fuzzy
string matching, fast integer array handling, external database connectivity,
cryptography, UUID generation, tree data types and loads, loads more. A few
of the modules don't even do anything except provide templates to allow
developers and advanced users to develop their own extensions and custom functionality.
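  Contrib modules are enabled per database with CREATE EXTENSION; a couple of sketches:
      CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
      SELECT levenshtein('kitten', 'sitting');      -- 3

      CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
      SELECT uuid_generate_v4();

      CREATE EXTENSION IF NOT EXISTS hstore;        -- used in the deduplication trick above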
Non classified
PostGraphile
- builds a powerful, extensible and performant GraphQL API from a
PostgreSQL schema in seconds; saving you weeks if not months of
development time.
- If you already use PostgreSQL then you understand the value that a
strongly typed and well defined schema can bring to application
development; GraphQL is the perfect match for this technology when it
comes to making your data layer accessible to your frontend
application developers (or even API clients). Why duplicate your
authorization and business logic in a custom API when you can
  leverage the tried and tested capabilities built into the world's most
advanced open source database?
- If you are new to GraphQL then we recommend you read through the
official introduction to GraphQL here before continuing through the
PostGraphile documentation.
- By combining powerful features such as PostgreSQL's role-based
grant system and row-level security policies with Graphile Engine's
advanced GraphQL look-ahead and plugin expansion technologies,
PostGraphile ensures your generated schema is secure, performant and
extensible.
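  The row-level security policies mentioned above are plain PostgreSQL; a minimal
  sketch (table, policy and role names are illustrative, and the 'app_user' role is
  assumed to exist):
      CREATE TABLE IF NOT EXISTS post (id SERIAL PRIMARY KEY, author TEXT, body TEXT);
      ALTER TABLE post ENABLE ROW LEVEL SECURITY;

      CREATE POLICY post_owner ON post
          USING (author = current_user);             -- each role only sees its own rows
      GRANT SELECT, UPDATE, DELETE ON post TO app_user;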
Some of the features we offer:
- Incredible performance - no N+1 query issues
- Extensibility via schema and server plugins
- Auto-discovered relations e.g. userByAuthorId
- Computed columns allowing easy expansion of your API
- Custom query procedures enabling arbitrary SQL queries
- Automatic CRUD mutations e.g. updatePost
- Custom mutation procedures enabling complex changes to be exposed simply
- Real-time features powered by LISTEN/NOTIFY and/or logical decoding
- The easiest way to get started is with the CLI interface;
$º$ npx postgraphile -c postgres://user:pass@localhost/mydb \ º
$º --watch --enhance-graphiql --dynamic-json º
(replacing user, pass and mydb with your PostgreSQL username,
password and the name of your database)
Liquibase
@[http://www.liquibase.org/]
Source control for the DDBB schema.
- Eliminate errors and delays when releasing databases.
- Deploys and Rollback changes for specific versions without needing
to know what has already been deployed.
- Deploy database and application changes together so they always
stay in sync.
- Supports code branching and merging
- Supports multiple developers
- Supports multiple database types
- Supports XML, YAML, JSON and SQL formats
- Supports context-dependent logic
- Cluster-safe database upgrades
- Generate Database change documentation
- Generate Database "diffs"
- Run through your build process, embedded in your application or on demand
- Automatically generate SQL scripts for DBA code review
- Does not require a live database connection
Java Maven Plugin:
@[https://docs.liquibase.com/tools-integrations/maven/home.html]
AWS Serverless PostgreSQL
@[https://aws.amazon.com/blogs/aws/amazon-aurora-postgresql-serverless-now-generally-available/]
- "serverless" relational database service (RDS) in AWS Aurora.
- Automatically starts, scales, and shuts down database capacity
- per-second billing for applications with less predictable usage patterns.
  - It's a *different* implementation of the standard versions of these open-source databases.
From the RDS console:
- select the Amazon Aurora database engine PostgreSQL
- set new DB cluster identifier, specification of credentials,
- set capacity:
- minimum and maximum capacity units for their database, in terms of Aurora Capacity Units (ACUs)
– a combination of processing and memory capacity. Besides defining the ACUs, users can also
determine when compute power should stop after a certain amount of idle time.
ºhow capacity settings will work once the database is availableº
- client apps transparently connect to a proxy fleet
that routes the workload to a pool of resources that
are automatically scaled.
- Scaling is very fast because resources are "warm" and
ready to be added to serve your requests.
- Minimum storage: 10GB, will automatically grow up to 64 TB
(based on the database usage) in 10GB increments
ºwith no impact to database performanceº
ºpricing modelsº
- On-Demand Instance Pricing: pay by hour, no long-term commitments
- Reserved Instance Pricing: steady-state database workloads
ºCostº
@[https://aws.amazon.com/rds/aurora/pricing/]
Reactive java Client
@[https://github.com/vietj/reactive-pg-client]
@[https://github.com/eclipse-vertx/vertx-sql-client/tree/3.8/vertx-pg-client]
High performance reactive PostgreSQL client written in Java
  (By Julien Viet, core developer of Vert.x and the CRaSH Java shell)
Patroni
https://github.com/zalando/patroni
Patroni: A Template for PostgreSQL HA with ZooKeeper, etcd or Consul
You can find a version of this documentation that is searchable and also easier
to navigate at patroni.readthedocs.io.
There are many ways to run high availability with PostgreSQL; for a list, see
the PostgreSQL Documentation.
Patroni is a template for you to create your own customized, high-availability
solution using Python and - for maximum accessibility - a distributed
configuration store like ZooKeeper, etcd, Consul or Kubernetes. Database
engineers, DBAs, DevOps engineers, and SREs who are looking to quickly deploy
HA PostgreSQL in the datacenter (or anywhere else) will hopefully find it useful.
We call Patroni a "template" because it is far from being a one-size-fits-all
or plug-and-play replication system. It will have its own caveats. Use wisely.
Note to Kubernetes users: Patroni can run natively on top of Kubernetes. Take a
look at the Kubernetes chapter of the Patroni documentation.
Error Reporting+Logging
@[https://www.postgresql.org/docs/current/static/runtime-config-logging.html]
Regex support
- fundamental in analytics work involving text processing tasks.
""" A data analytics tool without regex support is like a bicycle without a
saddle – you can still use it, but it's painful. """
- Great support in PostgreSQL
RºWARN:º Non-portable to other DDBBs.
- Exs:
> SELECT * FROM my_table
WHERE my_field ~ E'^([0-9])\\1+[aeiou]'; ← Get all lines starting with a repeated digit
followed by a vowel
> SELECT SUBSTRING(my_field FROM E'\\y[A-Fa-f0-9]+\\y') ← Get first isolated hex-string
FROM my_table; occurring in a field:
> SELECT REGEXP_SPLIT_TO_TABLE('The quick brown fox', E'\\s+'); ← split string based on regex
return each fragment in a row
│ column │
├────────┤
│ The │
│ quick │
│ brown │
│ fox │
> SELECT
REGEXP_MATCHES(my_string, E'\\y[a-z]{10,}\\y', 'gi') ← 'gi' flags:
FROM my_table; └──────┬───────┘ g: All matches (vs just first match)
│ i: Case insensitive
└───────────────────── word with 10+ letters
└─────────┬───────────┘
└────────────────── find all words (case─insensitive) in my_string
with at least 10 letters: