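PostgreSQL Extensions - Commonly Used Extensions - pg_dirtyread
pg_dirtyread is a special-purpose PostgreSQL extension that can read rows that have been deleted but not yet removed by VACUUM, which makes it a useful tool for data recovery.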
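How it works:
pg_dirtyread accesses the table's underlying pages directly and bypasses PostgreSQL's normal visibility checks:
- It reads the table's physical page data
- It ignores the xmax marker (the ID of the deleting transaction)
- It returns every row version, including deleted rows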
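The behaviour listed above can be observed by asking pg_dirtyread for system columns alongside the user columns. The sketch below reuses the yewu1.t3 table (id int, name varchar(20)) from the test section. It assumes pg_dirtyread 2.0 or later, where, according to the extension's README, system columns and a special boolean column named dead can be included in the column definition list.

```sql
-- Show each row version with the transaction IDs that created (xmin)
-- and deleted or locked (xmax) it.
SELECT ctid, xmin, xmax, dead, id, name
FROM pg_dirtyread('yewu1.t3')
     AS t(ctid tid, xmin xid, xmax xid, dead boolean,
          id int, name varchar(20))
-- Keep only versions that some transaction has marked. A nonzero xmax
-- can also come from a row lock or an aborted transaction, so treat it
-- as a hint, not proof of deletion.
WHERE xmax::text <> '0'
ORDER BY id;
```

The README notes that the dead column is not usable during recovery, for example on a standby server.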
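Time window limits:
- Only data that VACUUM has not yet cleaned up can be read
- Ordinary tables: dead rows typically survive for hours to days, depending on when autovacuum next processes the table
- Frequently updated tables: the retention time is shorter, because they reach the autovacuum thresholds sooner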
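To judge whether the window is still open for a given table, the statistics views can be consulted before attempting a recovery. This is a minimal sketch against the standard pg_stat_user_tables view; the schema and table names are the ones used in the tests later in this article, and the counters are estimates maintained by the statistics collector, not exact values.

```sql
-- n_dead_tup > 0 suggests dead row versions are still on disk.
-- A last_vacuum / last_autovacuum timestamp later than the accidental
-- DELETE suggests those rows have probably been removed already.
SELECT schemaname,
       relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
WHERE schemaname = 'yewu1'
  AND relname = 't3';
```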
Unsupported scenarios:
- Data removed by TRUNCATE
- Tables removed by DROP TABLE
- Tables on which VACUUM FULL has already run
1 Download, build, and install
1.1 Download
Download URL:
https://github.com/df7cb/pg_dirtyread/tags
1.2 Build and install
make
make install
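After make install, it can be worth confirming that the server actually sees the extension files before running CREATE EXTENSION. The query below is a small sketch using the standard pg_available_extensions view; an empty result usually means the files were installed for a different PostgreSQL installation than the one the server is running.

```sql
-- installed_version stays NULL until CREATE EXTENSION has been run
-- in the current database.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'pg_dirtyread';
```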
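1.3 Create the pg_dirtyread extension
-- Edit postgresql.conf (this is the setting used in this environment; pg_dirtyread only provides a SQL-callable function, so it should also work without being listed here)
shared_preload_libraries = 'pg_stat_kcache,pg_stat_statements,auto_explain,pg_dirtyread' # (change requires restart)
-- Create the extension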
white=# create extension pg_dirtyread;
CREATE EXTENSION
white=#
white=# select * from pg_EXTENSION;
  oid  |      extname       | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-------+--------------------+----------+--------------+----------------+------------+-----------+--------------
 14270 | plpgsql            |       10 |           11 | f              | 1.0        |           |
 17620 | pg_repack          |       10 |         2200 | f              | 1.5.0      |           |
 17659 | pg_stat_statements |       10 |         2200 | t              | 1.10       |           |
 17739 | pgstattuple        |       10 |         2200 | t              | 1.5        |           |
 17840 | pg_bulkload        |       10 |         2200 | f              | 3.1.21     |           |
 17861 | pg_dirtyread       |       10 |         2200 | t              | 2          |           |
(6 rows)
2 Tests
2.1 Test 1: delete first, then disable autovacuum on the table (recovery fails)
white=# select count(*) from yewu1.t3;
 count
-------
   100
(1 row)

white=#
white=# delete from yewu1.t3 where id >10;
DELETE 90
white=#
white=# select count(*) from yewu1.t3;
 count
-------
    10
(1 row)

white=# ALTER TABLE yewu1.t3 SET (
white(# autovacuum_enabled = false, toast.autovacuum_enabled = false
white(# );
ALTER TABLE
white=#
white=# SELECT * FROM pg_dirtyread('yewu1.t3') as t(id int, name varchar(20));
 id |  name
----+---------
  1 | haha_1
  2 | haha_2
  3 | haha_3
  4 | haha_4
  5 | haha_5
  6 | haha_6
  7 | haha_7
  8 | haha_8
  9 | haha_9
 10 | haha_10
(10 rows)

white=#
Only the 10 live rows come back. Presumably autovacuum had already cleaned up the 90 dead rows before autovacuum was disabled on the table, so there was nothing left for pg_dirtyread to read.
2.2 Test 2: disable autovacuum on the table first, then delete (recovery succeeds)
white=# select count(*) from yewu1.t3;
 count
-------
   100
(1 row)

white=# ALTER TABLE yewu1.t3 SET (
white(# autovacuum_enabled = false, toast.autovacuum_enabled = false
white(# );
ALTER TABLE
white=#
white=# delete from yewu1.t3 where id > 10;
DELETE 90
white=#
white=# select count(*) from yewu1.t3;
 count
-------
    10
(1 row)

white=#
white=# SELECT * FROM pg_dirtyread('yewu1.t3') as t(id int, name varchar(20));
 id  |   name
-----+----------
   1 | haha_1
   2 | haha_2
   3 | haha_3
   4 | haha_4
   5 | haha_5
... (rows omitted) ...
  96 | haha_96
  97 | haha_97
  98 | haha_98
  99 | haha_99
 100 | haha_100
(100 rows)

white=#
white=#
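Test 2 shows that the deleted rows are readable, but reading them does not put them back. One possible way to restore them is sketched below. It assumes that id uniquely identifies a row in yewu1.t3 and that the missing rows were deleted, not updated (an updated row leaves several versions with the same id). The side table name t3_recovered is hypothetical.

```sql
BEGIN;

-- Copy every row version pg_dirtyread can still see into a side table,
-- so the recovered data no longer depends on VACUUM staying away.
CREATE TABLE yewu1.t3_recovered AS
SELECT d.id, d.name
FROM pg_dirtyread('yewu1.t3') AS d(id int, name varchar(20));

-- Re-insert only the rows that are missing from the live table.
INSERT INTO yewu1.t3 (id, name)
SELECT r.id, r.name
FROM yewu1.t3_recovered AS r
WHERE NOT EXISTS (
    SELECT 1 FROM yewu1.t3 AS live WHERE live.id = r.id
);

-- Inspect the result before making it permanent.
SELECT count(*) FROM yewu1.t3;

COMMIT;

-- Once the data is back, return the table to its default autovacuum
-- behaviour by dropping the per-table overrides set in the test.
ALTER TABLE yewu1.t3 RESET (autovacuum_enabled, toast.autovacuum_enabled);
```

Copying into a side table first keeps a snapshot of what was recovered, which can be checked at leisure even if the re-insert step is rolled back.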
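Reviewing the default autovacuum configuration
With the default configuration, a fairly small change to a table is enough to trigger autovacuum: a table is vacuumed once its dead rows exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor × row count (50 + 0.2 × 100 = 70 for the 100-row test table), and the launcher checks every autovacuum_naptime (1min). This narrows the window in which pg_dirtyread is useful.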
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#log_autovacuum_min_duration = 10min # log autovacuum activity;
#autovacuum = on # Enable autovacuum subprocess? 'on'
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
#autovacuum_naptime = 1min # time between autovacuum runs
#autovacuum_vacuum_threshold = 50 # min number of row updates before
#autovacuum_vacuum_insert_threshold = 1000 # min number of row inserts
#autovacuum_analyze_threshold = 50 # min number of row updates before
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
#autovacuum_vacuum_insert_scale_factor = 0.2 # fraction of inserts over table
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
#autovacuum_multixact_freeze_max_age = 400000000 # maximum multixact age
#autovacuum_vacuum_cost_delay = 2ms # default vacuum cost delay for
                                    # autovacuum, in milliseconds;
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
                                   # autovacuum, -1 means use
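To see how close a particular table is to being vacuumed under these settings, the trigger point can be estimated from the catalog. The query below is a rough sketch: it applies the documented formula (threshold + scale factor × reltuples) using the server-wide settings and assumes the table has no per-table autovacuum overrides in its storage parameters. reltuples is only the planner's row-count estimate.

```sql
SELECT c.relname,
       c.reltuples,
       s.n_dead_tup,
       -- Dead rows needed before autovacuum picks the table up.
       -- reltuples is -1 on newer versions if the table was never
       -- analyzed, so clamp it at 0.
       current_setting('autovacuum_vacuum_threshold')::int
         + current_setting('autovacuum_vacuum_scale_factor')::float8
           * GREATEST(c.reltuples, 0) AS dead_rows_to_trigger
FROM pg_class AS c
JOIN pg_stat_user_tables AS s ON s.relid = c.oid
WHERE c.oid = 'yewu1.t3'::regclass;
```

For the 100-row test table with default settings this gives roughly 70, so deleting 90 rows is enough for autovacuum to process the table at its next pass.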
Remember: stay humble, and know where to stop.