[Posted]: 2024-01-09 22:58:01
[Question]:
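I'm hoping someone can help me with a WAL shipping and warm standby problem. My standby system ran happily for weeks, then suddenly started looking for .history files that don't exist. Then it crashed, and I couldn't restart it successfully without rebuilding the standby server.
Both systems run CentOS 4.5 and postgres 8.4.1. They use NFS to store the WAL files from production on the standby server.
Log chunks, annotated with my comments: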
[** Recovery is running normally **]
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 00000001000000830000005B
WAL file path : /var/tafkan_backup_from_db1/00000001000000830000005B
Restoring to : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : cp "/var/tafkan_backup_from_db1/00000001000000830000005B" "pg_xlog/RECOVERYXLOG"
Keep archive history : 00000001000000830000004D and later
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
running restore : OK
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 00000001000000830000005B
WAL file path : /var/tafkan_backup_from_db1/00000001000000830000005B
Restoring to : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : cp "/var/tafkan_backup_from_db1/00000001000000830000005B" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000000000000000000000 and later
running restore : OK
[** All of a sudden it starts looking for .history files **]
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 00000002.history
WAL file path : /var/tafkan_backup_from_db1/00000002.history
Restoring to : pg_xlog/RECOVERYHISTORY
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : cp "/var/tafkan_backup_from_db1/00000002.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore :cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
not restored
history file not found
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 00000001.history
WAL file path : /var/tafkan_backup_from_db1/00000001.history
Restoring to : pg_xlog/RECOVERYHISTORY
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : cp "/var/tafkan_backup_from_db1/00000001.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore :cp: cannot stat `/var/tafkan_backup_from_db1/00000001.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000001.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000001.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000001.history': No such file or directory
not restored
history file not found
[** I stopped Postgres, renamed recovery.done to recovery.conf, and restarted it. **]
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 00000002.history
WAL file path : /var/tafkan_backup_from_db1/00000002.history
Restoring to : pg_xlog/RECOVERYHISTORY
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : cp "/var/tafkan_backup_from_db1/00000002.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore :cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
cp: cannot stat `/var/tafkan_backup_from_db1/00000002.history': No such file or directory
not restored
history file not found
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 0000000200000083000000A2
WAL file path : /var/tafkan_backup_from_db1/0000000200000083000000A2
Restoring to : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : cp "/var/tafkan_backup_from_db1/0000000200000083000000A2" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000000000000000000000 and later
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
[** This file is not present. All WAL files start with 00000001. **]
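For readers comparing the file names in these logs: the first 8 hex digits of a WAL segment name are the timeline ID, and a `NNNNNNNN.history` file belongs to the timeline with that ID. The sketch below is only an illustration of that naming scheme; the function name `describe_wal_name` and the dictionary layout are made up for this example and are not part of PostgreSQL or pg_standby.

```python
import re

# WAL segment names are 24 hex digits: timeline ID, log ID, segment number.
WAL_SEGMENT = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")
# Timeline history files are named after the timeline ID alone.
HISTORY = re.compile(r"^([0-9A-F]{8})\.history$")

def describe_wal_name(name):
    """Hypothetical helper: split a WAL-related file name into its parts."""
    m = WAL_SEGMENT.match(name)
    if m:
        timeline, log, seg = (int(g, 16) for g in m.groups())
        return {"kind": "segment", "timeline": timeline, "log": log, "segment": seg}
    m = HISTORY.match(name)
    if m:
        return {"kind": "history", "timeline": int(m.group(1), 16)}
    return None  # not a name this sketch recognizes

# File names taken from the logs above.
for name in ("00000001000000830000005B",
             "0000000200000083000000A2",
             "00000002.history"):
    print(name, describe_wal_name(name))
```

Read this way, the standby is waiting for a segment on timeline 2, while everything in the archive directory is on timeline 1.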
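Any ideas? I don't even know what a .history file is, and the (mostly excellent) documentation isn't very clear about it.
P.S. I wish I were running virtual machines, so that I could use link text and not have to worry about this application-level HA nonsense :-)
Update: Here are some logs from the standby server from around that time. It looks like something made the server stop recovering and come online, but I don't know what. I'm pretty sure nothing could have created the trigger file.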
2010-01-20 03:30:15 EST 4b3a5c63.401b LOG: restored log file "00000001000000830000005A" from archive
2010-01-20 03:30:23 EST 4b3a5c63.401b LOG: restored log file "00000001000000830000005B" from archive
2010-01-20 03:30:23 EST 4b3a5c63.401b LOG: record with zero length at 83/5BFA2FF8
2010-01-20 03:30:23 EST 4b3a5c63.401b LOG: redo done at 83/5BFA2FAC
2010-01-20 03:30:23 EST 4b3a5c63.401b LOG: last completed transaction was at log time 2010-01-20 03:28:04.594399-05
2010-01-20 03:30:25 EST 4b3a5c63.401b LOG: restored log file "00000001000000830000005B" from archive
2010-01-20 03:30:37 EST 4b3a5c63.401b LOG: selected new timeline ID: 2
2010-01-20 03:30:49 EST 4b3a5c63.401b LOG: archive recovery complete
2010-01-20 03:30:59 EST 4b3a5c62.4019 LOG: database system is ready to accept connections
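As a rough aid for spotting this kind of event in a long server log, the sketch below scans log lines for the "selected new timeline ID" message seen above. It assumes the message text matches the 8.4 log lines quoted here; the function name `find_timeline_switches` is hypothetical and chosen only for this example.

```python
import re

# Matches the message the standby logged when it left recovery.
TIMELINE_SWITCH = re.compile(r"selected new timeline ID: (\d+)")

def find_timeline_switches(lines):
    """Yield (line_number, new_timeline_id) for each timeline switch found."""
    for lineno, line in enumerate(lines, start=1):
        m = TIMELINE_SWITCH.search(line)
        if m:
            yield lineno, int(m.group(1))

# Sample lines copied from the standby log above.
sample = [
    '2010-01-20 03:30:25 EST 4b3a5c63.401b LOG: restored log file "00000001000000830000005B" from archive',
    "2010-01-20 03:30:37 EST 4b3a5c63.401b LOG: selected new timeline ID: 2",
    "2010-01-20 03:30:49 EST 4b3a5c63.401b LOG: archive recovery complete",
]

for lineno, timeline in find_timeline_switches(sample):
    print(f"line {lineno}: switched to timeline {timeline}")
```

To run it against a real file, pass an open file object instead of `sample`; the log path depends on the installation.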
[Discussion]:
-
Hi sbleon, I just want to back up WAL files to a standby location; I don't need a hot standby. Can you help?
-
@indyaah, check out the excellent PostgreSQL docs for your version.
-
Thanks for the help, friend!! :D
Tags: postgresql high-availability log-shipping