Preface
The following notes are based on Redis 6.2.6.
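1. Why Redis needs persistence
Redis is an in-memory database: it keeps its dataset in memory, whereas traditional relational databases such as MySQL and Oracle store their data on disk. Because memory is far faster than disk, reads and writes against an in-memory database are much faster. The drawback is that everything held in memory is lost on a power failure or crash. In addition, Redis sometimes has to be restarted, and restoring the previous state requires that state to have been persisted before the restart.
To address this, Redis can persist in-memory data to disk and rebuild the dataset from the persistence files. Redis supports two forms of persistence: RDB snapshots (snapshotting) and AOF (append-only file). Since Redis 4.0 it also supports hybrid persistence, which combines RDB and AOF.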
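2. Redis persistence methods
2.1. AOF persistence (Append Only File)
AOF appends sequentially to the end of a file, which is the fastest and most disk-friendly write pattern. The AOF file stores data in the Redis command protocol format. Redis restores data by replaying the AOF file, that is, by re-executing the commands recorded in it.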
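To make "command protocol format" concrete, here is a minimal sketch that encodes one command as a RESP array of bulk strings, which is the shape of the entries found in an AOF file. The helper name resp_encode is hypothetical and not part of Redis; it uses strlen, so unlike Redis it is not binary-safe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encode argv[0..argc-1] as a RESP array of bulk strings:
 *   *<argc>\r\n  then, per argument,  $<len>\r\n<bytes>\r\n
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the output buffer is too small.
 * Hypothetical helper for illustration; arguments must be NUL-terminated. */
int resp_encode(int argc, const char **argv, char *buf, size_t buflen) {
    assert(argc > 0 && argv != NULL && buf != NULL);
    size_t off = 0;
    int n = snprintf(buf, buflen, "*%d\r\n", argc);
    if (n < 0 || (size_t)n >= buflen) return -1;
    off += (size_t)n;
    for (int i = 0; i < argc; i++) {
        n = snprintf(buf + off, buflen - off, "$%zu\r\n%s\r\n",
                     strlen(argv[i]), argv[i]);
        if (n < 0 || (size_t)n >= buflen - off) return -1;
        off += (size_t)n;
    }
    return (int)off;
}
```

Calling resp_encode with the three arguments SET, key, v1 produces the text `*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$2\r\nv1\r\n`; replaying an AOF amounts to parsing such entries and executing them in order.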
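2.1.1 The fsync system call
fsync is a system call. write() only copies data into the kernel buffer (the page cache), and the kernel flushes it to disk later according to its own policy. To force the flush yourself, call fsync once after write: it pushes the file's data from the kernel buffer to disk.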
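The write-then-fsync sequence can be sketched as follows. This is an illustration under simple assumptions (POSIX system, a hypothetical helper named append_and_sync), not how Redis structures its own AOF writes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Append len bytes to path, then force them to stable storage.
 * Returns 0 on success, -1 on error. Hypothetical helper name. */
int append_and_sync(const char *path, const void *data, size_t len) {
    assert(path != NULL && data != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1) return -1;

    const char *p = data;
    size_t left = len;
    while (left > 0) {
        /* write() only copies the bytes into the kernel page cache. */
        ssize_t n = write(fd, p, left);
        if (n == -1) {
            if (errno == EINTR) continue;   /* interrupted: retry */
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* fsync() blocks until the kernel has flushed the file to disk. */
    if (fsync(fd) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Without the fsync call the function would return as soon as the data reached the page cache, and a power failure shortly afterwards could still lose it.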
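2.1.2 AOF persistence policies
- always: fsync runs in the main thread after every write operation (insert, update, delete); the data is safest, but throughput is lowest
- everysec: fsync runs about once per second in a background thread (bio_fsync_aof); 1~2 seconds of data may be lost
- no: the operating system decides when to flush; the timing is not under Redis's control
Drawback:
Every command that modifies the database (insert, update, delete) is recorded in the AOF file, so the data is redundant. As Redis keeps running, the AOF file grows very large and recovery slows down. For example, set key v1, set key v2, del key and set key v3 are all recorded, yet the final state is simply key == v3 and the other three commands are redundant. In other words, only the final state needs to be kept.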
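The redundancy in the example above can be shown with a small sketch that replays a command log and keeps only what is needed to rebuild the final state, one SET per surviving key. The types and the function compact_log are hypothetical and purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum op { OP_SET, OP_DEL };
struct cmd { enum op op; const char *key; const char *val; };

/* Replay log[0..n-1] and write the minimal equivalent log into out:
 * one SET per key that is still alive at the end.
 * Returns the number of commands written; out must hold cap entries. */
size_t compact_log(const struct cmd *log, size_t n,
                   struct cmd *out, size_t cap) {
    size_t live = 0;
    for (size_t i = 0; i < n; i++) {
        /* Linear search for the key among the live entries. */
        size_t j = 0;
        while (j < live && strcmp(out[j].key, log[i].key) != 0) j++;

        if (log[i].op == OP_SET) {
            if (j == live) {            /* new key */
                assert(live < cap);
                live++;
            }
            out[j] = log[i];            /* latest value wins */
        } else if (j < live) {          /* DEL of an existing key */
            out[j] = out[live - 1];
            live--;
        }
    }
    return live;
}
```

For the four commands in the text, the result is a single entry, SET key v3. The sketch replays the old log only to make the redundancy visible; it does not model how Redis itself produces a compact AOF.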
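2.1.3 aof_rewrite
To deal with oversized AOF files, Redis provides aof_rewrite. How it works: Redis forks, and the child process generates command-protocol data from the current in-memory dataset, so only the latest state is written to the new AOF file. This removes the redundant history of each key and speeds up recovery.
While the rewrite is in progress, the main process keeps serving clients. It also records incoming write commands in the rewrite buffer; when the child finishes, it notifies the main process, which then appends the contents of the rewrite buffer to the new AOF file.
Although the AOF file is smaller after a rewrite, recovery still works by replaying commands, which consumes CPU and is comparatively slow.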
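2.2 RDB snapshots (Redis's default persistence method)
RDB writes a snapshot of the current in-memory dataset to an RDB file on disk, that is, the binary data of every key-value pair in the database. Recovery reads the snapshot file straight back into memory. Background snapshots are also produced by a forked child process. Redis has no dedicated command for loading an RDB file; at startup the server loads the RDB file automatically if it finds one (when AOF is enabled, the AOF file is loaded instead).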
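Trigger methods (automatic and manual)
(1) Automatic triggers
In redis.conf, the options in the SNAPSHOTTING section configure the automatic trigger conditions.
save: configures when RDB persistence is triggered. save m n means that bgsave (background persistence) is triggered automatically once at least n changes have been made and more than m seconds have passed since the last save.
save "" disables snapshotting;
save 900 1: save if at least 1 key changed within 900 seconds;
save 300 10: save if at least 10 keys changed within 300 seconds;
save 60 10000: save if at least 10000 keys changed within 60 seconds.
If you only use Redis as a cache and do not need persistence, set save "". Note that in Redis 6.2 commenting out every save line is not enough, because the built-in default save rules then remain in effect.
stop-writes-on-bgsave-error: default yes. If RDB snapshotting is enabled and the most recent background save failed, Redis refuses write operations so that the user notices the persistence failure; otherwise the new writes could be lost without anyone noticing.
rdbcompression: whether the RDB file is compressed, enabled by default. With very large datasets, compression saves disk space at the cost of extra CPU. It can be turned off to save CPU, but leaving it on is recommended.
rdbchecksum: file checksum, enabled by default. Since version 5 of the RDB format, a CRC64 checksum is placed at the end of the file to protect its integrity. It costs around 10% performance when saving and loading; disable it if you need maximum performance.
dbfilename: name of the RDB file, default dump.rdb
rdb-del-sync-files: a full master-replica synchronization transfers an RDB file. This option controls whether, on instances without persistence enabled, the RDB files used for replication are deleted after the synchronization completes. Default no.
dir: directory where the RDB and AOF files are stored; defaults to the current directory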
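The save m n rule described above boils down to a simple check that runs periodically. The sketch below is loosely modelled on that check; the struct and function names are hypothetical, and details of the real server (such as the retry delay after a failed background save) are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One "save <seconds> <changes>" rule. Hypothetical type name. */
struct save_param {
    time_t seconds;
    long long changes;
};

/* Return 1 if any rule says a background save is due, else 0.
 * dirty    = number of changes since the last successful save
 * now      = current time, lastsave = time of the last successful save */
int should_bgsave(const struct save_param *params, size_t n,
                  long long dirty, time_t now, time_t lastsave) {
    assert(params != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (dirty >= params[i].changes &&
            now - lastsave > params[i].seconds) {
            return 1;
        }
    }
    return 0;
}
```

With the three rules 900 1, 300 10 and 60 10000, ten changes trigger a save only once more than 300 seconds have passed since the last one, while a single change has to wait more than 900 seconds. An empty rule list (the effect of save "") never triggers.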
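(2) Manual triggers
Two commands trigger RDB persistence manually:
1) save: blocks the Redis main process. While the save runs, Redis cannot process any other command, so the service is unavailable until the RDB file is complete, which can bring the business to a halt. It is rarely used.
2) bgsave: the main process forks a child that performs the RDB persistence. Blocking happens only during the fork, and big keys make the fork take longer.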
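2.3 Hybrid RDB and AOF
Hybrid persistence builds on the aof_rewrite mechanism: during a rewrite, the child first writes the dataset in RDB format, and once that part is complete, the contents of the rewrite buffer are appended to the end of the same file in the AOF command format. The resulting file is an RDB preamble followed by an AOF tail (controlled by the aof-use-rdb-preamble option). This is the rdb_aof hybrid.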
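Since an RDB payload always starts with the five magic bytes REDIS, a loader can tell a hybrid file from a plain AOF by looking at the first bytes. The following is a minimal sketch of that idea; has_rdb_preamble is a hypothetical helper, not a Redis function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the buffer starts with the RDB magic "REDIS", i.e. the
 * file begins with an RDB preamble; return 0 for a plain AOF, which
 * starts with a RESP array marker such as "*2\r\n". */
int has_rdb_preamble(const unsigned char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    return len >= 5 && memcmp(buf, "REDIS", 5) == 0;
}
```

A loader would read the preamble with the RDB code path and then continue replaying the rest of the file as ordinary AOF commands.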
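2.4 Comparing the persistence methods
- AOF. Advantages: reliable, little data is lost. Disadvantages: the AOF file is large and recovery is slow.
- RDB. Advantages: the RDB file is small and recovery is fast. Disadvantages: it cannot persist in real time or at one-second granularity, so everything written after the last snapshot is lost. Every bgsave needs a fork. The main process and the child share the same memory; while the main process keeps handling client commands, copy-on-write applies: only the pages that are modified get copied, and the page table is updated to point to the copy. Those copies inflate memory usage, and how much depends on the proportion of memory the main process modifies. Note that the child only reads the data and never modifies it.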
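3. What big keys are and how they affect persistence
3.1 What is a big key
In Redis's key-value model, a big key is a key whose value occupies a large amount of space. For example, when the value is a hash or zset holding a very large number of elements, the corresponding key is a big key.
3.2 How fork copy-on-write works
The Redis main process calls fork() to create a child process. When fork() returns, the child's state is identical to the main process's, including the mm_struct and the page tables. At this point the pages in both page tables are marked private copy-on-write (read-only). When either process tries to write to a page, a write-protection fault is raised; the kernel allocates a new page for that process, copies the contents into it, and points the process's page table entry at the new page.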
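The effect of copy-on-write can be observed from user space: after fork(), a write in the parent does not change what the child sees. The sketch below assumes a POSIX system; the function name and the pipe used to order the two processes are illustrative choices, and the value is passed back through the exit status, so it must fit in 0..254.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, overwrite a variable in the parent, and return the value the
 * child still observes afterwards. Returns -1 on error. */
int snapshot_seen_by_child(int initial, int updated) {
    assert(initial >= 0 && initial <= 254);
    volatile int value = initial;
    int gate[2];
    if (pipe(gate) == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(gate[0]);
        close(gate[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wait until the parent has modified its copy. */
        char c;
        close(gate[1]);
        if (read(gate[0], &c, 1) != 1) _exit(255);
        _exit(value);               /* still the value at fork time */
    }

    /* Parent: this write touches a copy-on-write page, so the kernel
     * gives the parent a private copy and the child's page is untouched. */
    close(gate[0]);
    value = updated;
    int ok = (write(gate[1], "x", 1) == 1);
    close(gate[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !ok) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 255) return -1;
    return WEXITSTATUS(status);
}
```

snapshot_seen_by_child(42, 99) returns 42: the child works on the state as of the fork, which is the property a snapshot child relies on while the parent keeps accepting writes.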
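3.3 Interview question: how do big keys affect persistence
Answer separately for each persistence method. The two effects to name are heavier fsync pressure and longer fork time.
For AOF: consider the always, everysec and no policies (fsync pressure) and aof_rewrite (fork time)
For RDB and the rdb_aof hybrid: fork time
fork runs in the main process, so a slow fork delays the main process's responses.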
4. Persistence source code analysis
4.1 RDB persistence
4.1.1 Creating the RDB file
Redis creates RDB files with the rdbSave function; SAVE and BGSAVE call rdbSave in different ways.
// src/rdb.c
/* Save the DB on disk. Return C_ERR on error, C_OK on success. */
int rdbSave(char *filename, rdbSaveInfo *rsi) {
    char tmpfile[256];
    char cwd[MAXPATHLEN]; /* Current working dir path for error messages. */
    FILE *fp = NULL;
    rio rdb;
    int error = 0;

    snprintf(tmpfile,256,"temp-%d.rdb", (int) getpid());
    fp = fopen(tmpfile,"w");
    if (!fp) {
        char *cwdp = getcwd(cwd,MAXPATHLEN);
        serverLog(LL_WARNING,
            "Failed opening the RDB file %s (in server root dir %s) "
            "for saving: %s",
            filename,
            cwdp ? cwdp : "unknown",
            strerror(errno));
        return C_ERR;
    }

    rioInitWithFile(&rdb,fp);
    startSaving(RDBFLAGS_NONE);

    if (server.rdb_save_incremental_fsync)
        rioSetAutoSync(&rdb,REDIS_AUTOSYNC_BYTES);

    if (rdbSaveRio(&rdb,&error,RDBFLAGS_NONE,rsi) == C_ERR) {
        errno = error;
        goto werr;
    }

    /* Make sure data will not remain on the OS's output buffers */
    if (fflush(fp)) goto werr;
    if (fsync(fileno(fp))) goto werr;
    if (fclose(fp)) { fp = NULL; goto werr; }
    fp = NULL;

    /* Use RENAME to make sure the DB file is changed atomically only
     * if the generate DB file is ok. */
    if (rename(tmpfile,filename) == -1) {
        char *cwdp = getcwd(cwd,MAXPATHLEN);
        serverLog(LL_WARNING,
            "Error moving temp DB file %s on the final "
            "destination %s (in server root dir %s): %s",
            tmpfile,
            filename,
            cwdp ? cwdp : "unknown",
            strerror(errno));
        unlink(tmpfile);
        stopSaving(0);
        return C_ERR;
    }

    serverLog(LL_NOTICE,"DB saved on disk");
    server.dirty = 0;
    server.lastsave = time(NULL);
    server.lastbgsave_status = C_OK;
    stopSaving(1);
    return C_OK;

werr:
    serverLog(LL_WARNING,"Write error saving DB on disk: %s", strerror(errno));
    if (fp) fclose(fp);
    unlink(tmpfile);
    stopSaving(0);
    return C_ERR;
}
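The pattern rdbSave relies on (write to a temporary file, flush and fsync it, then rename it over the target) can be isolated in a few lines. The sketch below is a standalone illustration for a POSIX system, with a hypothetical function name and temporary-file naming scheme; it is not the Redis implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Replace the file at path with data, atomically from a reader's point
 * of view: readers see either the old file or the complete new one.
 * Returns 0 on success, -1 on error. */
int atomic_write_file(const char *path, const void *data, size_t len) {
    assert(path != NULL && data != NULL);
    char tmp[256];
    int n = snprintf(tmp, sizeof(tmp), "%s.tmp-%d", path, (int)getpid());
    if (n < 0 || (size_t)n >= sizeof(tmp)) return -1;

    FILE *fp = fopen(tmp, "w");
    if (!fp) return -1;

    if (fwrite(data, 1, len, fp) != len) goto werr;
    if (fflush(fp)) goto werr;            /* stdio buffer -> kernel */
    if (fsync(fileno(fp))) goto werr;     /* kernel -> disk */
    if (fclose(fp)) { fp = NULL; goto werr; }
    fp = NULL;

    /* rename() swaps the directory entry in one step. */
    if (rename(tmp, path) == -1) goto werr;
    return 0;

werr:
    if (fp) fclose(fp);
    unlink(tmp);
    return -1;
}
```

If the process dies halfway through, only the temporary file is affected and the previous file at path stays intact, which is why a crash during a save does not corrupt the existing dump.rdb.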
The SAVE command runs in the Redis main thread, so a long save hurts Redis's performance.
void saveCommand(client *c) {
    // A child process is already performing RDB persistence
    if (server.child_type == CHILD_TYPE_RDB) {
        addReplyError(c,"Background save already in progress");
        return;
    }
    rdbSaveInfo rsi, *rsiptr;
    rsiptr = rdbPopulateSaveInfo(&rsi);
    // Persist synchronously in the calling (main) thread
    if (rdbSave(server.rdb_filename,rsiptr) == C_OK) {
        addReply(c,shared.ok);
    } else {
        addReplyErrorObject(c,shared.err);
    }
}
The BGSAVE command is implemented by the rdbSaveBackground function, where rdbSave is called in the child process. While a BGSAVE is running, a SAVE command sent by a client is rejected: SAVE and BGSAVE may not run at the same time, which prevents the main process and the child from executing rdbSave concurrently and racing. For the same reason, two BGSAVEs cannot run at the same time.
/* BGSAVE [SCHEDULE] */
void bgsaveCommand(client *c) {
    int schedule = 0;

    /* The SCHEDULE option changes the behavior of BGSAVE when an AOF rewrite
     * is in progress. Instead of returning an error a BGSAVE gets scheduled. */
    if (c->argc > 1) {
        if (c->argc == 2 && !strcasecmp(c->argv[1]->ptr,"schedule")) {
            schedule = 1;
        } else {
            addReplyErrorObject(c,shared.syntaxerr);
            return;
        }
    }

    rdbSaveInfo rsi, *rsiptr;
    rsiptr = rdbPopulateSaveInfo(&rsi);

    if (server.child_type == CHILD_TYPE_RDB) {
        addReplyError(c,"Background save already in progress");
    } else if (hasActiveChildProcess()) {
        if (schedule) {
            server.rdb_bgsave_scheduled = 1;
            addReplyStatus(c,"Background saving scheduled");
        } else {
            addReplyError(c,
                "Another child process is active (AOF?): can't BGSAVE right now. "
                "Use BGSAVE SCHEDULE in order to schedule a BGSAVE whenever "
                "possible.");
        }
    } else if (rdbSaveBackground(server.rdb_filename,rsiptr) == C_OK) {
        addReplyStatus(c,"Background saving started");
    } else {
        addReplyErrorObject(c,shared.err);
    }
}

int rdbSaveBackground(char *filename, rdbSaveInfo *rsi) {
    pid_t childpid;

    if (hasActiveChildProcess()) return C_ERR;

    server.dirty_before_bgsave = server.dirty;
    server.lastbgsave_try = time(NULL);

    // redisFork returns 0 in the child process
    if ((childpid = redisFork(CHILD_TYPE_RDB)) == 0) {
        int retval;

        /* Child */
        redisSetProcTitle("redis-rdb-bgsave");
        redisSetCpuAffinity(server.bgsave_cpulist);
        retval = rdbSave(filename,rsi);
        if (retval == C_OK) {
            sendChildCowInfo(CHILD_INFO_TYPE_RDB_COW_SIZE, "RDB");
        }
        exitFromChild((retval == C_OK) ? 0 : 1);
    } else {
        /* Parent */
        if (childpid == -1) {
            server.lastbgsave_status = C_ERR;
            serverLog(LL_WARNING,"Can't save in background: fork: %s",
                strerror(errno));
            return C_ERR;
        }
        serverLog(LL_NOTICE,"Background saving started by pid %ld",(long) childpid);
        server.rdb_save_time_start = time(NULL);
        server.rdb_child_type = RDB_CHILD_TYPE_DISK;
        return C_OK;
    }
    return C_OK; /* unreached */
}
4.1.2 Loading the RDB file
Redis loads the RDB file with the rdbLoad function. The server blocks for the whole duration of the load, until loading is complete.
int rdbLoad(char *filename, rdbSaveInfo *rsi, int rdbflags) {
    FILE *fp;
    rio rdb;
    int retval;

    if ((fp = fopen(filename,"r")) == NULL) return C_ERR;
    startLoadingFile(fp, filename,rdbflags);
    rioInitWithFile(&rdb,fp);
    retval = rdbLoadRio(&rdb,rdbflags,rsi);
    fclose(fp);
    stopLoading(retval==C_OK);
    return retval;
}
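4.2 AOF persistence
4.2.1 How AOF persistence is implemented
- Appending commands: after the Redis server executes a write command, it appends the command, in protocol format, to the end of the aof_buf buffer
- Writing and syncing the AOF file: Redis processes commands in a single thread that runs an event loop. Events are divided into file events and time events: file events receive command requests from clients and send replies back, while time events run scheduled tasks. Before each iteration of the event loop ends, flushAppendOnlyFile is called; based on the persistence policy configured in redis.conf, it decides when the command data in aof_buf is written to the AOF file and synced to disk.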
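The policy decision at the end of each flush can be summarized in a small function. This is a simplified sketch with hypothetical names; it only mirrors the three policies and ignores details of the real function such as postponed writes and the no-appendfsync-on-rewrite option.

```c
#include <assert.h>
#include <time.h>

enum fsync_policy { FSYNC_NO = 0, FSYNC_ALWAYS = 1, FSYNC_EVERYSEC = 2 };
enum fsync_action { ACT_NONE, ACT_SYNC_NOW, ACT_SYNC_BACKGROUND };

/* Decide what to do after aof_buf has been written to the file.
 * now / last_fsync are in seconds; sync_in_progress is 1 while a
 * background fsync job is still running. */
enum fsync_action fsync_action_for(enum fsync_policy policy, time_t now,
                                   time_t last_fsync, int sync_in_progress) {
    assert(sync_in_progress == 0 || sync_in_progress == 1);
    switch (policy) {
    case FSYNC_ALWAYS:
        return ACT_SYNC_NOW;              /* fsync in the main thread */
    case FSYNC_EVERYSEC:
        /* At most one background fsync per second, never two at once. */
        if (now > last_fsync && !sync_in_progress)
            return ACT_SYNC_BACKGROUND;
        return ACT_NONE;
    case FSYNC_NO:
    default:
        return ACT_NONE;                  /* leave it to the OS */
    }
}
```

The always policy pays the fsync latency on every loop iteration that wrote data, which is where the safety and throughput trade-off between the three policies comes from.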
4.2.2 Source code analysis
// src/server.h
/* Append only defines */
#define AOF_FSYNC_NO 0
#define AOF_FSYNC_ALWAYS 1
#define AOF_FSYNC_EVERYSEC 2
// src/aof.c
void flushAppendOnlyFile(int force) {
    ssize_t nwritten;
    int sync_in_progress = 0;
    mstime_t latency;

    // If the aof_buf buffer is currently empty
    if (sdslen(server.aof_buf) == 0) {
        /* Check if we need to do fsync even the aof buffer is empty,
         * because previously in AOF_FSYNC_EVERYSEC mode, fsync is
         * called only when aof buffer is not empty, so if users
         * stop write commands before fsync called in one second,
         * the data in page cache cannot be flushed in time. */
        if (server.aof_fsync == AOF_FSYNC_EVERYSEC &&
            server.aof_fsync_offset != server.aof_current_size &&
            server.unixtime > server.aof_last_fsync &&
            !(sync_in_progress = aofFsyncInProgress())) {
            goto try_fsync;
        } else {
            return;
        }
    }

    if (server.aof_fsync == AOF_FSYNC_EVERYSEC)
        sync_in_progress = aofFsyncInProgress();

    if (server.aof_fsync == AOF_FSYNC_EVERYSEC && !force) {
        /* With this append fsync policy we do background fsyncing.
         * If the fsync is still in progress we can try to delay
         * the write for a couple of seconds. */
        if (sync_in_progress) {
            if (server.aof_flush_postponed_start == 0) {
                /* No previous write postponing, remember that we are
                 * postponing the flush and return. */
                server.aof_flush_postponed_start = server.unixtime;
                return;
            } else if (server.unixtime - server.aof_flush_postponed_start < 2) {
                /* We were already waiting for fsync to finish, but for less
                 * than two seconds this is still ok. Postpone again. */
                return;
            }
            /* Otherwise fall trough, and go write since we can't wait
             * over two seconds. */
            server.aof_delayed_fsync++;
            serverLog(LL_NOTICE,"Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.");
        }
    }
    /* We want to perform a single write. This should be guaranteed atomic
     * at least if the filesystem we are writing is a real physical one.
     * While this will save us against the server being killed I don't think
     * there is much to do about the whole server stopping for power problems
     * or alike */

    if (server.aof_flush_sleep && sdslen(server.aof_buf)) {
        usleep(server.aof_flush_sleep);
    }

    latencyStartMonitor(latency);
    nwritten = aofWrite(server.aof_fd,server.aof_buf,sdslen(server.aof_buf));
    latencyEndMonitor(latency);
    /* We want to capture different events for delayed writes:
     * when the delay happens with a pending fsync, or with a saving child
     * active, and when the above two conditions are missing.
     * We also use an additional event name to save all samples which is
     * useful for graphing / monitoring purposes. */
    if (sync_in_progress) {
        latencyAddSampleIfNeeded("aof-write-pending-fsync",latency);
    } else if (hasActiveChildProcess()) {
        latencyAddSampleIfNeeded("aof-write-active-child",latency);
    } else {
        latencyAddSampleIfNeeded("aof-write-alone",latency);
    }
    latencyAddSampleIfNeeded("aof-write",latency);

    /* We performed the write so reset the postponed flush sentinel to zero. */
    server.aof_flush_postponed_start = 0;

    if (nwritten != (ssize_t)sdslen(server.aof_buf)) {
        static time_t last_write_error_log = 0;
        int can_log = 0;

        /* Limit logging rate to 1 line per AOF_WRITE_LOG_ERROR_RATE seconds. */
        if ((server.unixtime - last_write_error_log) > AOF_WRITE_LOG_ERROR_RATE) {
            can_log = 1;
            last_write_error_log = server.unixtime;
        }

        /* Log the AOF write error and record the error code. */
        if (nwritten == -1) {
            if (can_log) {
                serverLog(LL_WARNING,"Error writing to the AOF file: %s",
                    strerror(errno));
                server.aof_last_write_errno = errno;
            }
        } else {
            if (can_log) {
                serverLog(LL_WARNING,"Short write while writing to "
                                       "the AOF file: (nwritten=%lld, "
                                       "expected=%lld)",
                                       (long long)nwritten,
                                       (long long)sdslen(server.aof_buf));
            }

            if (ftruncate(server.aof_fd, server.aof_current_size) == -1) {
                if (can_log) {
                    serverLog(LL_WARNING, "Could not remove short write "
                             "from the append-only file. Redis may refuse "
                             "to load the AOF the next time it starts. "
                             "ftruncate: %s", strerror(errno));
                }
            } else {
                /* If the ftruncate() succeeded we can set nwritten to
                 * -1 since there is no longer partial data into the AOF. */
                nwritten = -1;
            }
            server.aof_last_write_errno = ENOSPC;
        }

        /* Handle the AOF write error. */
        if (server.aof_fsync == AOF_FSYNC_ALWAYS) {
            /* We can't recover when the fsync policy is ALWAYS since the reply
             * for the client is already in the output buffers (both writes and
             * reads), and the changes to the db can't be rolled back. Since we
             * have a contract with the user that on acknowledged or observed
             * writes are is synced on disk, we must exit. */
            serverLog(LL_WARNING,"Can't recover from AOF write error when the AOF fsync policy is 'always'. Exiting...");
            exit(1);
        } else {
            /* Recover from failed write leaving data into the buffer. However
             * set an error to stop accepting writes as long as the error
             * condition is not cleared. */
            server.aof_last_write_status = C_ERR;

            /* Trim the sds buffer if there was a partial write, and there
             * was no way to undo it with ftruncate(2). */
            if (nwritten > 0) {
                server.aof_current_size += nwritten;
                sdsrange(server.aof_buf,nwritten,-1);
            }
            return; /* We'll try again on the next call... */
        }
    } else {
        /* Successful write(2). If AOF was in error state, restore the
         * OK state and log the event. */
        if (server.aof_last_write_status == C_ERR) {
            serverLog(LL_WARNING,
                "AOF write error looks solved, Redis can write again.");
            server.aof_last_write_status = C_OK;
        }
    }
    server.aof_current_size += nwritten;

    /* Re-use AOF buffer when it is small enough. The maximum comes from the
     * arena size of 4k minus some overhead (but is otherwise arbitrary). */
    if ((sdslen(server.aof_buf)+sdsavail(server.aof_buf)) < 4000) {
        sdsclear(server.aof_buf);
    } else {
        sdsfree(server.aof_buf);
        server.aof_buf = sdsempty();
    }
try_fsync:
    /* Don't fsync if no-appendfsync-on-rewrite is set to yes and there are
     * children doing I/O in the background. */
    if (server.aof_no_fsync_on_rewrite && hasActiveChildProcess())
        return;

    /* Perform the fsync if needed. */
    if (server.aof_fsync == AOF_FSYNC_ALWAYS) {
        /* redis_fsync is defined as fdatasync() for Linux in order to avoid
         * flushing metadata. */
        latencyStartMonitor(latency);
        /* Let's try to get this data on the disk. To guarantee data safe when
         * the AOF fsync policy is 'always', we should exit if failed to fsync
         * AOF (see comment next to the exit(1) after write error above). */
        if (redis_fsync(server.aof_fd) == -1) {
            serverLog(LL_WARNING,"Can't persist AOF for fsync error when the "
              "AOF fsync policy is 'always': %s. Exiting...", strerror(errno));
            exit(1);
        }
        latencyEndMonitor(latency);
        latencyAddSampleIfNeeded("aof-fsync-always",latency);
        server.aof_fsync_offset = server.aof_current_size;
        server.aof_last_fsync = server.unixtime;
    } else if ((server.aof_fsync == AOF_FSYNC_EVERYSEC &&
                server.unixtime > server.aof_last_fsync)) {
        if (!sync_in_progress) {
            aof_background_fsync(server.aof_fd);
            server.aof_fsync_offset = server.aof_current_size;
        }
        server.aof_last_fsync = server.unixtime;
    }
}
Original article: https://blog.csdn.net/weixin_46935110/article/details/127560516