Without an fsync to flush the data in the file-system cache to disk, we cannot be sure that the data will still be there after a power failure, or even after exiting the application normally. For Elasticsearch to be reliable, it needs to ensure that changes are persisted to disk.
In [dynamic-indices] we said that a full commit flushes segments to disk and writes a commit point, which lists all known segments. Elasticsearch uses this commit point during startup or when reopening an index to decide which segments belong to the current shard.
While we refresh once every second to achieve near real-time search, we still need to do full commits regularly to make sure that we can recover from failure. But what about the document changes that happen between commits? We don’t want to lose those either.
Elasticsearch added a translog or transaction log, which records every operation in Elasticsearch as it happens. With the translog, the process now looks like this:
1. When a document is indexed, it is added to the in-memory buffer AND appended to the translog.

2. Once every second, the shard is refreshed:

   - The docs in the in-memory buffer are written to a new segment, without an fsync.
   - The segment is opened to make it visible to search.
   - The in-memory buffer is cleared.

3. This process continues with more documents being added to the in-memory buffer and appended to the transaction log.
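The refresh in step 2 above can also be triggered on demand with the refresh API; for example, for a hypothetical index named blogs:

```
POST /blogs/_refresh
```

Note that an explicit refresh makes recent documents searchable but, like the automatic once-per-second refresh, it does not fsync the new segment; durability still depends on the translog and the periodic flush.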
Every so often, such as when the translog is getting too big, the index is flushed: a new translog is created and a full commit is performed:

- Any docs in the in-memory buffer are written to a new segment.
- The buffer is cleared.
- A commit point is written to disk.
- The file-system cache is flushed with an fsync.
- The old translog is deleted.
The translog provides a persistent record of all operations that have not yet been flushed to disk. When starting up, Elasticsearch will use the last commit point to recover known segments from disk, and will then replay all operations in the translog to add the changes that happened after the last commit.
The translog is also used to provide real-time CRUD. When you try to retrieve, update, or delete a document by ID, it first checks the translog for any recent changes before trying to retrieve the document from the relevant segment. This means that it always has access to the latest known version of the document, in real-time.
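As a sketch of this behavior (using a hypothetical website index and document ID 123), a document retrieved by ID is visible immediately after indexing, even before the next refresh makes it searchable:

```
PUT /website/blog/123
{ "title": "My first blog entry" }

GET /website/blog/123
```

The GET returns the just-indexed document because Elasticsearch consults the translog before falling back to the segments; this real-time behavior is the default for GET, update, and delete by ID.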
flush API

The action of performing a commit and truncating the translog is known in Elasticsearch as a flush. Shards are flushed automatically every 30 minutes, or when the translog becomes too big. See the {ref}index-modules-translog.html[translog documentation] for settings that can be used to control these thresholds.

The {ref}indices-flush.html[flush API] can be used to perform a manual flush:
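For example, the size at which the translog triggers an automatic flush can be adjusted with an index-level setting (shown here for a hypothetical index my_index; the exact setting names are version dependent):

```
PUT /my_index/_settings
{
  "index.translog.flush_threshold_size": "512mb"
}
```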
POST /blogs/_flush              (1)

POST /_flush?wait_for_ongoing   (2)

(1) Flush the blogs index.
(2) Flush all indices and wait until all flushes have completed before returning.
You seldom need to issue a manual flush yourself; usually automatic flushing is all that is required.
That said, it is beneficial to flush your indices before restarting a node or {ref}indices-open-close.html[closing an index]. When Elasticsearch tries to recover or reopen an index, it has to replay all of the operations in the translog, so the shorter the log, the faster the recovery.
The purpose of the translog is to ensure that operations are not lost, which raises the question: how safe is the translog?
Writes to a file will not survive a reboot until the file has been fsync'ed to disk. By default, the translog is fsync'ed every 5 seconds. Potentially, we could lose 5 seconds' worth of data… if the translog were the only mechanism that we had for dealing with failure.
Fortunately, the translog is only part of a much bigger system. Remember that an indexing request is only considered to be successful once it has completed on both the primary shard and all replica shards. Even were the node holding the primary shard to suffer catastrophic failure, it would be unlikely to affect the nodes holding the replica shards at the same time.
While we could force the translog to fsync more frequently (at the cost of indexing performance), it is unlikely to provide more reliability.
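If you do want to tune this trade-off, the fsync behavior can be changed per index (setting names are version dependent; shown for a hypothetical index my_index). Setting the durability to async fsyncs the translog only once per sync_interval instead of on every request, trading a small window of potential data loss for faster indexing:

```
PUT /my_index/_settings
{
  "index.translog.durability": "async",
  "index.translog.sync_interval": "5s"
}
```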