mirror of
https://github.com/nikdoof/vsphere-influxdb-go.git
synced 2025-12-19 05:29:21 +00:00
add vendoring with go dep
This commit is contained in:
238
vendor/github.com/influxdata/influxdb/tsdb/index/tsi1/doc.go
generated
vendored
Normal file
238
vendor/github.com/influxdata/influxdb/tsdb/index/tsi1/doc.go
generated
vendored
Normal file
@@ -0,0 +1,238 @@
|
||||
/*
|
||||
|
||||
Package tsi1 provides a memory-mapped index implementation that supports
|
||||
high cardinality series.
|
||||
|
||||
Overview
|
||||
|
||||
The top-level object in tsi1 is the Index. It is the primary access point from
|
||||
the rest of the system. The Index is composed of LogFile and IndexFile objects.
|
||||
|
||||
Log files are small write-ahead log files that record new series immediately
|
||||
in the order that they are received. The data within the file is indexed
|
||||
in-memory so it can be quickly accessed. When the system is restarted, this log
|
||||
file is replayed and the in-memory representation is rebuilt.
|
||||
|
||||
Index files also contain series information, however, they are highly indexed
|
||||
so that reads can be performed quickly. Index files are built through a process
|
||||
called compaction where a log file or multiple index files are merged together.
|
||||
|
||||
|
||||
Operations
|
||||
|
||||
The index can perform many tasks related to series, measurement, & tag data.
|
||||
All data is inserted by adding a series to the index. When adding a series,
|
||||
the measurement, tag keys, and tag values are all extracted and indexed
|
||||
separately.
|
||||
|
||||
Once a series has been added, it can be removed in several ways. First, the
|
||||
individual series can be removed. Second, it can be removed as part of a bulk
|
||||
operation by deleting the entire measurement.
|
||||
|
||||
The query engine needs to be able to look up series in a variety of ways such
|
||||
as by measurement name, by tag value, or by using regular expressions. The
|
||||
index provides an API to iterate over subsets of series and perform set
|
||||
operations such as unions and intersections.
|
||||
|
||||
|
||||
Log File Layout
|
||||
|
||||
The write-ahead file that series initially are inserted into simply appends
|
||||
all new operations sequentially. It is simply composed of a series of log
|
||||
entries. An entry contains a flag to specify the operation type, the measurement
|
||||
name, the tag set, and a checksum.
|
||||
|
||||
┏━━━━━━━━━LogEntry━━━━━━━━━┓
|
||||
┃ ┌──────────────────────┐ ┃
|
||||
┃ │ Flag │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Measurement │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Key/Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Key/Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Key/Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Checksum │ ┃
|
||||
┃ └──────────────────────┘ ┃
|
||||
┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
|
||||
|
||||
When the log file is replayed, if the checksum is incorrect or the entry is
|
||||
incomplete (because of a partially failed write) then the log is truncated.
|
||||
|
||||
|
||||
Index File Layout
|
||||
|
||||
The index file is composed of 3 main block types: one series block, one or more
|
||||
tag blocks, and one measurement block. At the end of the index file is a
|
||||
trailer that records metadata such as the offsets to these blocks.
|
||||
|
||||
|
||||
Series Block Layout
|
||||
|
||||
The series block stores raw series keys in sorted order. It also provides hash
|
||||
indexes so that series can be looked up quickly. Hash indexes are inserted
|
||||
periodically so that memory size is limited at write time. Once all the series
|
||||
and hash indexes have been written then a list of index entries are written
|
||||
so that hash indexes can be looked up via binary search.
|
||||
|
||||
The end of the block contains two HyperLogLog++ sketches which track the
|
||||
estimated number of created series and deleted series. After the sketches is
|
||||
a trailer which contains metadata about the block.
|
||||
|
||||
┏━━━━━━━SeriesBlock━━━━━━━━┓
|
||||
┃ ┌──────────────────────┐ ┃
|
||||
┃ │ Series Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Series Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Series Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ │ ┃
|
||||
┃ │ Hash Index │ ┃
|
||||
┃ │ │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Series Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Series Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Series Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ │ ┃
|
||||
┃ │ Hash Index │ ┃
|
||||
┃ │ │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Index Entries │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ HLL Sketches │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Trailer │ ┃
|
||||
┃ └──────────────────────┘ ┃
|
||||
┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
|
||||
|
||||
|
||||
Tag Block Layout
|
||||
|
||||
After the series block is one or more tag blocks. One of these blocks exists
|
||||
for every measurement in the index file. The block is structured as a sorted
|
||||
list of values for each key and then a sorted list of keys. Each of these lists
|
||||
has their own hash index for fast direct lookups.
|
||||
|
||||
┏━━━━━━━━Tag Block━━━━━━━━━┓
|
||||
┃ ┌──────────────────────┐ ┃
|
||||
┃ │ Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ │ ┃
|
||||
┃ │ Hash Index │ ┃
|
||||
┃ │ │ ┃
|
||||
┃ └──────────────────────┘ ┃
|
||||
┃ ┌──────────────────────┐ ┃
|
||||
┃ │ Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Value │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ │ ┃
|
||||
┃ │ Hash Index │ ┃
|
||||
┃ │ │ ┃
|
||||
┃ └──────────────────────┘ ┃
|
||||
┃ ┌──────────────────────┐ ┃
|
||||
┃ │ Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Key │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ │ ┃
|
||||
┃ │ Hash Index │ ┃
|
||||
┃ │ │ ┃
|
||||
┃ └──────────────────────┘ ┃
|
||||
┃ ┌──────────────────────┐ ┃
|
||||
┃ │ Trailer │ ┃
|
||||
┃ └──────────────────────┘ ┃
|
||||
┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
|
||||
|
||||
Each entry for values contains a sorted list of offsets for series keys that use
|
||||
that value. Series iterators can be built around a single tag key value or
|
||||
multiple iterators can be merged with set operators such as union or
|
||||
intersection.
|
||||
|
||||
|
||||
Measurement block
|
||||
|
||||
The measurement block stores a sorted list of measurements, their associated
|
||||
series offsets, and the offset to their tag block. This allows all series for
|
||||
a measurement to be traversed quickly and it allows fast direct lookups of
|
||||
measurements and their tags.
|
||||
|
||||
This block also contains HyperLogLog++ sketches for new and deleted
|
||||
measurements.
|
||||
|
||||
┏━━━━Measurement Block━━━━━┓
|
||||
┃ ┌──────────────────────┐ ┃
|
||||
┃ │ Measurement │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Measurement │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Measurement │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ │ ┃
|
||||
┃ │ Hash Index │ ┃
|
||||
┃ │ │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ HLL Sketches │ ┃
|
||||
┃ ├──────────────────────┤ ┃
|
||||
┃ │ Trailer │ ┃
|
||||
┃ └──────────────────────┘ ┃
|
||||
┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
|
||||
|
||||
|
||||
Manifest file
|
||||
|
||||
The index is simply an ordered set of log and index files. These files can be
|
||||
merged together or rewritten but their order must always be the same. This is
|
||||
because series, measurements, & tags can be marked as deleted (aka tombstoned)
|
||||
and this action needs to be tracked in time order.
|
||||
|
||||
Whenever the set of active files is changed, a manifest file is written to
|
||||
track the set. The manifest specifies the ordering of files and, on startup,
|
||||
all files not in the manifest are removed from the index directory.
|
||||
|
||||
|
||||
Compacting index files
|
||||
|
||||
Compaction is the process of taking files and merging them together into a
|
||||
single file. There are two stages of compaction within TSI.
|
||||
|
||||
First, once log files exceed a size threshold then they are compacted into an
|
||||
index file. This threshold is relatively small because log files must maintain
|
||||
their index in the heap which TSI tries to avoid. Small log files are also very
|
||||
quick to convert into an index file so this is done aggressively.
|
||||
|
||||
Second, once a contiguous set of index files exceed a factor (e.g. 10x) then
|
||||
they are all merged together into a single index file and the old files are
|
||||
discarded. Because all blocks are written in sorted order, the new index file
|
||||
can be streamed and minimize memory use.
|
||||
|
||||
|
||||
Concurrency
|
||||
|
||||
Index files are immutable so they do not require fine grained locks, however,
|
||||
compactions require that we track which files are in use so they are not
|
||||
discarded too soon. This is done by using reference counting with file sets.
|
||||
|
||||
A file set is simply an ordered list of index files. When the current file set
|
||||
is obtained from the index, a counter is incremented to track its usage. Once
|
||||
the user is done with the file set, it is released and the counter is
|
||||
decremented. A file cannot be removed from the file system until this counter
|
||||
returns to zero.
|
||||
|
||||
Besides the reference counting, there are no other locking mechanisms when
|
||||
reading or writing index files. Log files, however, do require a lock whenever
|
||||
they are accessed. This is another reason to minimize log file size.
|
||||
|
||||
|
||||
*/
|
||||
package tsi1
|
||||
Reference in New Issue
Block a user