add vendoring with go dep

2025-12-19 05:29:21 +00:00 · 2017-10-25 20:52:40 +00:00
parent 704f4d20d1
commit a59409f16b
1627 changed files with 489673 additions and 0 deletions
--- a/vendor/github.com/influxdata/influxdb/tsdb/index/tsi1/doc.go
+++ b/vendor/github.com/influxdata/influxdb/tsdb/index/tsi1/doc.go
@@ -0,0 +1,238 @@
+/*
+
+Package tsi1 provides a memory-mapped index implementation that supports
+high cardinality series.
+
+Overview
+
+The top-level object in tsi1 is the Index. It is the primary access point from
+the rest of the system. The Index is composed of LogFile and IndexFile objects.
+
+Log files are small write-ahead log files that record new series immediately
+in the order that they are received. The data within the file is indexed
+in-memory so it can be quickly accessed. When the system is restarted, this log
+file is replayed and the in-memory representation is rebuilt.
+
+Index files also contain series information; however, they are highly indexed
+so that reads can be performed quickly. Index files are built through a process
+called compaction where a log file or multiple index files are merged together.
+
+
+Operations
+
+The index can perform many tasks related to series, measurement, and tag data.
+All data is inserted by adding a series to the index. When adding a series,
+the measurement, tag keys, and tag values are all extracted and indexed
+separately.
+
+Once a series has been added, it can be removed in two ways. First, the
+individual series can be removed. Second, it can be removed as part of a bulk
+operation by deleting the entire measurement.
+
+The query engine needs to be able to look up series in a variety of ways such
+as by measurement name, by tag value, or by using regular expressions. The
+index provides an API to iterate over subsets of series and perform set
+operations such as unions and intersections.
+
+
+Log File Layout
+
+Series are initially inserted into the write-ahead log file, which appends
+every new operation sequentially. The file is composed of a sequence of log
+entries. An entry contains a flag to specify the operation type, the measurement
+name, the tag set, and a checksum.
+
+	┏━━━━━━━━━LogEntry━━━━━━━━━┓
+	┃ ┌──────────────────────┐ ┃
+	┃ │         Flag         │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │     Measurement      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Key/Value       │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Key/Value       │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Key/Value       │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │       Checksum       │ ┃
+	┃ └──────────────────────┘ ┃
+	┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
+
+When the log file is replayed, if an entry's checksum is incorrect or the entry
+is incomplete (because of a partially failed write), the log is truncated at
+that entry.
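+
+The replay rule can be sketched as follows. This is a minimal illustration and
+not the tsi1 wire format: the uvarint length prefixes, the tag count field,
+the CRC-32 checksum, and every name in the block are assumptions.
+

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// Tag is one key/value pair of a series' tag set.
type Tag struct{ Key, Value string }

// Entry is one logged operation (hypothetical layout).
type Entry struct {
	Flag        byte
	Measurement string
	Tags        []Tag
}

func appendUvarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

func appendString(buf []byte, s string) []byte {
	return append(appendUvarint(buf, uint64(len(s))), s...)
}

// appendEntry writes flag, measurement, tags, then a CRC-32 of those bytes.
func appendEntry(buf []byte, e Entry) []byte {
	start := len(buf)
	buf = append(buf, e.Flag)
	buf = appendString(buf, e.Measurement)
	buf = appendUvarint(buf, uint64(len(e.Tags)))
	for _, t := range e.Tags {
		buf = appendString(appendString(buf, t.Key), t.Value)
	}
	var sum [4]byte
	binary.BigEndian.PutUint32(sum[:], crc32.ChecksumIEEE(buf[start:]))
	return append(buf, sum[:]...)
}

// readString decodes a length-prefixed string and the bytes it consumed.
func readString(b []byte) (string, int, bool) {
	l, n := binary.Uvarint(b)
	if n <= 0 || l > uint64(len(b)-n) {
		return "", 0, false
	}
	end := n + int(l)
	return string(b[n:end]), end, true
}

// readEntry decodes one entry; ok is false if it is partial or corrupt.
func readEntry(b []byte) (Entry, int, bool) {
	if len(b) == 0 {
		return Entry{}, 0, false
	}
	e := Entry{Flag: b[0]}
	pos := 1
	s, n, ok := readString(b[pos:])
	if !ok {
		return Entry{}, 0, false
	}
	e.Measurement, pos = s, pos+n
	cnt, n := binary.Uvarint(b[pos:])
	if n <= 0 {
		return Entry{}, 0, false
	}
	pos += n
	for i := uint64(0); i < cnt; i++ {
		k, kn, ok := readString(b[pos:])
		if !ok {
			return Entry{}, 0, false
		}
		v, vn, ok := readString(b[pos+kn:])
		if !ok {
			return Entry{}, 0, false
		}
		pos += kn + vn
		e.Tags = append(e.Tags, Tag{k, v})
	}
	if len(b)-pos < 4 || crc32.ChecksumIEEE(b[:pos]) != binary.BigEndian.Uint32(b[pos:]) {
		return Entry{}, 0, false
	}
	return e, pos + 4, true
}

// replay returns the valid entries and the length to truncate the log to.
func replay(log []byte) ([]Entry, int) {
	var entries []Entry
	valid := 0
	for valid < len(log) {
		e, n, ok := readEntry(log[valid:])
		if !ok {
			break // first bad entry: everything from here on is dropped
		}
		entries = append(entries, e)
		valid += n
	}
	return entries, valid
}

func main() {
	var log []byte
	log = appendEntry(log, Entry{Measurement: "cpu", Tags: []Tag{{"host", "a"}}})
	log = appendEntry(log, Entry{Measurement: "mem", Tags: []Tag{{"host", "b"}}})
	log = log[:len(log)-2] // simulate a partially failed write
	entries, valid := replay(log)
	fmt.Println(len(entries), valid, len(log))
}
```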
+
+
+Index File Layout
+
+The index file is composed of 3 main block types: one series block, one or more
+tag blocks, and one measurement block. At the end of the index file is a
+trailer that records metadata such as the offsets to these blocks.
+
+
+Series Block Layout
+
+The series block stores raw series keys in sorted order. It also provides hash
+indexes so that series can be looked up quickly. Hash indexes are inserted
+periodically so that memory size is limited at write time. Once all the series
+and hash indexes have been written, a list of index entries is written
+so that hash indexes can be looked up via binary search.
+
+The end of the block contains two HyperLogLog++ sketches which track the
+estimated number of created series and deleted series. After the sketches is
+a trailer which contains metadata about the block.
+
+	┏━━━━━━━SeriesBlock━━━━━━━━┓
+	┃ ┌──────────────────────┐ ┃
+	┃ │      Series Key      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Series Key      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Series Key      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │                      │ ┃
+	┃ │      Hash Index      │ ┃
+	┃ │                      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Series Key      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Series Key      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │      Series Key      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │                      │ ┃
+	┃ │      Hash Index      │ ┃
+	┃ │                      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │    Index Entries     │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │     HLL Sketches     │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │       Trailer        │ ┃
+	┃ └──────────────────────┘ ┃
+	┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
+
+
+Tag Block Layout
+
+After the series block is one or more tag blocks. One of these blocks exists
+for every measurement in the index file. The block is structured as a sorted
+list of values for each key and then a sorted list of keys. Each of these lists
+has its own hash index for fast direct lookups.
+
+	┏━━━━━━━━Tag Block━━━━━━━━━┓
+	┃ ┌──────────────────────┐ ┃
+	┃ │        Value         │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │        Value         │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │        Value         │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │                      │ ┃
+	┃ │      Hash Index      │ ┃
+	┃ │                      │ ┃
+	┃ └──────────────────────┘ ┃
+	┃ ┌──────────────────────┐ ┃
+	┃ │        Value         │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │        Value         │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │                      │ ┃
+	┃ │      Hash Index      │ ┃
+	┃ │                      │ ┃
+	┃ └──────────────────────┘ ┃
+	┃ ┌──────────────────────┐ ┃
+	┃ │         Key          │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │         Key          │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │                      │ ┃
+	┃ │      Hash Index      │ ┃
+	┃ │                      │ ┃
+	┃ └──────────────────────┘ ┃
+	┃ ┌──────────────────────┐ ┃
+	┃ │       Trailer        │ ┃
+	┃ └──────────────────────┘ ┃
+	┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
+
+Each entry for values contains a sorted list of offsets for series keys that use
+that value. Series iterators can be built around a single tag key value or
+multiple iterators can be merged with set operators such as union or
+intersection.
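+
+Because the offsets are sorted, both set operators reduce to a single linear
+merge. The sketch below works on plain slices of offsets for illustration;
+the function names are hypothetical and the package itself uses iterators
+rather than slices.
+

```go
package main

import "fmt"

// intersect returns the offsets present in both sorted, duplicate-free inputs.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// union returns the offsets present in either sorted, duplicate-free input.
func union(a, b []uint64) []uint64 {
	out := make([]uint64, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case a[i] > b[j]:
			out = append(out, b[j])
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	return append(out, b[j:]...)
}

func main() {
	hostA := []uint64{10, 20, 30}      // series offsets tagged host=a (made up)
	regionWest := []uint64{20, 30, 40} // series offsets tagged region=west (made up)
	fmt.Println(intersect(hostA, regionWest)) // series matching both tags
	fmt.Println(union(hostA, regionWest))     // series matching either tag
}
```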
+
+
+Measurement Block Layout
+
+The measurement block stores a sorted list of measurements, their associated
+series offsets, and the offset to their tag block. This allows all series for
+a measurement to be traversed quickly and it allows fast direct lookups of
+measurements and their tags.
+
+This block also contains HyperLogLog++ sketches for new and deleted
+measurements.
+
+	┏━━━━Measurement Block━━━━━┓
+	┃ ┌──────────────────────┐ ┃
+	┃ │     Measurement      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │     Measurement      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │     Measurement      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │                      │ ┃
+	┃ │      Hash Index      │ ┃
+	┃ │                      │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │     HLL Sketches     │ ┃
+	┃ ├──────────────────────┤ ┃
+	┃ │       Trailer        │ ┃
+	┃ └──────────────────────┘ ┃
+	┗━━━━━━━━━━━━━━━━━━━━━━━━━━┛
+
+
+Manifest File
+
+The index is simply an ordered set of log and index files. These files can be
+merged together or rewritten but their order must always be the same. This is
+because series, measurements, and tags can be marked as deleted (tombstoned)
+and this action needs to be tracked in time order.
+
+Whenever the set of active files is changed, a manifest file is written to
+track the set. The manifest specifies the ordering of files and, on startup,
+all files not in the manifest are removed from the index directory.
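+
+The startup cleanup amounts to a set difference between the directory listing
+and the manifest. A minimal sketch, assuming the manifest is just a list of
+file names; the function name and the example file names are made up.
+

```go
package main

import "fmt"

// staleFiles returns the names in onDisk that the manifest does not list,
// in on-disk order. The caller is assumed to have excluded the manifest
// file itself from onDisk.
func staleFiles(manifest, onDisk []string) []string {
	keep := make(map[string]bool, len(manifest))
	for _, name := range manifest {
		keep[name] = true
	}
	var stale []string
	for _, name := range onDisk {
		if !keep[name] {
			stale = append(stale, name)
		}
	}
	return stale
}

func main() {
	manifest := []string{"0003.log", "0002.index"}         // active files, in order
	onDisk := []string{"0001.log", "0002.index", "0003.log"} // hypothetical listing
	fmt.Println(staleFiles(manifest, onDisk))                // candidates for removal
}
```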
+
+
+Compacting Index Files
+
+Compaction is the process of taking files and merging them together into a
+single file. There are two stages of compaction within TSI.
+
+First, once log files exceed a size threshold then they are compacted into an
+index file. This threshold is relatively small because log files must maintain
+their index in the heap which TSI tries to avoid. Small log files are also very
+quick to convert into an index file so this is done aggressively.
+
+Second, once a contiguous set of index files exceeds a factor (e.g. 10x) then
+they are all merged together into a single index file and the old files are
+discarded. Because all blocks are written in sorted order, the new index file
+can be written as a stream, which minimizes memory use.
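+
+The streaming property comes from merging inputs that are already sorted. The
+sketch below is a generic k-way merge over in-memory key slices using
+container/heap; it stands in for the real block readers and writers, and all
+of its names are assumptions.
+

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor walks one sorted input, standing in for a reader over an index file.
type cursor struct {
	keys []string
	pos  int
}

// cursorHeap orders cursors by the key each one currently points at.
type cursorHeap []*cursor

func (h cursorHeap) Len() int            { return len(h) }
func (h cursorHeap) Less(i, j int) bool  { return h[i].keys[h[i].pos] < h[j].keys[h[j].pos] }
func (h cursorHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
	old := *h
	c := old[len(old)-1]
	*h = old[:len(old)-1]
	return c
}

// mergeSorted calls emit once per distinct key, in sorted order. Only one
// cursor per input is held, so the merged output never has to be buffered.
func mergeSorted(inputs [][]string, emit func(string)) {
	h := &cursorHeap{}
	for _, in := range inputs {
		if len(in) > 0 {
			*h = append(*h, &cursor{keys: in})
		}
	}
	heap.Init(h)
	last, seen := "", false
	for h.Len() > 0 {
		c := (*h)[0] // cursor with the smallest current key
		key := c.keys[c.pos]
		if !seen || key != last {
			emit(key)
			last, seen = key, true
		}
		c.pos++
		if c.pos < len(c.keys) {
			heap.Fix(h, 0) // cursor advanced, restore heap order
		} else {
			heap.Pop(h) // input exhausted
		}
	}
}

func main() {
	inputs := [][]string{
		{"cpu,host=a", "mem,host=a"},
		{"cpu,host=b", "mem,host=a"},
		{"disk,host=a"},
	}
	mergeSorted(inputs, func(key string) { fmt.Println(key) })
}
```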
+
+
+Concurrency
+
+Index files are immutable so they do not require fine-grained locks; however,
+compactions require that we track which files are in use so they are not
+discarded too soon. This is done by using reference counting with file sets.
+
+A file set is simply an ordered list of index files. When the current file set
+is obtained from the index, a counter is incremented to track its usage. Once
+the user is done with the file set, it is released and the counter is
+decremented. A file cannot be removed from the file system until this counter
+returns to zero.
+
+Besides the reference counting, there are no other locking mechanisms when
+reading or writing index files. Log files, however, do require a lock whenever
+they are accessed. This is another reason to minimize log file size.
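+
+The retain and release cycle can be sketched as below. The types and method
+names are hypothetical, and the mutex guarding the current file list is an
+assumption of this sketch, separate from reading the files themselves.
+

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// file stands in for an immutable index file with a usage counter.
type file struct {
	name string
	refs int32 // number of file sets currently holding this file
}

// inUse reports whether any reader still holds the file.
func (f *file) inUse() bool { return atomic.LoadInt32(&f.refs) > 0 }

// fileSet is an ordered list of files handed to a reader.
type fileSet []*file

// release drops the reader's reference on every file in the set.
func (fs fileSet) release() {
	for _, f := range fs {
		atomic.AddInt32(&f.refs, -1)
	}
}

// index owns the current file list.
type index struct {
	mu    sync.Mutex
	files fileSet
}

// retainFileSet returns a snapshot of the current files with each counter
// incremented, so none of them can be removed while the caller reads.
func (idx *index) retainFileSet() fileSet {
	idx.mu.Lock()
	defer idx.mu.Unlock()
	fs := make(fileSet, len(idx.files))
	copy(fs, idx.files)
	for _, f := range fs {
		atomic.AddInt32(&f.refs, 1)
	}
	return fs
}

// replace installs a new file list, as a compaction would, and returns the
// old files. Each may be deleted only once inUse reports false.
func (idx *index) replace(files fileSet) fileSet {
	idx.mu.Lock()
	defer idx.mu.Unlock()
	old := idx.files
	idx.files = files
	return old
}

func main() {
	idx := &index{files: fileSet{{name: "a"}, {name: "b"}}}
	reader := idx.retainFileSet()
	old := idx.replace(fileSet{{name: "ab"}}) // compaction result
	fmt.Println(old[0].name, old[0].inUse())  // still held by reader
	reader.release()
	fmt.Println(old[0].name, old[0].inUse()) // now safe to remove
}
```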
+
+
+*/
+package tsi1