minio

Author	SHA1	Message	Date
Klaus Post	6235bd825b	Grab read lock while reading usage cache (#12111 ) Signed-off-by: Klaus Post <klauspost@gmail.com>	2021-04-21 08:39:00 -07:00
Harshavardhana	abb55bd49e	fix: properly close leaking bandwidth monitor channel (#11967 ) This PR fixes - close leaking bandwidth report channel leakage - remove the closer requirement for bandwidth monitor instead if Read() fails remember the error and return error for all subsequent reads. - use locking for usage-cache.bin updates, with inline data we cannot afford to have concurrent writes to usage-cache.bin corrupting xl.meta	2021-04-05 16:07:53 -07:00
Poorna Krishnamoorthy	47c09a1e6f	Various improvements in replication (#11949 ) - collect real time replication metrics for prometheus. - add pending_count, failed_count metric for total pending/failed replication operations. - add API to get replication metrics - add MRF worker to handle spill-over replication operations - multiple issues found with replication - fixes an issue when client sends a bucket name with `/` at the end from SetRemoteTarget API call make sure to trim the bucket name to avoid any extra `/`. - hold write locks in GetObjectNInfo during replication to ensure that object version stack is not overwritten while reading the content. - add additional protection during WriteMetadata() to ensure that we always write a valid FileInfo{} and avoid ever writing empty FileInfo{} to the lowest layers. Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io> Co-authored-by: Harshavardhana <harsha@minio.io>	2021-04-03 09:03:42 -07:00
Andreas Auernhammer	d4b822d697	pkg/etag: add new package for S3 ETag handling (#11577 ) This commit adds a new package `etag` for dealing with S3 ETags. Even though ETag is often viewed as MD5 checksum of an object, handling S3 ETags correctly is a surprisingly complex task. While it is true that the ETag corresponds to the MD5 for the most basic S3 API operations, there are many exceptions in case of multipart uploads or encryption. In worse, some S3 clients expect very specific behavior when it comes to ETags. For example, some clients expect that the ETag is a double-quoted string and fail otherwise. Non-AWS compliant ETag handling has been a source of many bugs in the past. Therefore, this commit adds a dedicated `etag` package that provides functionality for parsing, generating and converting S3 ETags. Further, this commit removes the ETag computation from the `hash` package. Instead, the `hash` package (i.e. `hash.Reader`) should focus only on computing and verifying the content-sha256. One core feature of this commit is to provide a mechanism to communicate a computed ETag from a low-level `io.Reader` to a high-level `io.Reader`. This problem occurs when an S3 server receives a request and has to compute the ETag of the content. However, the server may also wrap the initial body with several other `io.Reader`, e.g. when encrypting or compressing the content: ``` reader := Encrypt(Compress(ETag(content))) ``` In such a case, the ETag should be accessible by the high-level `io.Reader`. The `etag` provides a mechanism to wrap `io.Reader` implementations such that the `ETag` can be accessed by a type-check. This technique is applied to the PUT, COPY and Upload handlers.	2021-02-23 12:31:53 -08:00
Harshavardhana	c31d2c3fdc	fix: CrawlAndGetDataUsage close pipe() before using a new one (#11600 ) also additionally make sure errors during deserializer closes the reader with right error type such that Write() end actually see the final error, this avoids a waitGroup usage and waiting.	2021-02-22 10:04:32 -08:00
Klaus Post	8a6b13c239	Avoid synchronizing usage writes (#11560 ) If the periodic `case <-t.C:` save gets held up for a long time it will end up synchronize all disk writes for saving the caches. We add jitter to per set writes so they don't sync up and don't hold a lock for the write, since it isn't needed anyway. If an outage prevents writes for a long while we also add individual waits for each disk in case there was a queue. Furthermore limit the number of buffers kept to 2GiB, since this could get huge in large clusters. This will not act as a hard limit but should be enough for normal operation.	2021-02-18 00:38:37 -08:00
Harshavardhana	ffea6fcf09	fix: rename crawler as scanner in config (#11549 )	2021-02-17 12:04:11 -08:00
Krishnan Parthasarathi	b87fae0049	Simplify PutObjReader for plain-text reader usage (#11470 ) This change moves away from a unified constructor for plaintext and encrypted usage. NewPutObjReader is simplified for the plain-text reader use. For encrypted reader use, WithEncryption should be called on an initialized PutObjReader. Plaintext: func NewPutObjReader(rawReader hash.Reader) PutObjReader The hash.Reader is used to provide payload size and md5sum to the downstream consumers. This is different from the previous version in that there is no need to pass nil values for unused parameters. Encrypted: func WithEncryption(encReader hash.Reader, key crypto.ObjectKey) (*PutObjReader, error) This method sets up encrypted reader along with the key to seal the md5sum produced by the plain-text reader (already setup when NewPutObjReader was called). Usage: ``` pReader := NewPutObjReader(rawReader) // ... other object handler code goes here // Prepare the encrypted hashed reader pReader, err = pReader.WithEncryption(encReader, objEncKey) ```	2021-02-10 08:52:50 -08:00
Harshavardhana	e0055609bb	fix: crawler to skip healing the drives in a set being healed (#11274 ) If an erasure set had a drive replacement recently, we don't need to attempt healing on another drive with in the same erasure set - this would ensure we do not double heal the same content and also prioritizes usage for such an erasure set to be calculated sooner.	2021-01-19 02:40:52 -08:00
Harshavardhana	628ef081d1	fix: preserve cache calculated previously while moving from v2 to v3 (#11269 ) This ensures that all the prometheus monitoring and usage trackers to avoid alerts configured, although we cannot support v1 to v2 here - we can v2 to v3.	2021-01-13 09:58:08 -08:00
Klaus Post	51dad1d130	Fix missing GetObjectNInfo Closure (#11243 ) Review for missing Close of returned value from `GetObjectNInfo`. This was often obscured by the stuff that auto-unlocks when reaching EOF.	2021-01-08 10:12:26 -08:00
Harshavardhana	3e1221a01c	fix: log once updating dataUsageCache versions (#11190 ) also reduce usage of *bytes.Buffer for reading `usage-cache.bin`	2020-12-31 09:45:09 -08:00
Klaus Post	4bca62a0bd	crawler: Stream bucket usage cache data (#11068 ) Stream bucket caches to storage and through RPC calls.	2020-12-10 13:03:22 -08:00
Harshavardhana	dc819afa44	fix: auto update crawler meta version PR `038bcd9079` introduced version '3', we need to make sure that we do not print an unexpected error instead log a message to indicate we will auto update the version.	2020-12-08 10:40:51 -08:00
Ritesh H Shukla	038bcd9079	Add replication capacity metrics support in crawler (#10786 )	2020-12-07 13:47:48 -08:00
Klaus Post	e6ea5c2703	crawler: Missing folder heal check per set (#10876 )	2020-12-01 12:07:39 -08:00
Harshavardhana	ca88ca753c	ignore typed errors correctly in list cache layer (#10879 ) bonus write bucket metadata cache with enough quorum Possible fix for #10868	2020-11-12 09:28:56 -08:00
Klaus Post	493c714663	Remove erasureSets and erasureObjects from ObjectLayer (#10442 )	2020-09-10 09:18:19 -07:00
Klaus Post	c097ce9c32	continous healing based on crawler (#10103 ) Design: https://gist.github.com/klauspost/792fe25c315caf1dd15c8e79df124914	2020-08-24 13:47:01 -07:00
Harshavardhana	caad314faa	add ruleguard support, fix all the reported issues (#10335 )	2020-08-24 12:11:20 -07:00
Klaus Post	00d3cc4b69	Enforce quota checks after crawl (#10036 ) Enforce bucket quotas when crawling has finished. This ensures that we will not do quota enforcement on old data. Additionally, delete less if we are closer to quota than we thought.	2020-07-14 18:59:05 -07:00
Klaus Post	43d6e3ae06	merge object lifecycle checks into usage crawler (#9579 )	2020-06-12 10:28:21 -07:00
Harshavardhana	53aaa5d2a5	Export bucket usage counts as part of bucket metrics (#9710 ) Bonus fixes in quota enforcement to use the new datastructure and use timedValue to cache a value/reload automatically avoids one less global variable.	2020-05-27 06:45:43 -07:00
Harshavardhana	498389123e	avoid unnecessary logging on fresh/newly replaced drives (#9470 ) data usage tracker and crawler seem to be logging non-actionable information on console, which is not useful and is fixed on its own in almost all deployments, lets keep this logging to minimal.	2020-04-28 01:16:57 -07:00
Klaus Post	073aac3d92	add data update tracking using bloom filter (#9208 ) By monitoring PUT/DELETE and heal operations it is possible to track changed paths and keep a bloom filter for this data. This can help prioritize paths to scan. The bloom filter can identify paths that have not changed, and the few collisions will only result in a marginal extra workload. This can be implemented on either a bucket+(1 prefix level) with reasonable performance. The bloom filter is set to have a false positive rate at 1% at 1M entries. A bloom table of this size is about ~2500 bytes when serialized. To not force a full scan of all paths that have changed cycle bloom filters would need to be kept, so we guarantee that dirty paths have been scanned within cycle runs. Until cycle bloom filters have been collected all paths are considered dirty.	2020-04-27 10:06:21 -07:00
Harshavardhana	b1a2169dcc	fix: data usage crawler env handling, usage-cache.bin location (#9163 ) canonicalize the ENVs such that we can bring these ENVs as part of the config values, as a subsequent change. - fix location of per bucket usage to `.minio.sys/buckets/<bucket_name>/usage-cache.bin` - fix location of the overall usage in `json` at `.minio.sys/buckets/.usage.json` (avoid conflicts with a bucket named `usage.json` ) - fix location of the overall usage in `msgp` at `.minio.sys/buckets/.usage.bin` (avoid conflicts with a bucket named `usage.bin`	2020-03-19 09:47:47 -07:00
Klaus Post	8d98662633	re-implement data usage crawler to be more efficient (#9075 ) Implementation overview: https://gist.github.com/klauspost/1801c858d5e0df391114436fdad6987b	2020-03-18 16:19:29 -07:00

27 commits