Commit graph

12 commits

Author SHA1 Message Date
Harshavardhana 83a82d818e
allow lock tolerance to match storage-class drive tolerance (#10270) 2020-08-14 18:17:14 -07:00
Harshavardhana fe157166ca
fix: Pass context all the way down to the network call in lockers (#10161)
Context timeouts might race with each other when the
timeouts are low, i.e. when two lock attempts happen
very quickly on the same resource while the servers
are still trying to establish quorum.

This situation can lead to locks being held which
would never be unlocked, so subsequent lock attempts
would fail.

Recovering would require a complete server restart.
One way this issue can surface is when a server is
booting up and trying to hold a 'transaction.lock'
in quick bursts with short timeouts.
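A minimal sketch of the idea, with hypothetical types
(`restLocker` and the simulated round trip are not the real
code): the caller's context is forwarded all the way into the
locker's network call, so an expired request cancels its own
RPC instead of racing against a shared, longer-lived context.

```
package main

import (
	"context"
	"fmt"
	"time"
)

// restLocker is a hypothetical stand-in for a dsync network locker.
type restLocker struct{ addr string }

// Lock forwards ctx into the (simulated) network call; if the
// caller's deadline fires, this attempt is abandoned cleanly
// instead of leaving a lock held with no owner to unlock it.
func (l *restLocker) Lock(ctx context.Context, resource string) (bool, error) {
	select {
	case <-time.After(50 * time.Millisecond): // simulated RPC round trip
		return true, nil
	case <-ctx.Done():
		return false, ctx.Err()
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	ok, err := (&restLocker{addr: "node1:9000"}).Lock(ctx, "transaction.lock")
	fmt.Println(ok, err) // false, context deadline exceeded
}
```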
2020-07-29 23:15:34 -07:00
Harshavardhana 3b9fbf80ad
fix: make sure to use new restClient for healthcheck (#10026)
Without instantiating a new rest client we can
hit a recursive error which can lead to the
healthcheck always returning offline; this can
prematurely take the servers offline.
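A hedged sketch of the failure mode and the fix, with
hypothetical names: the offline probe uses its own dedicated
HTTP client, so probing can never recurse through the very
client whose health is in question.

```
package main

import (
	"net/http"
	"sync/atomic"
	"time"
)

// restClient is a hypothetical simplification of an internal client.
type restClient struct {
	online int32 // 1 = online, accessed atomically
}

func (c *restClient) IsOnline() bool { return atomic.LoadInt32(&c.online) == 1 }

// MarkOffline starts a background probe using a *fresh* http.Client.
// Re-using the failing client here would route the probe through its
// own error handling, recursing and pinning the status to offline.
func (c *restClient) MarkOffline(healthURL string) {
	atomic.StoreInt32(&c.online, 0)
	go func() {
		probe := &http.Client{Timeout: time.Second} // dedicated client
		for {
			if resp, err := probe.Get(healthURL); err == nil {
				resp.Body.Close()
				atomic.StoreInt32(&c.online, 1)
				return
			}
			time.Sleep(time.Second)
		}
	}()
}

func main() {
	c := &restClient{online: 1}
	c.MarkOffline("http://node1:9000/minio/health/live")
	_ = c.IsOnline()
}
```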
2020-07-11 22:19:38 -07:00
Klaus Post 968342c732
Remove usage of go-ieproxy for windows (#10009)
There is a potential for deadlock on Windows 10,
see https://github.com/mattn/go-ieproxy/issues/17

Remove this dependency for now.
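With the dependency removed, a transport can fall back to Go's
standard environment-based proxy resolution; a sketch only, the
actual wiring in the code may differ.

```
package main

import "net/http"

func main() {
	// Instead of ieproxy's system proxy lookup (which can deadlock
	// on Windows 10), honor HTTP_PROXY/HTTPS_PROXY/NO_PROXY the
	// standard library way.
	tr := &http.Transport{Proxy: http.ProxyFromEnvironment}
	_ = &http.Client{Transport: tr}
}
```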
2020-07-10 12:08:14 -07:00
Harshavardhana d55f4336ae
preserve context per request for local locks (#9828)
The bug was that we were re-using the context
from previously granted lockers; this would
cause lock timeouts for existing valid
read or write locks, leading to premature
expiry of those locks.

This bug affects only local lockers in FS
or standalone erasure-coded mode. The issue
is rather historical as well: it was present
in lsync for some time but we were lucky
not to hit it.

Similar changes are made in dsync as well
to keep the code more familiar.

Fixes #9827
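A minimal sketch of the fix, with hypothetical types: the
locker consults only the context of the request at hand, never
a context cached from an earlier grant, so one caller's deadline
cannot expire another holder's lock.

```
package main

import (
	"context"
	"sync"
	"time"
)

// localLocker is a hypothetical simplification of a local lock table.
type localLocker struct {
	mu   sync.Mutex
	held map[string]bool
}

// Lock retries until granted or until *this request's* ctx expires.
// No context is stored on the locker, so existing valid read/write
// locks can no longer be timed out by someone else's deadline.
func (l *localLocker) Lock(ctx context.Context, resource string) (bool, error) {
	for {
		l.mu.Lock()
		if !l.held[resource] {
			l.held[resource] = true
			l.mu.Unlock()
			return true, nil
		}
		l.mu.Unlock()
		select {
		case <-ctx.Done(): // only this request's deadline applies
			return false, ctx.Err()
		case <-time.After(10 * time.Millisecond): // retry backoff
		}
	}
}

func main() {
	l := &localLocker{held: map[string]bool{}}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	_, _ = l.Lock(ctx, "bucket/object")
}
```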
2020-06-14 07:43:10 -07:00
Harshavardhana febe9cc26a
fix: avoid timer leaks in dsync/lsync (#9781)
At a customer setup with lots of concurrent calls
it can be observed that newRetryTimer made lots of
tiny allocations which were not relinquished upon
retries; in this codepath we are only interested in
re-using a single timer per locker and resetting it
wisely (see the sketch after the profile below).

```
(pprof) top
Showing nodes accounting for 8.68TB, 97.02% of 8.95TB total
Dropped 1198 nodes (cum <= 0.04TB)
Showing top 10 nodes out of 79
      flat  flat%   sum%        cum   cum%
    5.95TB 66.50% 66.50%     5.95TB 66.50%  time.NewTimer
    1.16TB 13.02% 79.51%     1.16TB 13.02%  github.com/ncw/directio.AlignedBlock
    0.67TB  7.53% 87.04%     0.70TB  7.78%  github.com/minio/minio/cmd.xlObjects.putObject
    0.21TB  2.36% 89.40%     0.21TB  2.36%  github.com/minio/minio/cmd.(*posix).Walk
    0.19TB  2.08% 91.49%     0.27TB  2.99%  os.statNolog
    0.14TB  1.59% 93.08%     0.14TB  1.60%  os.(*File).readdirnames
    0.10TB  1.09% 94.17%     0.11TB  1.25%  github.com/minio/minio/cmd.readDirN
    0.10TB  1.07% 95.23%     0.10TB  1.07%  syscall.ByteSliceFromString
    0.09TB  1.03% 96.27%     0.09TB  1.03%  strings.(*Builder).grow
    0.07TB  0.75% 97.02%     0.07TB  0.75%  path.(*lazybuf).append
```
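The shape of the fix suggested by the profile, as a hedged
sketch (`tryLock` and the backoff value are placeholders):
allocate one timer per retry loop and `Reset` it, rather than
calling `time.NewTimer` on every attempt.

```
package main

import (
	"context"
	"time"
)

// lockWithRetry is a hypothetical retry loop; tryLock stands in
// for a real lock attempt.
func lockWithRetry(ctx context.Context, tryLock func() bool, backoff time.Duration) bool {
	// One timer for the whole loop; time.NewTimer per retry is
	// what shows up as 5.95TB of allocations in the profile above.
	t := time.NewTimer(backoff)
	defer t.Stop()
	for {
		if tryLock() {
			return true
		}
		select {
		case <-t.C:
			t.Reset(backoff) // reuse the same timer allocation
		case <-ctx.Done():
			return false
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	_ = lockWithRetry(ctx, func() bool { return false }, 10*time.Millisecond)
}
```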
2020-06-08 11:28:40 -07:00
Harshavardhana b768645fde
fix: unexpected logging with bucket metadata conversions (#9519) 2020-05-04 20:04:06 -07:00
Harshavardhana 30707659b5
[feature] allow for an odd number of erasure packs (#9221)
Too many deployments come up with an odd number
of hosts or drives; to facilitate even distribution
among those setups, allow packs based on odd and
prime numbers as well.
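A hypothetical sketch of the distribution rule: pick the
largest candidate set size that divides the drive count evenly,
where the candidates now include odd and prime sizes, not only
even ones. The candidate range and selection rule here are
illustrative, not the actual implementation.

```
package main

import "fmt"

// commonSetSize picks the largest size in [2,16] that evenly
// divides the drive count; odd and prime sizes are allowed.
func commonSetSize(totalDrives int) int {
	best := 0
	for s := 2; s <= 16; s++ {
		if totalDrives%s == 0 {
			best = s
		}
	}
	return best
}

func main() {
	fmt.Println(commonSetSize(9))  // 9: an odd drive count maps to odd packs
	fmt.Println(commonSetSize(21)) // 7: prime-sized packs
}
```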
2020-03-31 09:32:16 -07:00
Harshavardhana ab7d3cd508
fix: Speed up multi-object delete by taking bulk locks (#8974)
Change distributed locking to allow taking bulk locks
across objects, reducing what is usually 1000 calls to 1.

This also handles situations where multiple clients
send delete requests for objects with the following names

```
{1,2,3,4,5}
```

```
{5,4,3,2,1}
```

such requests will block and ensure that we do not
fail each other's requests.
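A sketch under an assumed API (a `Locker.Lock` that takes
multiple resources is hypothetical): sort the names into a
canonical order and take them in one lock call, so clients
arriving with {1..5} and {5..1} contend identically instead of
failing each other.

```
package main

import (
	"context"
	"sort"
)

// Locker is a hypothetical interface; dsync's real API differs.
type Locker interface {
	Lock(ctx context.Context, resources ...string) error
}

// lockBulk acquires all names in one call (1 network round trip
// instead of len(names)), in canonical sorted order.
func lockBulk(ctx context.Context, l Locker, names []string) error {
	sorted := append([]string(nil), names...)
	sort.Strings(sorted) // both clients now request the same order
	return l.Lock(ctx, sorted...)
}

type noopLocker struct{}

func (noopLocker) Lock(ctx context.Context, resources ...string) error { return nil }

func main() {
	_ = lockBulk(context.Background(), noopLocker{}, []string{"5", "4", "3", "2", "1"})
}
```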
2020-02-21 11:29:57 +05:30
Harshavardhana 5aa5dcdc6d
lock: improve locker initialization at init (#8776)
Use the reference format to initialize lockers
during startup, also handle `nil` for NetLocker
in dsync and remove the *errorLocker* implementation

Add further tuning parameters, sketched below:

 - DialTimeout is now 15 seconds, down from 30 seconds
 - KeepAliveTimeout is now 20 seconds, 5 seconds
   more than the default of 15 seconds
 - ResponseHeaderTimeout is set to 10 seconds
 - ExpectContinueTimeout is reduced to 3 seconds
 - DualStack is enabled by default, so remove
   setting it to `true` explicitly
 - IdleConnTimeout is reduced to 30 seconds from
   1 minute to avoid idle connection build-up

Fixes #8773
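The listed values expressed as a Go `http.Transport`; the field
names are from `net/http`, but the exact wiring in the actual
code may differ.

```
package main

import (
	"net"
	"net/http"
	"time"
)

func newCustomTransport() *http.Transport {
	return &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   15 * time.Second, // DialTimeout, down from 30s
			KeepAlive: 20 * time.Second, // 5s above the 15s default
			// Dual-stack dialing is on by default, so it is not set.
		}).DialContext,
		ResponseHeaderTimeout: 10 * time.Second,
		ExpectContinueTimeout: 3 * time.Second,
		IdleConnTimeout:       30 * time.Second, // down from 1 minute
	}
}

func main() {
	_ = &http.Client{Transport: newCustomTransport()}
}
```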
2020-01-10 02:35:06 -08:00
Harshavardhana 347b29d059 Implement bucket expansion (#8509) 2019-11-19 17:42:27 -08:00
Harshavardhana e9b2bf00ad Support MinIO to be deployed on more than 32 nodes (#8492)
This PR moves locking from a global entity to
a more localized set-level entity, allowing locks
to be held only on the resources which are writing
to a collection of disks rather than at a global level.

In this process the PR also removes the top-level
limit of 32 nodes, allowing an unlimited number of
nodes. This is a precursor change before bringing in
bucket expansion.
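A hypothetical sketch of the change: an object hashes to one
erasure set and only that set's locker is used, so adding sets
(and nodes) no longer contends on one global lock. The crc32
hash here is illustrative only; the real set hashing differs.

```
package main

import (
	"fmt"
	"hash/crc32"
)

// lockerForObject picks the locker for the erasure set the
// object hashes to, instead of one global locker.
func lockerForObject(object string, setLockers []string) string {
	h := crc32.ChecksumIEEE([]byte(object))
	return setLockers[int(h)%len(setLockers)]
}

func main() {
	sets := []string{"set-0", "set-1", "set-2", "set-3"}
	fmt.Println(lockerForObject("bucket/object.txt", sets))
}
```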
2019-11-13 12:17:45 -08:00