Commit graph

136 commits

Author SHA1 Message Date
Harshavardhana bd2131ba34
add DNS cache support to avoid DNS flooding (#10693)
Go stdlib resolver doesn't support caching DNS
resolutions, since we compile with CGO disabled
we are more probe to DNS flooding for all network
calls to resolve for DNS from the DNS server.

Under various containerized environments such as
VMWare this becomes a problem because there are
no DNS caches available and we may end up overloading
the kube-dns resolver under concurrent I/O.

To circumvent this issue implement a DNSCache resolver
which resolves DNS and caches them for around 10secs
with every 3sec invalidation attempted.
2020-10-16 14:49:05 -07:00
Harshavardhana 2760fc86af
Bump default idleConnsPerHost to control conns in time_wait (#10653)
This PR fixes a hang which occurs quite commonly at higher concurrency
by allowing following changes

- allowing lower connections in time_wait allows faster socket open's
- lower idle connection timeout to ensure that we let kernel
  reclaim the time_wait connections quickly
- increase somaxconn to 4096 instead of 2048 to allow larger tcp
  syn backlogs.

fixes #10413
2020-10-12 14:19:46 -07:00
Harshavardhana 736e58dd68
fix: handle concurrent lockers with multiple optimizations (#10640)
- select lockers which are non-local and online to have
  affinity towards remote servers for lock contention

- optimize lock retry interval to avoid sending too many
  messages during lock contention, reduces average CPU
  usage as well

- if bucket is not set, when deleteObject fails make sure
  setPutObjHeaders() honors lifecycle only if bucket name
  is set.

- fix top locks to list out always the oldest lockers always,
  avoid getting bogged down into map's unordered nature.
2020-10-08 12:32:32 -07:00
Harshavardhana 1f9abbee4d
make sure to release locks upon timeout (#10596)
fixes #10418
2020-09-29 15:18:34 -07:00
Harshavardhana 37a5d5d7a0
reduce timeouts between servers for faster disconnects (#10562) 2020-09-24 20:10:07 -07:00
Krishna Srinivas 230fc0d186
Support for "directory" objects (#10499) 2020-09-19 08:39:41 -07:00
Klaus Post b7438fe4e6
Copy metadata before spawning goroutine + prealloc maps (#10458)
In `(*cacheObjects).GetObjectNInfo` copy the metadata before spawning a goroutine.

Clean up a few map[string]string copies as well, reducing allocs and simplifying the code.

Fixes #10426
2020-09-10 11:37:22 -07:00
Harshavardhana c13afd56e8
Remove MaxConnsPerHost settings to avoid potential hangs (#10438)
MaxConnsPerHost can potentially hang a call without any
way to timeout, we do not need this setting for our proxy
and gateway implementations instead IdleConn settings are
good enough.

Also ensure to use NewRequestWithContext and make sure to
take the disks offline only for network errors.

Fixes #10304
2020-09-08 14:22:04 -07:00
Anis Elleuch 46ee8659b4
fix write quorum calculation for bucket operations (#10364)
When the number of disks is odd, the calculation of quorum 
for bucket operations were not correct, fix it.
2020-08-27 12:55:32 -07:00
Harshavardhana 1e2ebc9945
feat: time to bring back http2.0 support (#10230)
Bonus move our CI/CD to go1.14
2020-08-10 09:02:29 -07:00
Harshavardhana 0b8255529a
fix: proxies set keep-alive timeouts to be system dependent (#10199)
Split the DialContext's one for internode and another
for all other external communications especially
proxy forwarders, gateway transport etc.
2020-08-04 14:55:53 -07:00
Klaus Post 968342c732
Remove usage of go-ieproxy for windows (#10009)
There is a potential for deadlock on Windows 10
refer https://github.com/mattn/go-ieproxy/issues/17 

remove this dependency for now.
2020-07-10 12:08:14 -07:00
Harshavardhana 72e0745e2f
fix: migrate to go.etcd.io import path (#9987)
with the merge of https://github.com/etcd-io/etcd/pull/11823
etcd v3.5.0 will now have a properly imported versioned path

this fixes our pending migration to newer repo
2020-07-07 19:04:29 -07:00
Klaus Post aa4d1021eb
Remove timeout from putobject and listobjects (#9986)
Use a separate client for these calls that can take a long time.

Add request context to these so they are canceled when the client 
disconnects instead except for ListObject which doesn't have any equivalent.
2020-07-07 12:19:57 -07:00
Harshavardhana e59ee14f40
Tune tcp keep-alives with new kernel timeout options (#9963)
For more deeper understanding https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
2020-07-03 10:03:41 -07:00
Harshavardhana 4915433bd2
Support bucket versioning (#9377)
- Implement a new xl.json 2.0.0 format to support,
  this moves the entire marshaling logic to POSIX
  layer, top layer always consumes a common FileInfo
  construct which simplifies the metadata reads.
- Implement list object versions
- Migrate to siphash from crchash for new deployments
  for object placements.

Fixes #2111
2020-06-12 20:04:01 -07:00
Klaus Post 95814359bd
cache disk info to avoid repeated calls (#9682)
This value is requested on every upload when there are multiple zones.

Since this will result in an RPC call to every remote disk this scales 
quite badly in a distributed setup. Load every 1second interval.

2 servers, localhost only. In large distributed setups much bigger 
gains can be expected.

```
Operations: 21743 -> 22454
* Average: +3.28% (+0.0 MiB/s) throughput, +3.28% (+11.9) obj/s
* Fastest: +3.37% (+0.0 MiB/s) throughput, +3.37% (+13.0) obj/s
* 50% Median: +3.03% (+0.0 MiB/s) throughput, +3.03% (+11.2) obj/s
* Slowest: +8.03% (+0.0 MiB/s) throughput, +8.03% (+22.8) obj/s
```

For easy management of this a generic helper has been added.
2020-05-26 12:52:24 -07:00
Harshavardhana 6ac48a65cb
fix: use unused cacheMetrics code in prometheus (#9588)
remove all other unusued/deadcode
2020-05-13 08:15:26 -07:00
Anis Elleuch 6d76efb9bb
Add support of TCP fast open in internode calls (#9486) 2020-05-08 14:33:23 -07:00
Harshavardhana 60d415bb8a
deprecate/remove global WORM mode (#9436)
global WORM mode is a complex piece for which
the time has passed, with the advent of S3 compatible
object locking and retention implementation global
WORM is sort of deprecated, this has been mentioned
in our documentation for some time, now the time
has come for this to go.
2020-04-24 16:37:05 -07:00
poornas 582953260b
Increase response header timeout for gateway (#9400)
fixes: #9295
2020-04-21 19:21:27 -07:00
Sidhartha Mani 3e78ea8acc
improve obd tests and optimize network (#9378)
- keep long running obd network tests alive
- fix error - wrong number of parents in process OBD info
- ensure that osinfo does not error out when inside containers
- remove limit on max number of connections per client transport

The generic client transport uses a default limit of 64 conns per transport.
This could end up limiting and throttling usage, and artificially slowing
down the performance of MinIO even on hardware capable of doing better.
2020-04-18 11:06:11 -07:00
Klaus Post c4464e36c8
fix: limit HTTP transport tuables to affordable values (#9383)
Close connections pro-actively in transient calls
2020-04-17 11:20:56 -07:00
Harshavardhana f44cfb2863
use GlobalContext whenever possible (#9280)
This change is throughout the codebase to
ensure that all codepaths honor GlobalContext
2020-04-09 09:30:02 -07:00
Harshavardhana 30707659b5
[feature] allow for an odd number of erasure packs (#9221)
Too many deployments come up with an odd number
of hosts or drives, to facilitate even distribution
among those setups allow for odd and prime numbers
based packs.
2020-03-31 09:32:16 -07:00
Sidhartha Mani 0c80bf45d0
Implement oboard diagnostics admin API (#9024)
- Implement a graph algorithm to test network bandwidth from every 
  node to every other node
- Saturate any network bandwidth adaptively, accounting for slow 
  and fast network capacity
- Implement parallel drive OBD tests
- Implement a paging mechanism for OBD test to provide periodic updates to client
- Implement Sys, Process, Host, Mem OBD Infos
2020-03-26 21:07:39 -07:00
Anis Elleuch 791821d590
sa: Allow empty policy to indicate parent user's policy is inherited (#9185) 2020-03-23 14:17:18 -07:00
Harshavardhana 3d3beb6a9d
Add response header timeouts (#9170)
- Add conservative timeouts upto 3 minutes
  for internode communication
- Add aggressive timeouts of 30 seconds
  for gateway communication

Fixes #9105
Fixes #8732
Fixes #8881
Fixes #8376
Fixes #9028
2020-03-21 22:10:13 -07:00
Anis Elleuch 23a0415eb7
profiling: Fix crash when enabling goroutines profiling (#9097)
This commit replaces 'goroutines' with 'goroutine' when passing it
to pprof library when activating goroutine type profiling
2020-03-06 13:22:47 -08:00
poornas 9fc7537f2a
Enforce md5sum checks for object retention APIs (#9030)
this PR enforces md5sum verification for following
API's to be compatible with AWS S3 spec
 - PutObjectRetention
 - PutObjectLegalHold

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-03-04 07:04:12 -08:00
Klaus Post f1b2462193
Add goroutine profiles (#9078)
Allow downloading goroutine dump to help detect leaks
or overuse of goroutines.

Extensions are now type dependent.

Change `profiling` -> `profile` prefix, since that is what they are 
not the abstract concept.
2020-03-04 06:58:12 -08:00
Harshavardhana 712e82344c
acl: Support PUT calls with success for 'private' ACL's (#9000)
Add dummy calls which respond success when ACL's
are set to be private and fails, if user tries
to change them from their default 'private'

Some applications such as nuxeo may have an
unnecessary requirement for this operation,
we support this anyways such that don't have
to fully implement the functionality just that
we can respond with success for default ACLs
2020-02-16 11:37:52 +05:30
Harshavardhana c56c2f5fd3
fix routing issue for esoteric characters in gorilla/mux (#8967)
First step is to ensure that Path component is not decoded
by gorilla/mux to avoid routing issues while handling
certain characters while uploading through PutObject()

Delay the decoding and use PathUnescape() to escape
the `object` path component.

Thanks to @buengese and @ncw for neat test cases for us
to test with.

Fixes #8950
Fixes #8647
2020-02-12 09:08:02 +05:30
Harshavardhana d7dc9aaf52
fix: remove response header timeout (#8919)
Adding respone header timeout seems to have
premature timeout like consequences which
leads to potential disconnections.
2020-02-01 08:31:55 +05:30
Klaus Post c7178d2066 Profiling: Add base, fix memory profiling (#8850)
For 'snapshot' type profiles, record a 'before' profile that can be used 
as `go tool pprof -base=before ...` to compare before and after.

"Before" profiles are included in the zipped package.

[`runtime.MemProfileRate`](https://golang.org/pkg/runtime/#pkg-variables) 
should not be updated while the application is running, so we set it at startup.

Co-authored-by: Harshavardhana <harsha@minio.io>
2020-01-21 15:49:25 -08:00
Harshavardhana f14f60a487 fix: Avoid double usage calculation on every restart (#8856)
On every restart of the server, usage was being
calculated which is not useful instead wait for
sufficient time to start the crawling routine.

This PR also avoids lots of double allocations
through strings, optimizes usage of string builders
and also avoids crawling through symbolic links.

Fixes #8844
2020-01-21 14:07:49 -08:00
poornas 60e60f68dd Add support for object locking with legal hold. (#8634) 2020-01-16 15:41:56 -08:00
Klaus Post d8660b30cc Reduce MemProfileRate (#8814)
Enabling the memory profiling has a significant impact on performance.

Reduce the profiling rate by 2 orders of magnitude. It is still 128x smaller than default so it should be plenty.
2020-01-14 16:18:45 -08:00
poornas 30922148fb Fix bug preventing overwrite of object if (#8796)
object lock config is enabled for a bucket.

Creating a bucket with object lock configuration
enabled does not automatically cause WORM protection
to be applied. PUT operation needs to specifically
request object locking or bucket has to have default
retention settings configured.

Fixes regression introduced in #8657
2020-01-13 17:29:31 -08:00
Klaus Post 2bf6cf0e15 Enable multiple concurrent profile types (#8792) 2020-01-10 17:19:58 -08:00
Harshavardhana 5aa5dcdc6d
lock: improve locker initialization at init (#8776)
Use reference format to initialize lockers
during startup, also handle `nil` for NetLocker
in dsync and remove *errorLocker* implementation

Add further tuning parameters such as

 - DialTimeout is now 15 seconds from 30 seconds
 - KeepAliveTimeout is not 20 seconds, 5 seconds
   more than default 15 seconds
 - ResponseHeaderTimeout to 10 seconds
 - ExpectContinueTimeout is reduced to 3 seconds
 - DualStack is enabled by default remove setting
   it to `true`
 - Reduce IdleConnTimeout to 30 seconds from
   1 minute to avoid idleConn build up

Fixes #8773
2020-01-10 02:35:06 -08:00
Harshavardhana abc1c1070a Add custom policy claim name (#8764)
In certain organizations policy claim names
can be not just 'policy' but also things like
'roles', the value of this field might also
be *string* or *[]string* support this as well

In this PR we are still not supporting multiple
policies per STS account which will require a
more comprehensive change.
2020-01-08 17:21:58 -08:00
Harshavardhana 5f2318567e
Allow metadata updates on meta bucket even in WORM mode (#8657)
This ensures that we can update the

- .minio.sys is updated for accounting/data usage purposes
- .minio.sys is updated to indicate if backend is encrypted
  or not.
2019-12-17 10:13:12 -08:00
Anis Elleuch 555969ee42 Add data usage collect with its new admin API (#8553)
Admin data usage info API returns the following

(Only FS & XL, for now)

- Number of buckets
- Number of objects
- The total size of objects
- Objects histogram
- Bucket sizes
2019-12-12 06:02:37 -08:00
Harshavardhana fd0fa4e5c5 Add NTP retention time (#8548) 2019-11-21 18:22:35 +05:30
poornas ca96560d56 Add object retention at the per object (#8528)
level - this PR builds on #8120 which
added PutBucketObjectLockConfiguration and
GetBucketObjectLockConfiguration APIS

This PR implements PutObjectRetention,
GetObjectRetention API and enhances
PUT and GET API operations to display
governance metadata if permissions allow.
2019-11-20 13:18:09 -08:00
Harshavardhana e9b2bf00ad Support MinIO to be deployed on more than 32 nodes (#8492)
This PR implements locking from a global entity into
a more localized set level entity, allowing for locks
to be held only on the resources which are writing
to a collection of disks rather than a global level.

In this process this PR also removes the top-level
limit of 32 nodes to an unlimited number of nodes. This
is a precursor change before bring in bucket expansion.
2019-11-13 12:17:45 -08:00
Bala FA fb48ca5020 Add Get/Put Bucket Lock Configuration API support (#8120)
This feature implements [PUT Bucket object lock configuration][1] and
[GET Bucket object lock configuration][2]. After object lock
configuration is set, existing and new objects are set to WORM for
specified duration. Currently Governance mode works exactly like
Compliance mode.

Fixes #8101

[1] https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketPUTObjectLockConfiguration.html
[2] https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGETObjectLockConfiguration.html
2019-11-12 14:50:18 -08:00
Harshavardhana ee4a6a823d Migrate config to KV data format (#8392)
- adding oauth support to MinIO browser (#8400) by @kanagaraj
- supports multi-line get/set/del for all config fields
- add support for comments, allow toggle
- add extensive validation of config before saving
- support MinIO browser to support proper claims, using STS tokens
- env support for all config parameters, legacy envs are also
  supported with all documentation now pointing to latest ENVs
- preserve accessKey/secretKey from FS mode setups
- add history support implements three APIs
  - ClearHistory
  - RestoreHistory
  - ListHistory
- add help command support for each config parameters
- all the bug fixes after migration to KV, and other bug
  fixes encountered during testing.
2019-10-22 22:59:13 -07:00
Praveen raj Mani 8700945cdf Handle connection failures on webhook/url pings (#8204)
Properly handle connection failures while replaying events

Fixes #8194
2019-09-12 16:44:51 -07:00