Commit graph

77 commits

Author SHA1 Message Date
Harshavardhana 347b29d059 Implement bucket expansion (#8509) 2019-11-19 17:42:27 -08:00
Harshavardhana e9b2bf00ad Support MinIO to be deployed on more than 32 nodes (#8492)
This PR implements locking from a global entity into
a more localized set level entity, allowing for locks
to be held only on the resources which are writing
to a collection of disks rather than a global level.

In this process this PR also removes the top-level
limit of 32 nodes to an unlimited number of nodes. This
is a precursor change before bring in bucket expansion.
2019-11-13 12:17:45 -08:00
Praveen raj Mani fa325665b1 Do not append the endpoint for fs/xl disks in StorageInfo (#8472) 2019-10-31 09:13:54 -07:00
Praveen raj Mani 8836d57e3c The prometheus metrics refractoring (#8003)
The measures are consolidated to the following metrics

- `disk_storage_used` : Disk space used by the disk.
- `disk_storage_available`: Available disk space left on the disk.
- `disk_storage_total`: Total disk space on the disk.
- `disks_offline`: Total number of offline disks in current MinIO instance.
- `disks_total`: Total number of disks in current MinIO instance.
- `s3_requests_total`: Total number of s3 requests in current MinIO instance.
- `s3_errors_total`: Total number of errors in s3 requests in current MinIO instance.
- `s3_requests_current`: Total number of active s3 requests in current MinIO instance.
- `internode_rx_bytes_total`: Total number of internode bytes received by current MinIO server instance.
- `internode_tx_bytes_total`: Total number of bytes sent to the other nodes by current MinIO server instance.
- `s3_rx_bytes_total`: Total number of s3 bytes received by current MinIO server instance.
- `s3_tx_bytes_total`: Total number of s3 bytes sent by current MinIO server instance.
- `minio_version_info`: Current MinIO version with commit-id.
- `s3_ttfb_seconds_bucket`: Histogram that holds the latency information of the requests.

And this PR also modifies the current StorageInfo queries

- Decouples StorageInfo from ServerInfo .
- StorageInfo is enhanced to give endpoint information.

NOTE: ADMIN API VERSION IS BUMPED UP IN THIS PR

Fixes #7873
2019-10-22 21:01:14 -07:00
Harshavardhana 68a519a468
Use errgroups instead of sync.WaitGroup as needed (#8354) 2019-10-14 09:44:51 -07:00
Harshavardhana 3b8adf7528 Move storageclass config handling into cmd/config/storageclass (#8360)
Continuation of the changes done in PR #8351 to refactor,
add tests and move global handling into a more idiomatic
style for Go as packages.
2019-10-07 11:20:24 +05:30
Harshavardhana 0cd0f6c255
Avoid error modification during IAM migration (#8156)
The underlying errors are important, for IAM
requirements and should wait appropriately at
the caller level, this allows for distributed
setups to run properly and not fail prematurely
during startup.

Also additionally fix the onlineDisk counting
2019-08-30 10:41:02 -07:00
Harshavardhana d6dd98e597
Avoid data-race in getDisksInfo call (#8126) 2019-08-23 17:03:15 -07:00
Praveen raj Mani e211f6f52e Parallelize the DiskInfo calls in xl.StorageInfo() (#8115) 2019-08-22 20:02:40 -07:00
Harshavardhana 620e462413 Implement S3-HDFS gateway (#7440)
- [x] Support bucket and regular object operations
- [x] Supports Select API on HDFS
- [x] Implement multipart API support
- [x] Completion of ListObjects support
2019-04-17 09:52:08 -07:00
kannappanr 5ecac91a55
Replace Minio refs in docs with MinIO and links (#7494) 2019-04-09 11:39:42 -07:00
Harshavardhana 0188009c7e Expose total and available disk space (#7453) 2019-04-05 09:51:50 +05:30
Harshavardhana df35d7db9d Introduce staticcheck for stricter builds (#7035) 2019-02-13 18:29:36 +05:30
Harshavardhana a63bc9254d Add 'disk' tag to log output to enhance 'disk not found' errors (#6460) 2018-09-13 21:42:50 -07:00
Krishna Srinivas 52f6d5aafc Rename of structs and methods (#6230)
Rename of ErasureStorage to Erasure (and rename of related variables and methods)
2018-08-23 23:35:37 -07:00
Harshavardhana 556a51120c Deprecate ListLocks and ClearLocks (#6233)
No locks are ever left in memory, we also
have a periodic interval of clearing stale locks
anyways. The lock instrumentation was not complete
and was seldom used.

Deprecate this for now and bring it back later if
it is really needed. This also in-turn seems to improve
performance slightly.
2018-08-02 23:09:42 +05:30
Harshavardhana ad86454580 Make sure to handle FaultyDisks in listing ops (#6204)
Continuing from PR 157ed65c35

Our posix.go implementation did not handle  I/O errors
properly on the disks, this led to situations where
top-level callers such as ListObjects might return early
without even verifying all the available disks.

This commit tries to address this in Kubernetes, drbd/nbd based
persistent volumes which can disconnect under load and
result in the situations with disks return I/O errors.

This commit also simplifies listing operation, listing
never returns any error. We can avoid this since we pretty
much ignore most of the errors anyways. When objects are
accessed directly we return proper errors.
2018-07-27 15:32:19 -07:00
Harshavardhana 000e360196 Deprecate showing drive capacity and total free (#5976)
This addresses a situation that we shouldn't be
displaying Total/Free anymore, instead we should simply
show the total usage.
2018-05-23 17:30:25 -07:00
Harshavardhana e6ec645035 Implement support for calculating disk usage per tenant (#5969)
Fixes #5961
2018-05-23 15:41:29 +05:30
Bala FA 0d52126023 Enhance policy handling to support SSE and WORM (#5790)
- remove old bucket policy handling
- add new policy handling
- add new policy handling unit tests

This patch brings support to bucket policy to have more control not
limiting to anonymous.  Bucket owner controls to allow/deny any rest
API.

For example server side encryption can be controlled by allowing
PUT/GET objects with encryptions including bucket owner.
2018-04-24 15:53:30 -07:00
kannappanr cef992a395
Remove error package and cause functions (#5784) 2018-04-10 09:36:37 -07:00
kannappanr f8a3fd0c2a
Create logger package and rename errorIf to LogIf (#5678)
Removing message from error logging
Replace errors.Trace with LogIf
2018-04-05 15:04:40 -07:00
Harshavardhana 85a57d2021 Make sure to close the disk connections (#5752)
Since we do not re-use storageDisks after moving
the connections to object layer we should close them
appropriately otherwise we have a lot of connection
leaks and these can compound as the time goes by.

This PR also refactors the initialization code to
re-use storageDisks for given set of endpoints until
we have confirmed a valid reference format.
2018-04-04 10:28:48 +05:30
Harshavardhana 6e9c853312 After healing re-load disks with the new format (#5718)
This PR also fixes correct calculation of drive states
before and after healing of objects.

Fixes #5700
Fixes #5708
2018-03-28 06:41:39 +05:30
Krishna Srinivas e452377b24 Add context to the object-interface methods.
Make necessary changes to xl fs azure sia
2018-03-15 16:28:25 -07:00
Krishna Srinivas 9083bc152e Flat multipart backend implementation for Erasure backend (#5447) 2018-03-15 13:55:23 -07:00
Harshavardhana fb96779a8a Add large bucket support for erasure coded backend (#5160)
This PR implements an object layer which
combines input erasure sets of XL layers
into a unified namespace.

This object layer extends the existing
erasure coded implementation, it is assumed
in this design that providing > 16 disks is
a static configuration as well i.e if you started
the setup with 32 disks with 4 sets 8 disks per
pack then you would need to provide 4 sets always.

Some design details and restrictions:

- Objects are distributed using consistent ordering
  to a unique erasure coded layer.
- Each pack has its own dsync so locks are synchronized
  properly at pack (erasure layer).
- Each pack still has a maximum of 16 disks
  requirement, you can start with multiple
  such sets statically.
- Static sets set of disks and cannot be
  changed, there is no elastic expansion allowed.
- Static sets set of disks and cannot be
  changed, there is no elastic removal allowed.
- ListObjects() across sets can be noticeably
  slower since List happens on all servers,
  and is merged at this sets layer.

Fixes #5465
Fixes #5464
Fixes #5461
Fixes #5460
Fixes #5459
Fixes #5458
Fixes #5460
Fixes #5488
Fixes #5489
Fixes #5497
Fixes #5496
2018-02-15 17:45:57 -08:00
Harshavardhana 8de6cf4124 update dsync implementation to fix a regression (#5513)
Currently minio master requires 4 servers, we
have decided to run on a minimum of 2 servers
instead - fixes a regression from previous
releases where 3 server setups were supported.
2018-02-12 15:16:12 +05:30
poornas 4f73fd9487 Unify gateway and object layer. (#5487)
* Unify gateway and object layer. Bring bucket policies into
object layer.
2018-02-09 15:19:30 -08:00
Harshavardhana 0c880bb852 Deprecate and remove in-memory object caching (#5481)
in-memory caching cannot be cleanly implemented
without the access to GC which Go doesn't naturally
provide. At times we have seen that object caching
is more of an hindrance rather than a boon for
our use cases.

Removing it completely from our implementation
  related to #5160 and #5182
2018-02-02 10:17:13 -08:00
Nitish Tiwari e2d5a87b26 Fix free and total space reported in startup banner (#5419)
With storage class support, the free and total space
reported in Minio XL startup banner should be based on
totalDisks - standardClassParityDisks, instead of totalDisks/2.

fixes #5416
2018-01-17 11:25:51 -08:00
poornas 0bb6247056 Move nslocking from s3 layer to object layer (#5382)
Fixes #5350
2018-01-13 10:04:52 +05:30
Nitish Tiwari 42633748db
Update madmin package to return storage class parity (#5387)
After the addition of Storage Class support, readQuorum
and writeQuorum are decided on a per object basis, instead
of deployment wide static quorums.

This PR updates madmin api to remove readQuorum/writeQuorum
and add Standard storage class and reduced redundancy storage
class parity as return values. Since these parity values are
used to decide the quorum for each object.

Fixes #5378
2018-01-12 07:52:52 +05:30
Nitish Tiwari 545a9e4a82 Fix storage class related issues (#5322)
- Add storage class metadata validation for request header
- Change storage class header values to be consistent with AWS S3
- Refactor internal method to take only the reqd argument
2017-12-27 10:06:16 +05:30
Nitish Tiwari 1a3dbbc9dd
Add x-amz-storage-class support (#5295)
This adds configurable data and parity options on a per object
basis. To use variable parity

- Users can set environment variables to cofigure variable
parity

- Then add header x-amz-storage-class to putobject requests
with relevant storage class values

Fixes #4997
2017-12-22 16:58:13 +05:30
Harshavardhana 490c30f853
erasure: Support cleaning up of stale multipart objects (#5250)
Just like our single directory/disk setup, this PR brings
the functionality to cleanup stale multipart objects
older > 2 weeks.
2017-11-30 18:11:42 -08:00
Harshavardhana 8efa82126b
Convert errors tracer into a separate package (#5221) 2017-11-25 11:58:29 -08:00
Nitish Tiwari fcc61fa46a Remove minimum inodes reqd check (#4747) 2017-08-03 20:07:22 -07:00
Harshavardhana 075b8903d7 fs: Add safe locking semantics for format.json (#4523)
This patch also reverts previous changes which were
merged for migration to the newer disk format. We will
be bringing these changes in subsequent releases. But
we wish to add protection in this release such that
future release migrations are protected.

Revert "fs: Migration should handle bucketConfigs as regular objects. (#4482)"
This reverts commit 976870a391.

Revert "fs: Migrate object metadata to objects directory. (#4195)"
This reverts commit 76f4f20609.
2017-06-12 17:40:28 -07:00
Harshavardhana 7765081db7 cache: Increasing caching GC percent from 20 to 50. (#4041)
Previous value was set to avoid large cache value build
up but we can clearly see this can cause lots of GC
pauses which can lead to significant drop in performance.

Change this value to 50% and decrease the value to 25%
once the 75% cache size is used. To have a larger
window for GC pauses.

Another change is to only allow caching if a server has
more than 24GB of RAM instead of 8GB.
2017-04-15 02:16:49 -07:00
Aditya Manthramurthy 604417baf4 Allow cluster to start when only n/2 servers are up (#4066)
Fixes #3234.

Relaxes the quorum requirement to start the object layer, and skips
quick-healing at start-up (as no write quorum is present).
2017-04-09 00:28:27 -07:00
Bala FA 2df8160f6a server: handle command line and env variables at one place. (#3975) 2017-03-30 11:21:19 -07:00
Harshavardhana 43317530d5 Fix odd shadowing bug in XL init. (#3874)
Fixes #3873
2017-03-08 20:42:45 -08:00
Harshavardhana 47ac410ab0 Code cleanup - simplify server side code. (#3870)
Fix all the issues reported by `gosimple` tool.
2017-03-08 10:00:47 -08:00
Harshavardhana e49efcb9d9 xl: quickHeal heal bucket only when needed. (#3854)
This improves the startup time significantly
for clusters which have lot of buckets.

Also fixes a bug where `.minio.sys` is created
on disks which do not have `format.json`
2017-03-06 02:00:15 -08:00
Bala FA 208dd15245 Remove globalMaxCacheSize and globalCacheExpiry variables (#3826)
This patch fixes below

* Remove global variables globalMaxCacheSize and globalCacheExpiry.
* Make global variables into constant in objcache package.
2017-03-02 10:34:37 -08:00
Harshavardhana 9df01035da Remove XL references in public docs to Erasure. (#3725)
Ref #3722
2017-02-09 23:26:44 -08:00
Krishnan Parthasarathi 586058f079 Implement mgmt REST APIs to heal storage format. (#3604)
* Implement heal format REST API handler
* Implement admin peer rpc handler to re-initialize storage
* Implement HealFormat API in pkg/madmin
* Update pkg/madmin API.md to incl. HealFormat
* Added unit tests for ReInitDisks rpc handler and HealFormatHandler
2017-01-23 00:32:55 -08:00
Harshavardhana 1c699d8d3f fs: Re-implement object layer to remember the fd (#3509)
This patch re-writes FS backend to support shared backend sharing locks for safe concurrent access across multiple servers.
2017-01-16 17:05:00 -08:00
Harshavardhana 7bbb532b4b Add a isErr function to check for errs.
DisksInfo() should handle collection of some
base errors as offlineDisks.
2017-01-02 10:52:43 -08:00