Allow null versions to be replicated (#13310)

for pre-existing objects present in a bucket
prior to enabling existing object replication.

Co-authored-by: Poorna Krishnamoorthy <poorna@minio.io>
This commit is contained in:
Poorna Krishnamoorthy 2021-09-28 13:26:12 -04:00 committed by GitHub
parent 38027c8f52
commit 7f6ed35347
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
5 changed files with 22 additions and 7 deletions

View file

@ -732,7 +732,12 @@ func equals(k1 string, keys ...string) bool {
}
// returns replicationAction by comparing metadata between source and target
func getReplicationAction(oi1 ObjectInfo, oi2 minio.ObjectInfo) replicationAction {
func getReplicationAction(oi1 ObjectInfo, oi2 minio.ObjectInfo, opType replication.Type) replicationAction {
// Avoid resyncing null versions created prior to enabling replication if target has a newer copy
if opType == replication.ExistingObjectReplicationType &&
oi1.ModTime.Unix() > oi2.LastModified.Unix() && oi1.VersionID == nullVersionID {
return replicateNone
}
// needs full replication
if oi1.ETag != oi2.ETag ||
oi1.VersionID != oi2.VersionID ||
@ -1053,9 +1058,19 @@ func replicateObjectToTarget(ctx context.Context, ri ReplicateObjectInfo, object
ReplicationProxyRequest: "false",
}})
if cerr == nil {
rAction = getReplicationAction(objInfo, oi)
rAction = getReplicationAction(objInfo, oi, ri.OpType)
rinfo.ReplicationStatus = replication.Completed
if rAction == replicateNone {
if ri.OpType == replication.ExistingObjectReplicationType &&
objInfo.ModTime.Unix() > oi.LastModified.Unix() && objInfo.VersionID == nullVersionID {
logger.LogIf(ctx, fmt.Errorf("Unable to replicate %s/%s (null). Newer version exists on target", bucket, object))
sendEvent(eventArgs{
EventName: event.ObjectReplicationNotTracked,
BucketName: bucket,
Object: objInfo,
Host: "Internal: [Replication]",
})
}
// object with same VersionID already exists, replication kicked off by
// PutObject might have completed
if objInfo.TargetReplicationStatus(tgt.ARN) == replication.Pending || objInfo.TargetReplicationStatus(tgt.ARN) == replication.Failed || ri.OpType == replication.ExistingObjectReplicationType {
@ -1651,7 +1666,7 @@ func (c replicationConfig) Resync(ctx context.Context, oi ObjectInfo, dsc *Repli
return
}
// existing object replication does not apply to un-versioned objects
if oi.VersionID == "" || oi.VersionID == nullVersionID {
if oi.VersionID == "" {
return
}

View file

@ -39,7 +39,7 @@ Note that synchronous replication, i.e. when remote target is configured with --
Existing object replication works similar to regular replication. Objects qualifying for existing object replication are detected when scanner runs, and will be replicated if existing object replication is enabled and applicable replication rules are satisfied. Because replication depends on the immutability of versions, only pre-existing objects created while versioning was enabled can be replicated. Even if replication rules are disabled and re-enabled later, the objects created during the interim will be synced as the scanner queues them. For saving iops, objects qualifying for
existing object replication are not marked as `PENDING` prior to replication.
Note that objects with `null` versions, i.e. objects created prior to enabling versioning cannot be replicated as this would break the immutability guarantees provided by versioning. For replicating such objects, `mc cp alias/bucket/object alias/bucket/object` can be performed to create a server side copy of the object as a versioned object - this versioned object will replicate if replication is enabled and the previously present `null` version can then be deleted.
Note that objects with `null` versions, i.e. objects created prior to enabling versioning break the immutability guarantees provided by versioning. When existing object replication is enabled, these objects will be replicated as `null` versions to the remote targets provided they are not present on the target or if `null` version of object on source is newer than the `null` version of object on target.
If the remote site is fully lost and objects previously replicated need to be re-synced, the `mc replicate resync` command with optional flag of `--older-than` needs to be used to trigger re-syncing of previously replicated objects. This command generates a ResetID which is a unique UUID saved to the remote target config along with the applicable date(defaults to time of initiating the reset). All objects created prior to this date are eligible for re-replication if existing object replication is enabled for the replication rule the object satisfies. At the time of completion of replication, `X-Minio-Replication-Reset-Status` is set in the metadata with the timestamp of replication and ResetID. For saving iops, the objects which are re-replicated are not first set to `PENDING` state.

View file

@ -213,8 +213,6 @@ Existing object replication as detailed [here](https://aws.amazon.com/blogs/stor
Once existing object replication is enabled, all objects or object prefixes that satisfy the replication rules and were created prior to adding replication configuration OR while replication rules were disabled will be synced to the target cluster. Depending on the number of previously existing objects, the existing objects that are now eligible to be replicated will eventually be synced to the target cluster as the scanner schedules them. This may be slower depending on the load on the cluster, latency and size of the namespace.
Note that existing object replication requires that the objects being replicated were created while versioning was enabled. Objects with null version (i.e. those created when versioning is not enabled or versioning is suspended) will not be replicated.
In the rare event that target DR site is entirely lost and previously replicated objects to the DR cluster need to be re-replicated, `mc replicate resync alias/bucket` can be used to initiate a reset. This would initiate a re-sync between the two clusters on a lower priority as the scanner picks up these objects to re-sync.
This is an expensive operation and should be initiated only once - progress of the syncing can be monitored by looking at Prometheus metrics. If object version has been re-replicated, `mc stat --vid --debug` on this version shows an additional header `X-Minio-Replication-Reset-Status` with the replication timestamp and ResetID generated at the time of issuing the `mc replicate resync` command.

2
go.mod
View file

@ -46,7 +46,7 @@ require (
github.com/minio/highwayhash v1.0.2
github.com/minio/kes v0.14.0
github.com/minio/madmin-go v1.1.6
github.com/minio/minio-go/v7 v7.0.15-0.20210921183434-174b4c070788
github.com/minio/minio-go/v7 v7.0.15-0.20210928020726-a58653d41dd8
github.com/minio/parquet-go v1.0.0
github.com/minio/pkg v1.1.3
github.com/minio/selfupdate v0.3.1

2
go.sum
View file

@ -1032,6 +1032,8 @@ github.com/minio/minio-go/v7 v7.0.11-0.20210607181445-e162fdb8e584/go.mod h1:Woy
github.com/minio/minio-go/v7 v7.0.14/go.mod h1:S23iSP5/gbMwtxeY5FM71R+TkAYyzEdoNEDDwpt8yWs=
github.com/minio/minio-go/v7 v7.0.15-0.20210921183434-174b4c070788 h1:O+/N9vxhoObjuCuQczycuzdG240SoLrrdnyipJ5JJc0=
github.com/minio/minio-go/v7 v7.0.15-0.20210921183434-174b4c070788/go.mod h1:pUV0Pc+hPd1nccgmzQF/EXh48l/Z/yps6QPF1aaie4g=
github.com/minio/minio-go/v7 v7.0.15-0.20210928020726-a58653d41dd8 h1:+8oaaj9Gm/yJFvR09Mz4iefX7pSYcFw6d6fMfvr1Mow=
github.com/minio/minio-go/v7 v7.0.15-0.20210928020726-a58653d41dd8/go.mod h1:pUV0Pc+hPd1nccgmzQF/EXh48l/Z/yps6QPF1aaie4g=
github.com/minio/operator v0.0.0-20210812082324-26350f153661 h1:dGAJHpfmhNukFg0M0wDqH+G1OB2YPgZCcT6uv4n9YQk=
github.com/minio/operator v0.0.0-20210812082324-26350f153661/go.mod h1:zQqn6VGT46xlSpVXh1I/VZRv+eSgHtVu6URdg71YKX8=
github.com/minio/operator/logsearchapi v0.0.0-20210812082324-26350f153661 h1:tJw15hS3b1dVTf5PwA4roXZ/oRNnHyZ/8Y+yNTmQ5rA=