// Copyright 2016-2018, Pulumi Corporation.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package resource
import (
	"archive/tar"
	"archive/zip"
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/ioutil"
	"net/http"
	"net/url"
	"os"
	"path/filepath"
	"reflect"
	"regexp"
	"sort"
	"strings"
	"time"

	"github.com/pkg/errors"

	"github.com/pulumi/pulumi/pkg/util/contract"
	"github.com/pulumi/pulumi/pkg/util/httputil"
	"github.com/pulumi/pulumi/pkg/workspace"
)

// Asset is a serialized asset reference. It is a union: thus, only one of its fields will be non-nil. Several helper
// routines exist as members in order to easily interact with the assets referenced by an instance of this type.
type Asset struct {
	// Sig is the unique asset type signature (see properties.go).
	Sig string `json:"4dabf18193072939515e22adb298388d" yaml:"4dabf18193072939515e22adb298388d"`
	// Hash is the SHA256 hash of the asset's contents.
	Hash string `json:"hash,omitempty" yaml:"hash,omitempty"`
	// Text is set to a non-empty value for textual assets.
	Text string `json:"text,omitempty" yaml:"text,omitempty"`
	// Path will contain a non-empty path to the file on the current filesystem for file assets.
	Path string `json:"path,omitempty" yaml:"path,omitempty"`
	// URI will contain a non-empty URI (file://, http://, https://, or custom) for URI-backed assets.
	URI string `json:"uri,omitempty" yaml:"uri,omitempty"`
}

const (
	AssetSig          = "c44067f5952c0a294b673a41bacd8c17" // a randomly assigned type hash for assets.
	AssetHashProperty = "hash"                             // the dynamic property for an asset's hash.
	AssetTextProperty = "text"                             // the dynamic property for an asset's text.
	AssetPathProperty = "path"                             // the dynamic property for an asset's path.
	AssetURIProperty  = "uri"                              // the dynamic property for an asset's URI.
)

// NewTextAsset produces a new asset and its corresponding SHA256 hash from the given text.
func NewTextAsset(text string) (*Asset, error) {
	a := &Asset{Sig: AssetSig, Text: text}
	err := a.EnsureHash()
	return a, err
}

// NewPathAsset produces a new asset and its corresponding SHA256 hash from the given filesystem path.
func NewPathAsset(path string) (*Asset, error) {
	a := &Asset{Sig: AssetSig, Path: path}
	err := a.EnsureHash()
	return a, err
}

// NewURIAsset produces a new asset and its corresponding SHA256 hash from the given network URI.
func NewURIAsset(uri string) (*Asset, error) {
	a := &Asset{Sig: AssetSig, URI: uri}
	err := a.EnsureHash()
	return a, err
}

func (a *Asset) IsText() bool { return !a.IsPath() && !a.IsURI() }
func (a *Asset) IsPath() bool { return a.Path != "" }
func (a *Asset) IsURI() bool  { return a.URI != "" }

func (a *Asset) GetText() (string, bool) {
	if a.IsText() {
		return a.Text, true
	}
	return "", false
}

func (a *Asset) GetPath() (string, bool) {
	if a.IsPath() {
		return a.Path, true
	}
	return "", false
}

func (a *Asset) GetURI() (string, bool) {
	if a.IsURI() {
		return a.URI, true
	}
	return "", false
}

var (
	functionRegexp    = regexp.MustCompile(`function __.*`)
	withRegexp        = regexp.MustCompile(` with\({ .* }\) {`)
	environmentRegexp = regexp.MustCompile(` }\).apply\(.*\).apply\(this, arguments\);`)
	preambleRegexp    = regexp.MustCompile(
		`function __.*\(\) {\n return \(function\(\) {\n with \(__closure\) {\n\nreturn `)
	postambleRegexp = regexp.MustCompile(
		`;\n\n }\n }\).apply\(__environment\).apply\(this, arguments\);\n}`)
)

// IsUserProgramCode checks to see if this is the special asset containing the user's program code.
func (a *Asset) IsUserProgramCode() bool {
	if !a.IsText() {
		return false
	}

	text := a.Text

	return functionRegexp.MatchString(text) &&
		withRegexp.MatchString(text) &&
		environmentRegexp.MatchString(text)
}

// MassageIfUserProgramCodeAsset takes the text for a function and cleans it up a bit to make the
// user-visible diffs less noisy. Specifically:
// 1. it tries to condense things by changing multiple blank lines into a single blank line.
// 2. it normalizes the SHA hashes we emit so that changes to them don't appear in the diff.
// 3. it elides the with-capture headers, as changes there are not generally meaningful.
//
// TODO(https://github.com/pulumi/pulumi/issues/592) this is baking in a lot of knowledge about
// pulumi serialized functions. We should try to move to an alternative mode that isn't so brittle.
// Options include:
// 1. Have a documented delimiter format that plan.go will look for. Have the function serializer
//    emit those delimiters around code that should be ignored.
// 2. Have our resource generation code supply not just the resource, but the "user presentable"
//    resource that cuts out a lot of cruft. We could then just diff that content here.
func MassageIfUserProgramCodeAsset(asset *Asset, debug bool) *Asset {
	if debug {
		return asset
	}

	// Only do this for strings that match our serialized function pattern.
	if !asset.IsUserProgramCode() {
		return asset
	}

	text := asset.Text
	replaceNewlines := func() {
		for {
			newText := strings.Replace(text, "\n\n\n", "\n\n", -1)
			if len(newText) == len(text) {
				break
			}

			text = newText
		}
	}

	replaceNewlines()

	firstFunc := functionRegexp.FindStringIndex(text)
	text = text[firstFunc[0]:]

	text = withRegexp.ReplaceAllString(text, " with (__closure) {")
	text = environmentRegexp.ReplaceAllString(text, " }).apply(__environment).apply(this, arguments);")

	text = preambleRegexp.ReplaceAllString(text, "")
	text = postambleRegexp.ReplaceAllString(text, "")

	replaceNewlines()

	return &Asset{Text: text}
}

// GetURIURL returns the underlying URI as a parsed URL, provided it is one. If there was an error parsing the URI, it
// will be returned as a non-nil error object.
func (a *Asset) GetURIURL() (*url.URL, bool, error) {
	if uri, isuri := a.GetURI(); isuri {
		url, err := url.Parse(uri)
		if err != nil {
			return nil, true, err
		}
		return url, true, nil
	}
	return nil, false, nil
}

// Equals returns true if a is value-equal to other. In this case, value equality is determined only by the hash: even
// if the contents of two assets come from different sources, they are treated as equal if their hashes match.
// Similarly, if the contents of two assets come from the same source but the assets have different hashes, the assets
// are not equal.
func (a *Asset) Equals(other *Asset) bool {
	if a == nil {
		return other == nil
	} else if other == nil {
		return false
	}

	// If we can't get a hash for both assets, treat them as differing.
	if err := a.EnsureHash(); err != nil {
		return false
	}
	if err := other.EnsureHash(); err != nil {
		return false
	}
	return a.Hash == other.Hash
}

// Serialize returns a weakly typed map that contains the right signature for serialization purposes.
func (a *Asset) Serialize() map[string]interface{} {
	result := map[string]interface{}{
		SigKey: AssetSig,
	}
	if a.Hash != "" {
		result[AssetHashProperty] = a.Hash
	}
	if a.Text != "" {
		result[AssetTextProperty] = a.Text
	}
	if a.Path != "" {
		result[AssetPathProperty] = a.Path
	}
	if a.URI != "" {
		result[AssetURIProperty] = a.URI
	}
	return result
}
|
|
|
|
|
|
|
|
// DeserializeAsset checks to see if the map contains an asset, using its signature, and if so deserializes it.
|
2017-10-23 00:54:44 +02:00
|
|
|
func DeserializeAsset(obj map[string]interface{}) (*Asset, bool, error) {
|
2017-10-22 22:39:21 +02:00
|
|
|
// If not an asset, return false immediately.
|
Implement more precise delete-before-replace semantics. (#2369)
This implements the new algorithm for deciding which resources must be
deleted due to a delete-before-replace operation.
We need to compute the set of resources that may be replaced by a
change to the resource under consideration. We do this by taking the
complete set of transitive dependents on the resource under
consideration and removing any resources that would not be replaced by
changes to their dependencies. We determine whether or not a resource
may be replaced by substituting unknowns for input properties that may
change due to deletion of the resources their value depends on and
calling the resource provider's Diff method.
This is perhaps clearer when described by example. Consider the
following dependency graph:
A
__|__
B C
| _|_
D E F
In this graph, all of B, C, D, E, and F transitively depend on A. It may
be the case, however, that changes to the specific properties of any of
those resources R that would occur if a resource on the path to A were
deleted and recreated may not cause R to be replaced. For example, the
edge from B to A may be a simple dependsOn edge such that a change to
B does not actually influence any of B's input properties. In that case,
neither B nor D would need to be deleted before A could be deleted.
In order to make the above algorithm a reality, the resource monitor
interface has been updated to include a map that associates an input
property key with the list of resources that input property depends on.
Older clients of the resource monitor will leave this map empty, in
which case all input properties will be treated as depending on all
dependencies of the resource. This is probably overly conservative, but
it is less conservative than what we currently implement, and is
certainly correct.
	if obj[SigKey] != AssetSig {
		return &Asset{}, false, nil
	}

	// Else, deserialize the possible fields.
	var hash string
	if v, has := obj[AssetHashProperty]; has {
		hash = v.(string)
	}
	var text string
	if v, has := obj[AssetTextProperty]; has {
		text = v.(string)
	}
	var path string
	if v, has := obj[AssetPathProperty]; has {
		path = v.(string)
	}
	var uri string
	if v, has := obj[AssetURIProperty]; has {
		uri = v.(string)
	}

	return &Asset{Hash: hash, Text: text, Path: path, URI: uri}, true, nil
}
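The repeated comma-ok lookups above can be factored into a small helper. The following standalone sketch is illustrative only (`extractString` and the `main` wiring are hypothetical, not part of this package); unlike the bare `v.(string)` assertions above, it uses the two-result form so a non-string value yields an empty string rather than a panic:

```go
package main

import "fmt"

// extractString mirrors the optional-field pattern used for hash, text,
// path, and uri: look the key up, then type-assert to string if present.
func extractString(obj map[string]interface{}, key string) string {
	if v, has := obj[key]; has {
		if s, ok := v.(string); ok {
			return s
		}
	}
	return ""
}

func main() {
	obj := map[string]interface{}{"text": "hello", "hash": "abc123"}
	fmt.Println(extractString(obj, "text"))       // present: returns the value
	fmt.Println(extractString(obj, "path") == "") // absent: returns ""
}
```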

// HasContents indicates whether or not an asset's contents can be read.
func (a *Asset) HasContents() bool {
	return a.IsText() || a.IsPath() || a.IsURI()
}

// Bytes returns the contents of the asset as a byte slice.
func (a *Asset) Bytes() ([]byte, error) {
	// If this is a text asset, just return its bytes directly.
	if text, istext := a.GetText(); istext {
		return []byte(text), nil
	}

	blob, err := a.Read()
	if err != nil {
		return nil, err
	}
	defer contract.IgnoreClose(blob)
	return ioutil.ReadAll(blob)
}

// Read begins reading an asset.
func (a *Asset) Read() (*Blob, error) {
	if a.IsText() {
		return a.readText()
	} else if a.IsPath() {
		return a.readPath()
	} else if a.IsURI() {
		return a.readURI()
	}
	return nil, errors.New("unrecognized asset type")
}

func (a *Asset) readText() (*Blob, error) {
	text, istext := a.GetText()
	contract.Assertf(istext, "Expected a text-based asset")
	return NewByteBlob([]byte(text)), nil
}

func (a *Asset) readPath() (*Blob, error) {
	path, ispath := a.GetPath()
	contract.Assertf(ispath, "Expected a path-based asset")

	file, err := os.Open(path)
	if err != nil {
		return nil, errors.Wrapf(err, "failed to open asset file '%v'", path)
	}

	// Do a quick check to make sure it's a file, so we can fail gracefully if someone passes a directory.
	info, err := file.Stat()
	if err != nil {
		contract.IgnoreClose(file)
		return nil, errors.Wrapf(err, "failed to stat asset file '%v'", path)
	}
	if info.IsDir() {
		contract.IgnoreClose(file)
		return nil, errors.Errorf("asset path '%v' is a directory; try using an archive", path)
	}

	blob := &Blob{
		rd: file,
		sz: info.Size(),
	}
	return blob, nil
}

func (a *Asset) readURI() (*Blob, error) {
	url, isurl, err := a.GetURIURL()
	if err != nil {
		return nil, err
	}
	contract.Assertf(isurl, "Expected a URI-based asset")
	switch s := url.Scheme; s {
	case "http", "https":
		resp, err := httputil.GetWithRetry(url.String(), http.DefaultClient)
		if err != nil {
			return nil, err
		}
		return NewReadCloserBlob(resp.Body)
	case "file":
		contract.Assert(url.User == nil)
		contract.Assert(url.RawQuery == "")
		contract.Assert(url.Fragment == "")
		if url.Host != "" && url.Host != "localhost" {
			return nil, errors.Errorf("file:// host '%v' not supported (only localhost)", url.Host)
		}
		f, err := os.Open(url.Path)
		if err != nil {
			return nil, err
		}
		return NewFileBlob(f)
	default:
		return nil, errors.Errorf("unrecognized or unsupported URI scheme: %v", s)
	}
}

// EnsureHash computes the SHA256 hash of the asset's contents and stores it on the object.
func (a *Asset) EnsureHash() error {
	if a.Hash == "" {
		blob, err := a.Read()
		if err != nil {
			return err
		}
		defer contract.IgnoreClose(blob)

		hash := sha256.New()
		if _, err = io.Copy(hash, blob); err != nil {
			return err
		}
		a.Hash = hex.EncodeToString(hash.Sum(nil))
	}
	return nil
}

// Blob is a blob that implements ReadCloser and offers Size functionality.
type Blob struct {
	rd io.ReadCloser // an underlying reader.
	sz int64         // the size of the blob.
}

func (blob *Blob) Close() error               { return blob.rd.Close() }
func (blob *Blob) Read(p []byte) (int, error) { return blob.rd.Read(p) }
func (blob *Blob) Size() int64                { return blob.sz }

// NewByteBlob creates a new byte blob.
func NewByteBlob(data []byte) *Blob {
	return &Blob{
		rd: ioutil.NopCloser(bytes.NewReader(data)),
		sz: int64(len(data)),
	}
}

// NewFileBlob creates a new asset blob whose size is known thanks to stat.
func NewFileBlob(f *os.File) (*Blob, error) {
	stat, err := f.Stat()
	if err != nil {
		return nil, err
	}
	return &Blob{
		rd: f,
		sz: stat.Size(),
	}, nil
}

// NewReadCloserBlob turns any old ReadCloser into a Blob, usually by making a copy.
func NewReadCloserBlob(r io.ReadCloser) (*Blob, error) {
	if f, isf := r.(*os.File); isf {
		// If it's a file, we can "fast path" the asset creation without making a copy.
		return NewFileBlob(f)
	}
	// Otherwise, read it all in, and create a blob out of that.
	defer contract.IgnoreClose(r)
	data, err := ioutil.ReadAll(r)
	if err != nil {
		return nil, err
	}
	return NewByteBlob(data), nil
}

// Archive is a serialized archive reference. It is a union: thus, only one of its fields will be non-nil. Several
// helper routines exist as members in order to easily interact with archives of different kinds.
type Archive struct {
	// Sig is the unique archive type signature (see properties.go).
	Sig string `json:"4dabf18193072939515e22adb298388d" yaml:"4dabf18193072939515e22adb298388d"`
	// Hash contains the SHA256 hash of the archive's contents.
	Hash string `json:"hash,omitempty" yaml:"hash,omitempty"`
	// Assets, when non-nil, is a collection of other assets/archives.
	Assets map[string]interface{} `json:"assets,omitempty" yaml:"assets,omitempty"`
	// Path is a non-empty string representing a path to a file on the current filesystem, for file archives.
	Path string `json:"path,omitempty" yaml:"path,omitempty"`
	// URI is a non-empty URI (file://, http://, https://, etc), for URI-backed archives.
	URI string `json:"uri,omitempty" yaml:"uri,omitempty"`
}

const (
	ArchiveSig            = "0def7320c3a5731c473e5ecbe6d01bc7" // a randomly assigned archive type signature.
	ArchiveHashProperty   = "hash"                             // the dynamic property for an archive's hash.
	ArchiveAssetsProperty = "assets"                           // the dynamic property for an archive's assets.
	ArchivePathProperty   = "path"                             // the dynamic property for an archive's path.
	ArchiveURIProperty    = "uri"                              // the dynamic property for an archive's URI.
)

func NewAssetArchive(assets map[string]interface{}) (*Archive, error) {
	// Ensure all elements are either assets or archives.
	for _, asset := range assets {
		switch t := asset.(type) {
		case *Asset, *Archive:
			// ok
		default:
			return &Archive{}, errors.Errorf("type %v is not a valid archive element", t)
		}
	}
	a := &Archive{Sig: ArchiveSig, Assets: assets}
	err := a.EnsureHash()
	return a, err
}

func NewPathArchive(path string) (*Archive, error) {
	a := &Archive{Sig: ArchiveSig, Path: path}
	err := a.EnsureHash()
	return a, err
}

func NewURIArchive(uri string) (*Archive, error) {
	a := &Archive{Sig: ArchiveSig, URI: uri}
	err := a.EnsureHash()
	return a, err
}
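NewAssetArchive validates a heterogeneous `map[string]interface{}` with a type switch: each element must be an asset or an archive (archives nest). A standalone sketch of that validation step, using hypothetical stand-in types (`asset`, `archive`, and `validateElements` are illustrative, not this package's API):

```go
package main

import "fmt"

// asset and archive are hypothetical stand-ins for *Asset and *Archive.
type asset struct{}
type archive struct{}

// validateElements mirrors the element check in NewAssetArchive: walk
// the map and reject any value that is not an asset or an archive.
func validateElements(m map[string]interface{}) error {
	for name, el := range m {
		switch el.(type) {
		case *asset, *archive:
			// ok: assets and archives may nest inside an archive.
		default:
			return fmt.Errorf("element '%s' has type %T, which is not a valid archive element", name, el)
		}
	}
	return nil
}

func main() {
	ok := map[string]interface{}{"a": &asset{}, "b": &archive{}}
	bad := map[string]interface{}{"c": 42}
	fmt.Println(validateElements(ok) == nil, validateElements(bad) != nil)
}
```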
|
2017-07-14 21:28:43 +02:00
|
|
|
|
2017-10-22 22:39:21 +02:00
|
|
|
func (a *Archive) IsAssets() bool { return a.Assets != nil }
|
|
|
|
func (a *Archive) IsPath() bool { return a.Path != "" }
|
|
|
|
func (a *Archive) IsURI() bool { return a.URI != "" }
|
Implement archives
Our initial implementation of assets was intentionally naive, because
they were limited to single-file assets. However, it turns out that for
real scenarios (like lambdas), we want to support multi-file assets.
In this change, we introduce the concept of an Archive. An archive is
what the term classically means: a collection of files, addressed as one.
For now, we support three kinds: tarfile archives (*.tar), gzip-compressed
tarfile archives (*.tgz, *.tar), and normal zipfile archives (*.zip).
There is a fair bit of library support for manipulating Archives as a
logical collection of Assets. I've gone to great length to avoid making
copies, however, sometimes it is unavoidable (for example, when sizes
are required in order to emit offsets). This is also complicated by the
fact that the AWS libraries often want seekable streams, if not actual
raw contiguous []byte slices.
2017-04-30 21:37:24 +02:00
|
|
|
|

func (a *Archive) GetAssets() (map[string]interface{}, bool) {
	if a.IsAssets() {
		return a.Assets, true
	}
	return nil, false
}

func (a *Archive) GetPath() (string, bool) {
	if a.IsPath() {
		return a.Path, true
	}
	return "", false
}

func (a *Archive) GetURI() (string, bool) {
	if a.IsURI() {
		return a.URI, true
	}
	return "", false
}

// GetURIURL returns the underlying URI as a parsed URL, provided it is one. If there was an error parsing the URI, it
// will be returned as a non-nil error object.
func (a *Archive) GetURIURL() (*url.URL, bool, error) {
	if uri, isuri := a.GetURI(); isuri {
		url, err := url.Parse(uri)
		if err != nil {
			return nil, true, err
		}
		return url, true, nil
	}
	return nil, false, nil
}

// Equals returns true if a is value-equal to other. In this case, value equality is determined only by the hash: even
// if the contents of two archives come from different sources, they are treated as equal if their hashes match.
// Similarly, if the contents of two archives come from the same source but the archives have different hashes, the
// archives are not equal.
func (a *Archive) Equals(other *Archive) bool {
	if a == nil {
		return other == nil
	} else if other == nil {
		return false
	}

	// If we can't get a hash for both archives, treat them as differing.
	if err := a.EnsureHash(); err != nil {
		return false
	}
	if err := other.EnsureHash(); err != nil {
		return false
	}
	return a.Hash == other.Hash
}
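Equals deliberately compares archives by content hash alone, so two archives with identical bytes compare equal regardless of where those bytes came from. A minimal standalone sketch of that idea, using SHA-256 directly as a stand-in for this package's EnsureHash machinery (hashOf is illustrative, not part of this package):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashOf returns the hex-encoded SHA-256 of some contents. Equality decided
// by this hash depends only on the bytes, not on their source.
func hashOf(contents []byte) string {
	sum := sha256.Sum256(contents)
	return hex.EncodeToString(sum[:])
}

func main() {
	// Identical contents from "different sources" hash identically.
	fmt.Println(hashOf([]byte("hello")) == hashOf([]byte("hello"))) // true
	// Different contents never compare equal.
	fmt.Println(hashOf([]byte("hello")) == hashOf([]byte("world"))) // false
}
```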

// Serialize returns a weakly typed map that contains the right signature for serialization purposes.
func (a *Archive) Serialize() map[string]interface{} {
	result := map[string]interface{}{
		SigKey: ArchiveSig,
	}
	if a.Hash != "" {
		result[ArchiveHashProperty] = a.Hash
	}
	if a.Assets != nil {
		assets := make(map[string]interface{})
		for k, v := range a.Assets {
			switch t := v.(type) {
			case *Asset:
				assets[k] = t.Serialize()
			case *Archive:
				assets[k] = t.Serialize()
			default:
				contract.Failf("Unrecognized asset map type %v", reflect.TypeOf(t))
			}
		}
		result[ArchiveAssetsProperty] = assets
	}
	if a.Path != "" {
		result[ArchivePathProperty] = a.Path
	}
	if a.URI != "" {
		result[ArchiveURIProperty] = a.URI
	}
	return result
}

// DeserializeArchive checks to see if the map contains an archive, using its signature, and if so deserializes it.
func DeserializeArchive(obj map[string]interface{}) (*Archive, bool, error) {
	// If not an archive, return false immediately.
	if obj[SigKey] != ArchiveSig {
		return &Archive{}, false, nil
	}

	var hash string
	if v, has := obj[ArchiveHashProperty]; has {
		hash = v.(string)
	}

	var assets map[string]interface{}
	if v, has := obj[ArchiveAssetsProperty]; has {
		assets = make(map[string]interface{})
		if v != nil {
			for k, elem := range v.(map[string]interface{}) {
				switch t := elem.(type) {
				case *Asset:
					assets[k] = t
				case *Archive:
					assets[k] = t
				case map[string]interface{}:
					a, isa, err := DeserializeAsset(t)
					if err != nil {
						return &Archive{}, false, err
					} else if isa {
						assets[k] = a
					} else {
						arch, isarch, err := DeserializeArchive(t)
						if err != nil {
							return &Archive{}, false, err
						} else if !isarch {
							return &Archive{}, false, errors.Errorf("archive member '%v' is not an asset or archive", k)
						}
						assets[k] = arch
					}
				default:
					return &Archive{}, false, nil
				}
			}
		}
	}

	var path string
	if v, has := obj[ArchivePathProperty]; has {
		path = v.(string)
	}
	var uri string
	if v, has := obj[ArchiveURIProperty]; has {
		uri = v.(string)
	}

	return &Archive{Hash: hash, Assets: assets, Path: path, URI: uri}, true, nil
}
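The deserialization pattern above hinges on a signature property: a weakly typed map is only treated as an archive if its signature key matches, and optional properties are then extracted one by one with `if v, has := obj[...]` guards. A small standalone sketch of that shape, with made-up constants standing in for this package's SigKey and ArchiveSig (the real values differ):

```go
package main

import "fmt"

// Illustrative stand-ins for SigKey and ArchiveSig; not the real values.
const (
	sigKey     = "sig"
	archiveSig = "archive-v1"
)

// deserialize mirrors DeserializeArchive's shape: reject maps whose signature
// property doesn't match, then pull out optional properties individually.
func deserialize(obj map[string]interface{}) (string, bool) {
	if obj[sigKey] != archiveSig {
		return "", false
	}
	var hash string
	if v, has := obj["hash"]; has {
		hash, _ = v.(string)
	}
	return hash, true
}

func main() {
	hash, ok := deserialize(map[string]interface{}{sigKey: archiveSig, "hash": "abc123"})
	fmt.Println(hash, ok) // abc123 true
	_, ok = deserialize(map[string]interface{}{"hash": "abc123"})
	fmt.Println(ok) // false: no signature, not an archive
}
```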

// HasContents indicates whether or not an archive's contents can be read.
func (a *Archive) HasContents() bool {
	return a.IsAssets() || a.IsPath() || a.IsURI()
}

// ArchiveReader presents the contents of an archive as a stream of named blobs.
type ArchiveReader interface {
	// Next returns the name and contents of the next member of the archive. If there are no more members in the
	// archive, this function returns ("", nil, io.EOF). The blob returned by a call to Next() must be read in full
	// before the next call to Next().
	Next() (string, *Blob, error)

	// Close terminates the stream.
	Close() error
}

// Open returns an ArchiveReader that can be used to iterate over the named blobs that comprise the archive.
func (a *Archive) Open() (ArchiveReader, error) {
	contract.Assertf(a.HasContents(), "cannot read an archive that has no contents")
	if a.IsAssets() {
		return a.readAssets()
	} else if a.IsPath() {
		return a.readPath()
	} else if a.IsURI() {
		return a.readURI()
	}
	return nil, errors.New("unrecognized archive type")
}

// assetsArchiveReader is used to read an Assets archive.
type assetsArchiveReader struct {
	assets      map[string]interface{}
	keys        []string
	archive     ArchiveReader
	archiveRoot string
}

func (r *assetsArchiveReader) Next() (string, *Blob, error) {
	for {
		// If we're currently flattening out a subarchive, first check to see if it has any more members. If it does,
		// return the next member.
		if r.archive != nil {
			name, blob, err := r.archive.Next()
			switch {
			case err == io.EOF:
				// The subarchive is complete. Nil it out and continue on.
				r.archive = nil
			case err != nil:
				// The subarchive produced a legitimate error; return it.
				return "", nil, err
			default:
				// The subarchive produced a valid blob. Return it.
				return filepath.Join(r.archiveRoot, name), blob, nil
			}
		}

		// If there are no more members in this archive, return io.EOF.
		if len(r.keys) == 0 {
			return "", nil, io.EOF
		}

		// Fetch the next key in the archive and slice it off of the list.
		name := r.keys[0]
		r.keys = r.keys[1:]

		asset := r.assets[name]
		switch t := asset.(type) {
		case *Asset:
			// An asset can be produced directly.
			blob, err := t.Read()
			if err != nil {
				return "", nil, errors.Wrapf(err, "failed to expand archive asset '%v'", name)
			}
			return name, blob, nil
		case *Archive:
			// An archive must be flattened into its constituent blobs. Open the archive for reading and loop.
			archive, err := t.Open()
			if err != nil {
				return "", nil, errors.Wrapf(err, "failed to expand sub-archive '%v'", name)
			}
			r.archive = archive
			r.archiveRoot = name
		}
	}
}

func (r *assetsArchiveReader) Close() error {
	if r.archive != nil {
		return r.archive.Close()
	}
	return nil
}

func (a *Archive) readAssets() (ArchiveReader, error) {
	// To read a map-based archive, just produce a map from each asset to its associated reader.
	m, isassets := a.GetAssets()
	contract.Assertf(isassets, "Expected an asset map-based archive")

	// Calculate and sort the list of member names s.t. it is deterministically ordered.
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	r := &assetsArchiveReader{
		assets: m,
		keys:   keys,
	}
	return r, nil
}

// directoryArchiveReader is used to read an archive that is represented by a directory in the host filesystem.
type directoryArchiveReader struct {
	directoryPath string
	assetPaths    []string
}

func (r *directoryArchiveReader) Next() (string, *Blob, error) {
	// If there are no more members in this archive, return io.EOF.
	if len(r.assetPaths) == 0 {
		return "", nil, io.EOF
	}

	// Fetch the next path in the archive and slice it off of the list.
	assetPath := r.assetPaths[0]
	r.assetPaths = r.assetPaths[1:]

	// Crop the asset's path s.t. it is relative to the directory path.
	name, err := filepath.Rel(r.directoryPath, assetPath)
	if err != nil {
		return "", nil, err
	}
	name = filepath.Clean(name)

	// Replace Windows separators with Linux ones (ToSlash is a no-op on Linux).
	name = filepath.ToSlash(name)

	// Open and return the blob.
	blob, err := (&Asset{Path: assetPath}).Read()
	if err != nil {
		return "", nil, err
	}
	return name, blob, nil
}

func (r *directoryArchiveReader) Close() error {
	return nil
}

func (a *Archive) readPath() (ArchiveReader, error) {
	// To read a path-based archive, read that file and use its extension to ascertain what format to use.
	path, ispath := a.GetPath()
	contract.Assertf(ispath, "Expected a path-based asset")

	format := detectArchiveFormat(path)

	if format == NotArchive {
		// If not an archive, it could be a directory; if so, simply expand it out uncompressed as an archive.
		info, err := os.Stat(path)
		if err != nil {
			return nil, errors.Wrapf(err, "couldn't read archive path '%v'", path)
		} else if !info.IsDir() {
			return nil, errors.Errorf("'%v' is neither a recognized archive type nor a directory", path)
		}

		// Accumulate the list of asset paths. This list is ordered deterministically by filepath.Walk.
		assetPaths := []string{}
		if walkerr := filepath.Walk(path, func(filePath string, f os.FileInfo, fileerr error) error {
			// If there was an error, exit.
			if fileerr != nil {
				return fileerr
			}

			// If this is a .pulumi directory, we will skip this by default.
			// TODO[pulumi/pulumi#122]: when we support .pulumiignore, this will be customizable.
			if f.Name() == workspace.BookkeepingDir {
				if f.IsDir() {
					return filepath.SkipDir
				}
				return nil
			}

			// If this was a directory, skip it.
			if f.IsDir() {
				return nil
			}

			// If this is a symlink and it points at a directory, skip it. Otherwise continue along. This will mean
			// that the file will be added to the list of files to archive. When you go to read this archive, you'll
			// get a copy of the file (instead of a symlink) to some other file in the archive.
			if f.Mode()&os.ModeSymlink != 0 {
				fileInfo, statErr := os.Stat(filePath)
				if statErr != nil {
					return statErr
				}

				if fileInfo.IsDir() {
					return nil
				}
			}

			// Otherwise, add this asset to the list of paths and keep going.
			assetPaths = append(assetPaths, filePath)
			return nil
		}); walkerr != nil {
			return nil, walkerr
		}

		r := &directoryArchiveReader{
			directoryPath: path,
			assetPaths:    assetPaths,
		}
		return r, nil
	}

	// Otherwise, it's an archive file, and we will go ahead and open it up and read it.
	file, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	return readArchive(file, format)
}
|
|
|
|
|
2017-11-09 00:28:41 +01:00
|
|
|
func (a *Archive) readURI() (ArchiveReader, error) {
	// To read a URI-based archive, fetch the contents remotely and use the extension to pick the format to use.
	url, isurl, err := a.GetURIURL()
	if err != nil {
		return nil, err
	}
	contract.Assertf(isurl, "Expected a URI-based asset")

	format := detectArchiveFormat(url.Path)
	if format == NotArchive {
		// IDEA: support (a) hints and (b) custom providers that default to certain formats.
		return nil, errors.Errorf("file at URL '%v' is not a recognized archive format", url)
	}

	ar, err := a.openURLStream(url)
	if err != nil {
		return nil, err
	}
	return readArchive(ar, format)
}

func (a *Archive) openURLStream(url *url.URL) (io.ReadCloser, error) {
	switch s := url.Scheme; s {
	case "http", "https":
		resp, err := httputil.GetWithRetry(url.String(), http.DefaultClient)
		if err != nil {
			return nil, err
		}
		return resp.Body, nil
	case "file":
		contract.Assert(url.Host == "")
		contract.Assert(url.User == nil)
		contract.Assert(url.RawQuery == "")
		contract.Assert(url.Fragment == "")
		return os.Open(url.Path)
	default:
		return nil, errors.Errorf("Unrecognized or unsupported URI scheme: %v", s)
	}
}
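openURLStream dispatches purely on the URL scheme, so the same pattern can be sketched with only the standard library. `openStream` below is a hypothetical stand-in (no `httputil` retry helper, and only the `file://` arm is implemented so the sketch runs without network access):

```go
package main

import (
	"fmt"
	"io"
	"net/url"
	"os"
)

// openStream mirrors the scheme switch above: parse the URI, then dispatch on
// its scheme. Only file:// is implemented here; http/https would plug into the
// same switch in the same way.
func openStream(raw string) (io.ReadCloser, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	switch u.Scheme {
	case "file":
		return os.Open(u.Path)
	default:
		return nil, fmt.Errorf("unrecognized or unsupported URI scheme: %v", u.Scheme)
	}
}

func main() {
	// Write a temp file and read it back through a file:// URI.
	f, err := os.CreateTemp("", "asset-*.txt")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	f.WriteString("hello")
	f.Close()

	rc, err := openStream("file://" + f.Name())
	if err != nil {
		panic(err)
	}
	defer rc.Close()
	b, _ := io.ReadAll(rc)
	fmt.Println(string(b))
}
```

Returning an `io.ReadCloser` from every arm keeps the caller agnostic about whether the bytes come from disk or an HTTP response body.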

// Bytes fetches the archive contents as a byte slice. This is almost certainly the least efficient way to deal with
// the underlying streaming capabilities offered by assets and archives, but can be used in a pinch to interact with
// APIs that demand []byte.
func (a *Archive) Bytes(format ArchiveFormat) ([]byte, error) {
	var data bytes.Buffer
	if err := a.Archive(format, &data); err != nil {
		return nil, err
	}
	return data.Bytes(), nil
}

// Archive produces a single archive stream in the desired format. It prefers to return the archive with as little
// copying as is feasible; however, if the desired format is different from the source, it will need to translate.
func (a *Archive) Archive(format ArchiveFormat, w io.Writer) error {
|
Implement archives
Our initial implementation of assets was intentionally naive, because
they were limited to single-file assets. However, it turns out that for
real scenarios (like lambdas), we want to support multi-file assets.
In this change, we introduce the concept of an Archive. An archive is
what the term classically means: a collection of files, addressed as one.
For now, we support three kinds: tarfile archives (*.tar), gzip-compressed
tarfile archives (*.tgz, *.tar), and normal zipfile archives (*.zip).
There is a fair bit of library support for manipulating Archives as a
logical collection of Assets. I've gone to great length to avoid making
copies, however, sometimes it is unavoidable (for example, when sizes
are required in order to emit offsets). This is also complicated by the
fact that the AWS libraries often want seekable streams, if not actual
raw contiguous []byte slices.
2017-04-30 21:37:24 +02:00
|
|
|
// If the source format is the same, just return that.
|
|
|
|
if sf, ss, err := a.ReadSourceArchive(); sf != NotArchive && sf == format {
|
|
|
|
if err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
_, err := io.Copy(w, ss)
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
|
|
|
|
switch format {
|
|
|
|
case TarArchive:
|
|
|
|
return a.archiveTar(w)
|
|
|
|
case TarGZIPArchive:
|
|
|
|
return a.archiveTarGZIP(w)
|
|
|
|
case ZIPArchive:
|
|
|
|
return a.archiveZIP(w)
|
|
|
|
default:
|
|
|
|
contract.Failf("Illegal archive type: %v", format)
|
|
|
|
return nil
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2017-11-21 02:38:09 +01:00
|
|
|
// addNextFileToTar adds the next file in the given archive to the given tar file. Returns io.EOF if the archive
// contains no more files.
func addNextFileToTar(r ArchiveReader, tw *tar.Writer, seenFiles map[string]bool) error {
	file, data, err := r.Next()
	if err != nil {
		return err
	}
	defer contract.IgnoreClose(data)

	// It's possible to run into the same file multiple times in the list of archives we're passed.
	// For example, if there is an archive pointing to foo/bar and an archive pointing to
	// foo/bar/baz/quux. Because of this only include the file the first time we see it.
	if _, has := seenFiles[file]; has {
		return nil
	}
	seenFiles[file] = true

	sz := data.Size()
	if err = tw.WriteHeader(&tar.Header{
		Name: file,
		Mode: 0600,
		Size: sz,
	}); err != nil {
		return err
	}
	n, err := io.Copy(tw, data)
	if err == tar.ErrWriteTooLong {
		return errors.Wrap(err, fmt.Sprintf("incorrect blob size for %v: expected %v, got %v", file, sz, n))
	}
	return err
}
func (a *Archive) archiveTar(w io.Writer) error {
	// Open the archive.
	reader, err := a.Open()
	if err != nil {
		return err
	}
	defer contract.IgnoreClose(reader)

	// Now actually emit the contents, file by file.
	tw := tar.NewWriter(w)
	seenFiles := make(map[string]bool)
	for err == nil {
		err = addNextFileToTar(reader, tw, seenFiles)
	}
	if err != io.EOF {
		return err
	}

	return tw.Close()
}

func (a *Archive) archiveTarGZIP(w io.Writer) error {
	z := gzip.NewWriter(w)
	// Close the gzip writer on the way out so its footer is flushed to w;
	// without this, the resulting stream is truncated.
	defer contract.IgnoreClose(z)
	return a.archiveTar(z)
}

// addNextFileToZIP adds the next file in the given archive to the given ZIP file. Returns io.EOF if the archive
// contains no more files.
func addNextFileToZIP(r ArchiveReader, zw *zip.Writer, seenFiles map[string]bool) error {
	file, data, err := r.Next()
	if err != nil {
		return err
	}
	defer contract.IgnoreClose(data)

	// It's possible to run into the same file multiple times in the list of archives we're passed.
	// For example, if there is an archive pointing to foo/bar and an archive pointing to
	// foo/bar/baz/quux. Because of this only include the file the first time we see it.
	if _, has := seenFiles[file]; has {
		return nil
	}
	seenFiles[file] = true

	fh := &zip.FileHeader{
		// These are the two fields set by zw.Create().
		Name:   file,
		Method: zip.Deflate,
	}

	// Set a nonzero -- but constant -- modification time. Otherwise, some agents (e.g. Azure
	// websites) can't extract the resulting archive. The date is comfortably after 1980 because
	// the ZIP format includes a date representation that starts at 1980. Use `SetModTime` to
	// remain compatible with Go 1.9.
	// nolint: megacheck
	fh.SetModTime(time.Date(1990, time.January, 1, 0, 0, 0, 0, time.UTC))

	fw, err := zw.CreateHeader(fh)
	if err != nil {
		return err
	}
	_, err = io.Copy(fw, data)
	return err
}
func (a *Archive) archiveZIP(w io.Writer) error {
	// Open the archive.
	reader, err := a.Open()
	if err != nil {
		return err
	}
	defer contract.IgnoreClose(reader)

	// Now actually emit the contents, file by file.
	zw := zip.NewWriter(w)
	seenFiles := make(map[string]bool)
	for err == nil {
		err = addNextFileToZIP(reader, zw, seenFiles)
	}
	if err != io.EOF {
		return err
	}

	return zw.Close()
}

// ReadSourceArchive returns a stream to the underlying archive, if there is one.
func (a *Archive) ReadSourceArchive() (ArchiveFormat, io.ReadCloser, error) {
|
Implement archives
Our initial implementation of assets was intentionally naive, because
they were limited to single-file assets. However, it turns out that for
real scenarios (like lambdas), we want to support multi-file assets.
In this change, we introduce the concept of an Archive. An archive is
what the term classically means: a collection of files, addressed as one.
For now, we support three kinds: tarfile archives (*.tar), gzip-compressed
tarfile archives (*.tgz, *.tar), and normal zipfile archives (*.zip).
There is a fair bit of library support for manipulating Archives as a
logical collection of Assets. I've gone to great length to avoid making
copies, however, sometimes it is unavoidable (for example, when sizes
are required in order to emit offsets). This is also complicated by the
fact that the AWS libraries often want seekable streams, if not actual
raw contiguous []byte slices.
2017-04-30 21:37:24 +02:00
|
|
|
if path, ispath := a.GetPath(); ispath {
|
2017-10-22 22:39:21 +02:00
|
|
|
if format := detectArchiveFormat(path); format != NotArchive {
|
Implement archives
Our initial implementation of assets was intentionally naive, because
they were limited to single-file assets. However, it turns out that for
real scenarios (like lambdas), we want to support multi-file assets.
In this change, we introduce the concept of an Archive. An archive is
what the term classically means: a collection of files, addressed as one.
For now, we support three kinds: tarfile archives (*.tar), gzip-compressed
tarfile archives (*.tgz, *.tar), and normal zipfile archives (*.zip).
There is a fair bit of library support for manipulating Archives as a
logical collection of Assets. I've gone to great length to avoid making
copies, however, sometimes it is unavoidable (for example, when sizes
are required in order to emit offsets). This is also complicated by the
fact that the AWS libraries often want seekable streams, if not actual
raw contiguous []byte slices.
2017-04-30 21:37:24 +02:00
|
|
|
f, err := os.Open(path)
|
|
|
|
return format, f, err
|
|
|
|
}
|
2017-06-14 01:47:55 +02:00
|
|
|
} else if url, isurl, urlerr := a.GetURIURL(); urlerr == nil && isurl {
|
2017-10-22 22:39:21 +02:00
|
|
|
if format := detectArchiveFormat(url.Path); format != NotArchive {
|
Implement archives
Our initial implementation of assets was intentionally naive, because
they were limited to single-file assets. However, it turns out that for
real scenarios (like lambdas), we want to support multi-file assets.
In this change, we introduce the concept of an Archive. An archive is
what the term classically means: a collection of files, addressed as one.
For now, we support three kinds: tarfile archives (*.tar), gzip-compressed
tarfile archives (*.tgz, *.tar), and normal zipfile archives (*.zip).
There is a fair bit of library support for manipulating Archives as a
logical collection of Assets. I've gone to great length to avoid making
copies, however, sometimes it is unavoidable (for example, when sizes
are required in order to emit offsets). This is also complicated by the
fact that the AWS libraries often want seekable streams, if not actual
raw contiguous []byte slices.
2017-04-30 21:37:24 +02:00
|
|
|
s, err := a.openURLStream(url)
|
|
|
|
return format, s, err
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return NotArchive, nil, nil
|
|
|
|
}

// EnsureHash computes the SHA256 hash of the archive's contents and stores it on the object.
func (a *Archive) EnsureHash() error {
	if a.Hash == "" {
		hash := sha256.New()

		// Attempt to compute the hash in the most efficient way. First try to open the archive directly and copy it
		// to the hash. This avoids traversing any of the contents and just treats it as a byte stream.
		f, r, err := a.ReadSourceArchive()
		if err != nil {
			return err
		}
		if f != NotArchive && r != nil {
			defer contract.IgnoreClose(r)
			if _, err = io.Copy(hash, r); err != nil {
				return err
			}
		} else {
			// Otherwise, it's not an archive; we'll need to transform it into one. Pick tar since it avoids
			// any superfluous compression which doesn't actually help us in this situation.
			if err := a.Archive(TarArchive, hash); err != nil {
				return err
			}
		}

		// Finally, encode the resulting hash as a string and we're done.
		a.Hash = hex.EncodeToString(hash.Sum(nil))
	}
	return nil
}
// ArchiveFormat indicates what archive and/or compression format an archive uses.
type ArchiveFormat int

const (
	NotArchive     = iota // not an archive.
	TarArchive            // a POSIX tar archive.
	TarGZIPArchive        // a POSIX tar archive that has been subsequently compressed using GZip.
	ZIPArchive            // a multi-file ZIP archive.
)

// ArchiveExts maps from a file extension to its associated archive and/or compression format.
var ArchiveExts = map[string]ArchiveFormat{
	".tar":    TarArchive,
	".tgz":    TarGZIPArchive,
	".tar.gz": TarGZIPArchive,
	".zip":    ZIPArchive,
}

// detectArchiveFormat takes a path and infers its archive format based on the file extension.
func detectArchiveFormat(path string) ArchiveFormat {
	for ext, typ := range ArchiveExts {
		if strings.HasSuffix(path, ext) {
			return typ
		}
	}
	return NotArchive
}
// readArchive takes a stream to an existing archive and returns an ArchiveReader for enumerating the names and
// contents of the inner assets. The routine returns an error if something goes wrong and, no matter what, closes
// the stream before returning.
func readArchive(ar io.ReadCloser, format ArchiveFormat) (ArchiveReader, error) {
	switch format {
	case TarArchive:
		return readTarArchive(ar)
	case TarGZIPArchive:
		return readTarGZIPArchive(ar)
	case ZIPArchive:
		// Unfortunately, the ZIP archive reader requires ReaderAt functionality. If it's a file, we can recover this
		// with a simple stat. Otherwise, we will need to go ahead and make a copy in memory.
		var ra io.ReaderAt
		var sz int64
		if f, isf := ar.(*os.File); isf {
			stat, err := f.Stat()
			if err != nil {
				return nil, err
			}
			ra = f
			sz = stat.Size()
		} else if data, err := ioutil.ReadAll(ar); err != nil {
			return nil, err
		} else {
			ra = bytes.NewReader(data)
			sz = int64(len(data))
		}
		return readZIPArchive(ra, sz)
	default:
		contract.Failf("Illegal archive type: %v", format)
		return nil, nil
	}
}

// tarArchiveReader is used to read an archive that is stored in tar format.
type tarArchiveReader struct {
	ar io.ReadCloser
	tr *tar.Reader
}

func (r *tarArchiveReader) Next() (string, *Blob, error) {
	for {
		file, err := r.tr.Next()
		if err != nil {
			return "", nil, err
		}

		switch file.Typeflag {
		case tar.TypeDir:
			continue // skip directories
		case tar.TypeReg:
			// Return the tar reader for this file's contents.
			data := &Blob{
				rd: ioutil.NopCloser(r.tr),
				sz: file.Size,
			}
			name := filepath.Clean(file.Name)
			return name, data, nil
		default:
			contract.Failf("Unrecognized tar header typeflag: %v", file.Typeflag)
		}
	}
}

func (r *tarArchiveReader) Close() error {
	return r.ar.Close()
}

func readTarArchive(ar io.ReadCloser) (ArchiveReader, error) {
	r := &tarArchiveReader{
		ar: ar,
		tr: tar.NewReader(ar),
	}
	return r, nil
}

func readTarGZIPArchive(ar io.ReadCloser) (ArchiveReader, error) {
	// First decompress the GZIP stream.
	gz, err := gzip.NewReader(ar)
	if err != nil {
		return nil, err
	}

	// Now read the tarfile.
	return readTarArchive(gz)
}

// zipArchiveReader is used to read an archive that is stored in ZIP format.
type zipArchiveReader struct {
	ar    io.ReaderAt
	zr    *zip.Reader
	index int
}

func (r *zipArchiveReader) Next() (string, *Blob, error) {
	for r.index < len(r.zr.File) {
		file := r.zr.File[r.index]
		r.index++

		// Skip directories, since they aren't included in TAR and other archives above.
		if file.FileInfo().IsDir() {
			continue
		}

		// Open the next file and return its blob.
		body, err := file.Open()
		if err != nil {
			return "", nil, errors.Wrapf(err, "failed to read ZIP inner file %v", file.Name)
		}
		blob := &Blob{
			rd: body,
			sz: int64(file.UncompressedSize64),
		}
		name := filepath.Clean(file.Name)
		return name, blob, nil
	}
	return "", nil, io.EOF
}

func (r *zipArchiveReader) Close() error {
	if c, ok := r.ar.(io.Closer); ok {
		return c.Close()
	}
	return nil
}

func readZIPArchive(ar io.ReaderAt, size int64) (ArchiveReader, error) {
	zr, err := zip.NewReader(ar, size)
	if err != nil {
		return nil, errors.Wrap(err, "failed to read ZIP")
	}

	r := &zipArchiveReader{
		ar: ar,
		zr: zr,
	}
	return r, nil
}