Amazon S3 Connector

Amazon S3, or Amazon Simple Storage Service, is a service offered by Amazon Web Services that provides object storage through a web service interface. Amazon S3 uses the same scalable storage infrastructure that Amazon.com uses to run its global e-commerce network.

More information on Amazon Web Services


Authentication Connection

Authentication connectors are used to authenticate repository/output connections that need certain authentication fields like access tokens or refresh tokens. Click here for more information on setting up authentication connections.

Configuration

  • Name: Unique name for this auth connector.

  • Client ID: The Access Key to connect to the client. For more information about AWS Access Keys, please visit this link.

  • Client Secret: The Secret key associated with the above Access Key.

  • S3 Region: The AWS Region where your instance is located. You can find it in the AWS console. The default is us-east-1.

  • End Point: If using Amazon Glacier, set your instance's URL here. It will override the region.

  • Connection Timeout: Set the connection timeout. Higher values may be needed when moving large files.
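
As an illustration of how the S3 Region and End Point fields interact, the following is a minimal sketch, not 3Sixty's actual implementation. The helper name resolve_endpoint is hypothetical, and the regional URL pattern is the standard one for commercial AWS regions.

```python
from typing import Optional

DEFAULT_REGION = "us-east-1"  # default stated for the S3 Region field


def resolve_endpoint(region: Optional[str] = None,
                     endpoint: Optional[str] = None) -> str:
    """Return the service URL a client would target (hypothetical helper)."""
    if endpoint and endpoint.strip():
        # An explicit End Point overrides whatever region is configured.
        return endpoint.strip()
    chosen = (region or "").strip() or DEFAULT_REGION
    # Assumption: standard regional S3 endpoint pattern.
    return f"https://s3.{chosen}.amazonaws.com"


print(resolve_endpoint())                    # https://s3.us-east-1.amazonaws.com
print(resolve_endpoint(region="eu-west-1"))  # https://s3.eu-west-1.amazonaws.com
print(resolve_endpoint(region="eu-west-1",
                       endpoint="https://glacier.us-west-2.amazonaws.com"))
```

In the last call the region is ignored because an End Point was supplied.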

Tip:  INSTALLED AWS CREDENTIALS
If you leave the Client ID and Client Secret empty, 3Sixty will attempt to authenticate with your installed AWS credentials.

Proxy Information

This tab is optional. Fill it in only if you are connecting through a proxy.

  • Proxy User: The proxy user to use. (Optional)

  • Proxy Password: The password for the proxy user. (leave blank if no proxy)

  • Proxy Protocol: The HTTP(S) Protocol to use to connect to the proxy.

  • Full Proxy Url: The Proxy Host (leave blank if no proxy).

  • Proxy Port: The port to connect to on the proxy. (Optional)

  • Proxy Domain: The Domain for the proxy.

  • Proxy Workstation: The workstation to use.


Integration Connection

Most Integration Connections can act in both repository (read) and output (write) modes. If a connection can't act in a mode, it will not appear as an option when creating or editing a job. This connection can only be used as a repository connection. Click here for more information on setting up an integration connection.

Configuration

  • Description: Description for this connection

  • Authentication Connection: Your Amazon Auth connector


Job Configuration

Specification Tab: S3 Folders (Repo)

  • List of S3 Keys: A comma-delimited list of S3 keys (folders) to crawl.

  • Bucket Name: The bucket where the keys are located

  • Metadata is stored in separate files with the suffix .metadata.properties.xml: If this box isn't checked, metadata will not be stored.

  • Retrieve File Tags: File tags will be added as metadata with prefix "tag."

  • Max Connections: The maximum number of connections to S3 allowed. Increasing this number will result in faster transfers.
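
The List of S3 Keys and Retrieve File Tags options can be pictured with a short sketch. This is only an illustration under stated assumptions: the helper names parse_key_list and tags_to_metadata are hypothetical, and the connector's own parsing rules (for example, how it treats whitespace) may differ.

```python
from typing import Dict, List

TAG_PREFIX = "tag."  # prefix named in the Retrieve File Tags option


def parse_key_list(raw: str) -> List[str]:
    """Split a comma-delimited list of S3 keys, dropping blank entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def tags_to_metadata(tags: Dict[str, str]) -> Dict[str, str]:
    """Expose each S3 object tag as a metadata field named 'tag.<name>'."""
    return {f"{TAG_PREFIX}{name}": value for name, value in tags.items()}


print(parse_key_list("invoices/2023/, invoices/2024/ ,"))
# ['invoices/2023/', 'invoices/2024/']
print(tags_to_metadata({"department": "finance"}))
# {'tag.department': 'finance'}
```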

Specification Tab: S3 Basic Configuration (Output)

Tip:  There are no actual folders in S3. All files in S3 have a "key", which includes their entire path. The folder path and bucket properties simply prepend these values to each file's key.

  • Output Folder Path: Output folder key. Will be prepended to all document parent paths to make keys.

  • Bucket Name: The bucket name that will be prepended to all keys.

  • Includes Unmapped Properties: Will apply all metadata on the document without mapping

  • Use GZip: Sets whether gzip decompression should be used when receiving HTTP responses.

  • Do not generate XML when Outputting to S3: Like the BFS Connector, the S3 Connector outputs metadata as separate files in the form of [filename].metadata.properties.xml. Check this box if you want it to output only the files.

  • Use Transfer Manager: If migrating larger files, the S3 APIs offer a transfer manager to ensure more stable uploads

  • Stage Binary to Filesystem: To avoid issues with disconnects from the source, this will temporarily store file content in the Tomcat temp folder before uploading it.

  • Date/DateTime Format: How to format the mapped fields of this type before upload.

Important:  
If migrating large files to S3 it is recommended that you check Use Transfer Manager AND Stage Binary to Filesystem. If you use the Transfer Manager without staging the file, all file uploads will be single threaded by the Transfer Manager.
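
To make the tip about keys concrete, here is a minimal sketch of how an object key could be assembled from the Output Folder Path, a document's parent path, and its file name. The helper build_object_key is hypothetical, and normalising the slashes this way is an assumption rather than documented connector behavior.

```python
def build_object_key(output_folder: str, parent_path: str, filename: str) -> str:
    """Join the pieces into one S3 key with single '/' separators."""
    segments = []
    for part in (output_folder, parent_path, filename):
        # Drop empty segments so leading, trailing, or doubled slashes
        # do not produce empty "folders" in the key.
        segments.extend(s for s in part.split("/") if s)
    return "/".join(segments)


print(build_object_key("archive/2024", "/contracts/signed/", "nda.pdf"))
# archive/2024/contracts/signed/nda.pdf
print(build_object_key("", "/", "readme.txt"))
# readme.txt
```

The result is a single flat key. The "folders" exist only as prefixes of that string.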

Specification Tab: S3 Advanced Configuration (Output)

  • Max Connections: The maximum number of connections the client can open. Adjusting this can cause changes in performance

  • Multi-value Separator: Some documents have fields that contain multiple values. S3 does not support this, so the values are joined into a single string with this separator before upload.

  • Use KMS to encrypt your objects: S3 Buckets use SSE-S3 by default to encrypt objects if you do not specify KMS for encryption.

    • KMS key ARN: If you check the box to use KMS to encrypt your objects you must provide the KMS key ARN.

  • Disable Chunked Encoding: Will remove the transfer-encoding:chunked header from all requests

  • Set Path Style Access: Refer to Amazon's page for more information on this option

  • Object Metadata Fields: A comma-delimited list of fields to add to the S3 Object as User Metadata.
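
The Multi-value Separator and Object Metadata Fields options can be sketched together as follows. This is an illustrative sketch only: flatten_value and build_user_metadata are hypothetical names, and the sample document properties are invented for the example.

```python
from typing import Any, Dict


def flatten_value(value: Any, separator: str) -> str:
    """Join a multi-value field into one string; pass single values through."""
    if isinstance(value, (list, tuple)):
        return separator.join(str(item) for item in value)
    return str(value)


def build_user_metadata(properties: Dict[str, Any],
                        metadata_fields: str,
                        separator: str) -> Dict[str, str]:
    """Pick the configured fields and flatten them for S3 user metadata."""
    wanted = [f.strip() for f in metadata_fields.split(",") if f.strip()]
    return {name: flatten_value(properties[name], separator)
            for name in wanted if name in properties}


doc = {"title": "NDA", "authors": ["Ana", "Raj"], "pages": 4}
print(build_user_metadata(doc, "title, authors", ";"))
# {'title': 'NDA', 'authors': 'Ana;Raj'}
```

Choose a separator that does not occur inside the values themselves, otherwise the original list cannot be recovered by splitting the string.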


Content Service Connection

This section covers the S3 specific configuration of the Content Service Connector. For a description of how to set up a content services connector generically see Content Service Connectors.

Configuration

Tip:  S3 DOCUMENT IDS
S3 file ids always take the form of /bucket/(key).

  • Bucket Name: The target bucket for creating a file.

  • Output Folder Path: The key of the folder to target when creating a file.

  • ACL Name: Canned ACL to add to all new content uploaded via this connection.

  • Content Disposition: Default disposition of any content added via this connection. It will be added to the object's metadata.
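
The /bucket/(key) document ID form described in the tip above can be split and rebuilt with a few lines. This is a minimal sketch; split_document_id and make_document_id are hypothetical helpers, not part of the connector's API.

```python
from typing import Tuple


def split_document_id(document_id: str) -> Tuple[str, str]:
    """Split '/bucket/key...' into (bucket, key)."""
    trimmed = document_id.lstrip("/")
    # The first path segment is the bucket; everything after it is the key.
    bucket, _, key = trimmed.partition("/")
    if not bucket:
        raise ValueError(f"no bucket in document id: {document_id!r}")
    return bucket, key


def make_document_id(bucket: str, key: str) -> str:
    """Build the '/bucket/key' form from its two parts."""
    return f"/{bucket}/{key.lstrip('/')}"


print(split_document_id("/my-bucket/reports/2024/q1.pdf"))
# ('my-bucket', 'reports/2024/q1.pdf')
print(make_document_id("my-bucket", "reports/2024/q1.pdf"))
# /my-bucket/reports/2024/q1.pdf
```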

Supported Methods

  • createFile - accepts a full /bucket/key as the folderId parameter in place of the bucket and folder configuration

  • deleteACL

  • deleteObjectByID

  • getACLs

  • getFileContent

  • getObjectProperties

  • listFolderItems

  • lockDocument

  • setACLs

  • unlockDocument

  • updateFile

  • updateProperties

Tip:  S3 ACCESS CONTROL
See this page for information on grantees and permissions.

