datasets
Creates, updates, deletes or gets a dataset resource or lists datasets in a region.
Overview
Name | datasets |
Type | Resource |
Description | Resource schema for AWS::DataBrew::Dataset. |
Id | aws.databrew.datasets |
Fields
Name | Datatype | Description |
---|---|---|
name | string | Dataset name |
format | string | Dataset format |
format_options | object | Format options for dataset |
input | object | Input |
source | string | Source type of the dataset |
path_options | object | PathOptions |
tags | array | |
region | string | AWS region. |
For more information, see AWS::DataBrew::Dataset.
Methods
Name | Accessible by | Required Params |
---|---|---|
create_resource | INSERT | Name, Input, region |
delete_resource | DELETE | data__Identifier, region |
update_resource | UPDATE | data__Identifier, data__PatchDocument, region |
list_resources | SELECT | region |
get_resource | SELECT | data__Identifier, region |
SELECT examples
Gets all datasets in a region.
SELECT
region,
name,
format,
format_options,
input,
source,
path_options,
tags
FROM aws.databrew.datasets
WHERE region = 'us-east-1';
Gets all properties from an individual dataset.
SELECT
region,
name,
format,
format_options,
input,
source,
path_options,
tags
FROM aws.databrew.datasets
WHERE region = 'us-east-1' AND data__Identifier = '<Name>';
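The input column is returned as an object, so individual values can be projected out of it. The query below is a minimal sketch, assuming the dataset reads from S3 (so input contains an S3InputDefinition with Bucket and Key, as laid out in the manifest further down) and assuming the json_extract function is available in your StackQL session. The s3_bucket and s3_key aliases are illustrative names only.

```sql
-- Sketch: project the S3 bucket and key out of the input object
-- for a single dataset. Assumes an S3-backed dataset; for other
-- input types these expressions would return NULL.
SELECT
  name,
  json_extract(input, '$.S3InputDefinition.Bucket') AS s3_bucket,
  json_extract(input, '$.S3InputDefinition.Key') AS s3_key
FROM aws.databrew.datasets
WHERE region = 'us-east-1' AND data__Identifier = '<Name>';
```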
INSERT example
Use the following StackQL query and manifest file to create a new dataset resource, using stack-deploy.
The first query sets only the required properties, the second sets all properties, and the YAML that follows is the stack-deploy manifest.
/*+ create */
INSERT INTO aws.databrew.datasets (
Name,
Input,
region
)
SELECT
'{{ Name }}',
'{{ Input }}',
'{{ region }}';
/*+ create */
INSERT INTO aws.databrew.datasets (
Name,
Format,
FormatOptions,
Input,
Source,
PathOptions,
Tags,
region
)
SELECT
'{{ Name }}',
'{{ Format }}',
'{{ FormatOptions }}',
'{{ Input }}',
'{{ Source }}',
'{{ PathOptions }}',
'{{ Tags }}',
'{{ region }}';
version: 1
name: stack name
description: stack description
providers:
  - aws
globals:
  - name: region
    value: '{{ vars.AWS_REGION }}'
resources:
  - name: dataset
    props:
      - name: Name
        value: '{{ Name }}'
      - name: Format
        value: '{{ Format }}'
      - name: FormatOptions
        value:
          Json:
            MultiLine: '{{ MultiLine }}'
          Excel:
            SheetNames:
              - '{{ SheetNames[0] }}'
            SheetIndexes:
              - '{{ SheetIndexes[0] }}'
            HeaderRow: '{{ HeaderRow }}'
          Csv:
            Delimiter: '{{ Delimiter }}'
            HeaderRow: '{{ HeaderRow }}'
      - name: Input
        value:
          S3InputDefinition:
            Bucket: '{{ Bucket }}'
            Key: '{{ Key }}'
          DataCatalogInputDefinition:
            CatalogId: '{{ CatalogId }}'
            DatabaseName: '{{ DatabaseName }}'
            TableName: '{{ TableName }}'
            TempDirectory: null
          DatabaseInputDefinition:
            GlueConnectionName: '{{ GlueConnectionName }}'
            DatabaseTableName: '{{ DatabaseTableName }}'
            TempDirectory: null
            QueryString: '{{ QueryString }}'
          Metadata:
            SourceArn: '{{ SourceArn }}'
      - name: Source
        value: '{{ Source }}'
      - name: PathOptions
        value:
          FilesLimit:
            MaxFiles: '{{ MaxFiles }}'
            OrderedBy: '{{ OrderedBy }}'
            Order: '{{ Order }}'
          LastModifiedDateCondition:
            Expression: '{{ Expression }}'
            ValuesMap:
              - ValueReference: '{{ ValueReference }}'
                Value: '{{ Value }}'
          Parameters:
            - PathParameterName: '{{ PathParameterName }}'
              DatasetParameter:
                Name: null
                Type: '{{ Type }}'
                DatetimeOptions:
                  Format: '{{ Format }}'
                  TimezoneOffset: '{{ TimezoneOffset }}'
                  LocaleCode: '{{ LocaleCode }}'
                CreateColumn: '{{ CreateColumn }}'
                Filter: null
      - name: Tags
        value:
          - Key: '{{ Key }}'
            Value: '{{ Value }}'
DELETE example
/*+ delete */
DELETE FROM aws.databrew.datasets
WHERE data__Identifier = '<Name>'
AND region = 'us-east-1';
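The update_resource method in the Methods table takes data__Identifier, data__PatchDocument and region. The query below is a minimal sketch of how an update might look, not a confirmed form. It assumes the patch document is a JSON Patch (RFC 6902) array, which is the format the AWS Cloud Control API expects, and it assumes StackQL accepts the document through a string(...) wrapper in the SET clause. The /Format path and the JSON value are illustrative choices.

```sql
/*+ update */
-- Sketch: switch an existing dataset's Format to JSON.
-- Assumption: data__PatchDocument is a JSON Patch array passed as a string.
-- Assumption: the string(...) wrapper is accepted by your StackQL version.
UPDATE aws.databrew.datasets
SET data__PatchDocument = string('[{"op": "replace", "path": "/Format", "value": "JSON"}]')
WHERE region = 'us-east-1'
AND data__Identifier = '<Name>';
```

Properties that CloudFormation treats as create-only, such as the dataset name, cannot be changed by a patch document and would require deleting and recreating the resource.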
Permissions
To operate on the datasets resource, the following permissions are required:
Create
databrew:CreateDataset,
databrew:TagResource,
databrew:UntagResource,
glue:GetConnection,
glue:GetTable,
iam:PassRole
Read
databrew:DescribeDataset,
databrew:ListTagsForResource,
iam:ListRoles
Update
databrew:UpdateDataset,
glue:GetConnection,
glue:GetTable
Delete
databrew:DeleteDataset
List
databrew:ListDatasets,
databrew:ListTagsForResource,
iam:ListRoles