Wicker Schemas
Every Wicker dataset has an associated schema, which is declared at dataset write-time.
Wicker schemas are Python objects that are serialized to storage as Avro-compatible JSON files.
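For orientation, the sketch below builds a generic Avro record schema as JSON using only the standard library. It only illustrates what "Avro-compatible JSON" means in general: the record name example_dataset, the field names, and the layout are assumptions, and the files Wicker actually writes may carry additional keys.

```python
import json

# A generic Avro record schema expressed as a plain Python dict.
# All names below are illustrative assumptions, not Wicker's actual output.
avro_schema = {
    "type": "record",
    "name": "example_dataset",
    "fields": [
        {"name": "foo", "type": "string"},
        {"name": "bar", "type": "int"},
        # In Avro, an optional field is a union that includes "null".
        {"name": "note", "type": ["null", "string"]},
    ],
}

# Serialize to JSON text and parse it back to confirm a lossless round trip.
serialized = json.dumps(avro_schema, indent=2, sort_keys=True)
restored = json.loads(serialized)
```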
Schemas are declared with the wicker.schema.DatasetSchema object:
from wicker.schema import DatasetSchema

my_schema = DatasetSchema(
    primary_keys=["foo", "bar"],
    fields=[...],
)
Every schema must declare its primary_keys. Each primary key must be the name of a string, float, int or bool field in the schema, and the primary keys are used to order the dataset.
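As a rough illustration of ordering by primary keys, the sketch below sorts plain dictionaries by the tuple of their primary-key values. It assumes a lexicographic order over the keys in the order they are declared; the text above only states that the keys order the dataset, so treat this as a mental model rather than Wicker's implementation. The helper primary_key_tuple and the row contents are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def primary_key_tuple(row: Dict[str, Any], primary_keys: List[str]) -> Tuple[Any, ...]:
    """Return the values of the primary-key fields, in declaration order."""
    return tuple(row[key] for key in primary_keys)


primary_keys = ["foo", "bar"]
rows = [
    {"foo": "b", "bar": 1, "payload": "third"},
    {"foo": "a", "bar": 2, "payload": "second"},
    {"foo": "a", "bar": 1, "payload": "first"},
]

# Assumed ordering: compare "foo" first, then break ties on "bar".
ordered = sorted(rows, key=lambda row: primary_key_tuple(row, primary_keys))
ordered_payloads = [row["payload"] for row in ordered]
```

Under this assumption, listing the keys in a different order (for example ["bar", "foo"]) would produce a different row order, so the order of primary_keys matters.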
Schema Fields
The schema fields that Wicker provides are listed below. Users can also define custom fields by implementing their own codecs and passing them to ObjectField.
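The idea behind a codec is a reversible conversion between a Python object and bytes. The sketch below shows that round trip with a hypothetical JsonDictCodec class. It deliberately does not subclass wicker.schema.codecs.Codec, and the method names encode and decode are assumptions made for illustration; consult the Wicker source for the abstract methods a real codec must implement.

```python
import json
from typing import Any, Dict


class JsonDictCodec:
    """Hypothetical stand-in for a codec: converts a dict to bytes and back.

    This class does NOT subclass wicker.schema.codecs.Codec; the method
    names are assumptions chosen for illustration only.
    """

    def encode(self, obj: Dict[str, Any]) -> bytes:
        # Reject objects this codec is not meant to handle.
        if not isinstance(obj, dict):
            raise TypeError(f"expected dict, got {type(obj).__name__}")
        return json.dumps(obj, sort_keys=True).encode("utf-8")

    def decode(self, data: bytes) -> Dict[str, Any]:
        return json.loads(data.decode("utf-8"))


codec = JsonDictCodec()
encoded = codec.encode({"label": "car", "score": 0.5})
decoded = codec.decode(encoded)
```

A real custom field would pair such a conversion with ObjectField through its codec argument.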
- class wicker.schema.schema.ArrayField(element_field: wicker.schema.schema.SchemaField, required: bool = True)
Bases: wicker.schema.schema.SchemaField
A field that contains an array of values that all have the same schema type
- class wicker.schema.schema.BoolField(name: str, description: str = '', required: bool = True, custom_field_tags: Dict[str, str] = {})
Bases: wicker.schema.schema.SchemaField
A field that stores a boolean
- class wicker.schema.schema.BytesField(name: str, description: str = '', required: bool = True, is_heavy_pointer: bool = True)
Bases: wicker.schema.schema.ObjectField
An ObjectField that uses a no-op Codec for encoding pure bytes
- class wicker.schema.schema.DoubleField(name: str, description: str = '', required: bool = True, custom_field_tags: Dict[str, str] = {})
Bases: wicker.schema.schema.SchemaField
A field that stores a 64-bit double
- class wicker.schema.schema.FloatField(name: str, description: str = '', required: bool = True, custom_field_tags: Dict[str, str] = {})
Bases: wicker.schema.schema.SchemaField
A field that stores a 32-bit float
- class wicker.schema.schema.IntField(name: str, description: str = '', required: bool = True, custom_field_tags: Dict[str, str] = {})
Bases: wicker.schema.schema.SchemaField
A field that stores a 32-bit int
- class wicker.schema.schema.LongField(name: str, description: str = '', required: bool = True, custom_field_tags: Dict[str, str] = {})
Bases: wicker.schema.schema.SchemaField
A field that stores a 64-bit long
- class wicker.schema.schema.NumpyField(name: str, shape: Optional[Tuple[int, ...]], dtype: str, description: str = '', required: bool = True, is_heavy_pointer: bool = True)
Bases: wicker.schema.schema.ObjectField
An ObjectField that uses a Codec for encoding Numpy arrays
- class wicker.schema.schema.ObjectField(name: str, codec: wicker.schema.codecs.Codec, description: str = '', required: bool = True, is_heavy_pointer: bool = True)
Bases: wicker.schema.schema.SchemaField
A field that contains objects of a specific type.
  - property is_heavy_pointer: bool
  Whether to store this field as a separate file
- class wicker.schema.schema.RecordField(name: str, fields: List[wicker.schema.schema.SchemaField], description: str = '', required: bool = True, custom_field_tags: Dict[str, str] = {}, top_level: bool = False)
Bases: wicker.schema.schema.SchemaField
A field that contains nested fields
- class wicker.schema.schema.StringField(name: str, description: str = '', required: bool = True, custom_field_tags: Dict[str, str] = {})
Bases: wicker.schema.schema.SchemaField
A field that stores a string
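Putting the pieces together, a schema that combines several of the fields listed above might look like the sketch below. It uses only the constructor signatures shown in this reference; the field names (scene_id, frame_index, image), the descriptions, and the array shape are hypothetical. The wicker import is deferred into the function so that the snippet can be loaded without the package installed; calling build_example_schema requires wicker.

```python
def build_example_schema():
    """Build a small schema from the field classes documented above.

    Requires the wicker package. All field names and values are
    illustrative assumptions.
    """
    from wicker.schema import DatasetSchema
    from wicker.schema.schema import IntField, NumpyField, StringField

    return DatasetSchema(
        # Both primary keys name string/int fields declared below.
        primary_keys=["scene_id", "frame_index"],
        fields=[
            StringField("scene_id", description="Identifier of the recording"),
            IntField("frame_index", description="Position of the frame in the scene"),
            # is_heavy_pointer defaults to True, so this field is stored
            # as a separate file.
            NumpyField("image", shape=(4, 4), dtype="float64"),
        ],
    )
```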