Reimplement Arrow-based intermediate records #45

@syucream

Let's retry implementing an Arrow record typed intermediate representation, once more! I think we can switch to it gradually with the steps below:

  • Prototype a PoC: (various inputs) -> maps -> arrow -> maps -> json -> parquet

    • Implement Arrow -> JSON conversion in Go
    • Integrate it in the simplest way possible
  • Remove the Go intermediates on the Parquet writing side: (various inputs) -> maps -> arrow -> json -> parquet

  • Remove the Go intermediates on the input side: (various inputs) -> arrow -> json -> parquet

    • This requires an input -> arrow formatter for each input type
  • Ideal: (various inputs) -> arrow -> parquet

    • This is the complicated step because of arrow -> parquet (part of it depends on parquet-go)
    • It will require some improvements to the Arrow Go implementation
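
To make the PoC step more concrete, here is a rough sketch of the maps -> columnar record -> JSON part, using only the Go standard library. `Column`, `Record`, `MapsToRecord` and `RecordToJSONLines` are hypothetical names, and the column type is only a stand-in for an Arrow record, not the Arrow Go API. A real version would presumably read typed arrays from an Arrow record instead of `[]interface{}`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Column is a hypothetical stand-in for one Arrow array: a field name plus
// its values stored contiguously. A nil entry marks a null slot.
type Column struct {
	Name   string
	Values []interface{}
}

// Record is a hypothetical stand-in for an Arrow record batch.
type Record struct {
	Columns []Column
	NumRows int
}

// MapsToRecord pivots row-oriented maps into a columnar Record.
// fields fixes the column order; a key missing from a row becomes null.
func MapsToRecord(fields []string, rows []map[string]interface{}) Record {
	cols := make([]Column, len(fields))
	for i, name := range fields {
		vals := make([]interface{}, len(rows))
		for r, row := range rows {
			vals[r] = row[name] // nil when the key is absent
		}
		cols[i] = Column{Name: name, Values: vals}
	}
	return Record{Columns: cols, NumRows: len(rows)}
}

// RecordToJSONLines walks the record row by row and emits one JSON object
// per row, which is the shape a JSON-based Parquet writer could consume.
func RecordToJSONLines(rec Record) ([]string, error) {
	lines := make([]string, 0, rec.NumRows)
	for r := 0; r < rec.NumRows; r++ {
		row := make(map[string]interface{}, len(rec.Columns))
		for _, col := range rec.Columns {
			row[col.Name] = col.Values[r]
		}
		b, err := json.Marshal(row)
		if err != nil {
			return nil, err
		}
		lines = append(lines, string(b))
	}
	return lines, nil
}

func main() {
	rows := []map[string]interface{}{
		{"id": 1, "name": "a"},
		{"id": 2}, // "name" is missing and should come out as null
	}
	rec := MapsToRecord([]string{"id", "name"}, rows)
	lines, err := RecordToJSONLines(rec)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

The point of the sketch is that once the columnar -> JSON direction exists, the later steps only swap out what sits in front of it (maps first, then per-input formatters), so the switch can stay gradual.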

Labels: enhancement
