Changelog

Unreleased

0.10.0 - 2026-04-02

Added

  • Add server_info() and domains() methods to Client and top-level API for querying server metadata and available domains
  • Add ChainedFlightReader for sequential reading across time-ordered endpoint slices, replacing MultiFlightReader for fetch requests
  • Add time parameter to find(), count(), and describe() for time-aware metadata queries against historical backends
  • CLI: add proxy command for wrapping subprocesses in an SSH proxy
  • CLI: add format options for publish --list

Fixed

  • BlockMuxStream: reject blocks with mismatched strides instead of silently muxing incorrect data
  • TimedQueue: allow partial-stride edge blocks for non-aligned requests
  • Channel: coerce all numeric-like dtype arguments to strings, not just numpy.dtype instances
  • Support historical streams without max_latency; it is no longer required for channels when streaming
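
The dtype coercion fix above can be illustrated with a small sketch. It assumes numpy does the normalization; the helper name dtype_to_str is hypothetical and is not part of the arrakis API.

```python
import numpy as np


def dtype_to_str(dtype):
    """Return a canonical string name for any numpy-compatible dtype spec.

    Hypothetical helper for illustration; not the library's implementation.
    """
    # numpy would silently turn None into float64, so reject it explicitly.
    if dtype is None:
        raise TypeError("None is not a valid data type")
    # np.dtype accepts dtype instances, scalar types, Python types, and strings.
    return np.dtype(dtype).name


print(dtype_to_str(np.float32))
print(dtype_to_str("int16"))
print(dtype_to_str(float))
```

Routing every input through np.dtype means a numpy.dtype instance, a scalar type such as np.float32, and a plain string all produce the same JSON-serializable name.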

Changed

  • Publisher: add validation checks for monotonically increasing timestamps and consistent block stride; improve error messages for metadata mismatches
  • Bump minimum arrakis-schema version to 0.3
  • CLI: rename log level environment variable from LOG_LEVEL to ARRAKIS_LOG_LEVEL, default to INFO
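
As a rough illustration of the Publisher checks listed above, the sketch below validates a sequence of blocks for monotonically increasing timestamps and a consistent stride. The function name and the (start_ns, duration_ns) tuple representation are assumptions made for this example; the actual Publisher validation may be structured differently.

```python
def validate_blocks(blocks):
    """Check blocks given as (start_ns, duration_ns) tuples in publish order.

    Hypothetical helper: raises ValueError on the first violation.
    """
    expected_stride = None
    previous_start = None
    for start_ns, duration_ns in blocks:
        # The first block fixes the stride every later block must match.
        if expected_stride is None:
            expected_stride = duration_ns
        elif duration_ns != expected_stride:
            raise ValueError(
                f"inconsistent stride: expected {expected_stride} ns, "
                f"got {duration_ns} ns at t={start_ns}"
            )
        # Timestamps must strictly increase from one block to the next.
        if previous_start is not None and start_ns <= previous_start:
            raise ValueError(
                f"timestamps must increase: {start_ns} follows {previous_start}"
            )
        previous_start = start_ns


# Three contiguous blocks with a 62.5 ms stride pass without raising.
validate_blocks([(0, 62_500_000), (62_500_000, 62_500_000), (125_000_000, 62_500_000)])
```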

0.9.0 - 2026-03-26

Added

  • CLI: add table output format using rich, with a live-updating table for arrakis stream output
  • CLI: add ability to read channels from stdin
  • CLI: add --format option to count command

Fixed

  • TimedQueue: fix drain_until not advancing last_time, causing queue misalignment
  • TimedQueue: clear stale timeout gaps on first real data
  • TimedQueue: reset last_time when queue is drained instead of preserving old value
  • Guard pull() with ready() when reader is done in BlockMuxStream.stream()
  • Skip timeout gap-filling when there is only a single queue
  • Update timeouts before aligning queues in muxer

Changed

  • CLI: make table the default output format for all commands
  • TimedQueue: make update_timeout() public for use in muxer

0.8.0 - 2026-03-11

Added

  • CLI: add --duration option to fetch as an alternative to --end
  • CLI: add --ssh-proxy option to tunnel connections through an SSH host

Fixed

  • Fix issue with duplicate gap channels in blocks in the muxer
  • Fix muxer logic in BlockMuxStream.stream() to yield blocks as they become ready during reads, improving latency for streaming data

Changed

  • Remove old Muxer and MuxedData classes, replaced by BlockMuxStream

0.7.0 - 2026-01-26

Added

  • Add functionality to Channel class:
    • Add as_dict method to Channel
    • Add stride and max_latency attributes
    • Add fields() method to return list of attribute fields
  • Add SeriesBlock.full_gap to create a full-gap block
  • Add environment variable for specifying ON DROP behavior in muxer
  • Add start_ns and end_ns properties for SeriesBlock
  • Add KafkaReader for reading data directly through Kafka
  • Introduce new StreamReader protocol for asynchronously reading streams. This can seamlessly read data from Arrow Flight RPC or Kafka through the stream endpoint

Fixed

  • Fix nanosecond conversion in time_as_ns
  • SeriesBlock: fix mask when creating a pyarrow array from a numpy masked array
  • Fix block concatenation for masked arrays
  • Fix publish command in CLI to properly use the new publisher interface

Changed

  • Allow channels to have names with only subsystems
  • Forbid None as a data type. It was previously interpreted as float64, because numpy treats None as float64
  • Move Flight endpoint schemas from the server to the client
  • Update publisher interface:
    • Register step is now used solely to resolve channels to be published
    • Enter method now handles retrieving all Kafka producer information
    • Remove PublisherInfo class as the publisher now has all the partition info after the registration step
  • Improve Channel repr: better formatting for int parameters, include more information
  • Specify default output format of 'str' for CLI
  • Improve logging of get_flight_info
  • Allow gaps to be filled in data streams after reaching the timeout to avoid hanging clients through the new muxer interface, BlockMuxStream

0.6.1 - 2025-11-06

Fixed

  • Fix issues with publication with partition index scheme
  • Extract partition ID as well for relevant publishing info
  • Fix validation check in partition index for registration
  • Extract partition index from metadata in Client, fixes issue where find/describe had missing partition index
  • Raise correct RuntimeError instead of unclear AttributeError when publishing without context manager

0.6.0 - 2025-10-28

Added

  • Add partition_index attribute to Channel class

Changed

  • Change allowable channel name structure: <domain>:<subsystem>[-_]<rest>
    • This allows VIRGO-like channels to be parsed correctly
    • Also expose subsystem property to Channel class
  • Update publisher to push channel partition index values instead of names, reducing packet sizes
  • Track channel name to ID values during registration and partitioning
  • SeriesBlock.from_row_batch now takes a map from partition index to channel instead
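
To make the <domain>:<subsystem>[-_]<rest> structure above concrete, here is a minimal parsing sketch. The regular expression, the function name, and the example channel names are illustrative assumptions; the pattern arrakis actually uses may differ.

```python
import re

# Hypothetical pattern for <domain>:<subsystem>[-_]<rest>; not the library's regex.
CHANNEL_NAME = re.compile(
    r"^(?P<domain>[^:]+):(?P<subsystem>[A-Za-z0-9]+)[-_](?P<rest>.+)$"
)


def parse_channel_name(name):
    """Split a channel name into its domain, subsystem, and rest components."""
    match = CHANNEL_NAME.match(name)
    if match is None:
        raise ValueError(f"invalid channel name: {name!r}")
    return match.groupdict()


# A hyphen separator and an underscore separator both parse.
print(parse_channel_name("H1:CAL-DELTAL_EXTERNAL"))
print(parse_channel_name("V1:Hrec_hoft"))
```

Accepting either "-" or "_" after the subsystem is what lets underscore-separated names parse alongside hyphen-separated ones.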

0.5.0 - 2025-09-26

Added

  • Allow a pre-defined schema to be passed into SeriesBlock.to_column_batch

Fixed

  • Address muxer edge cases causing stale data to not be returned
  • Fix edge case in muxer when we get complete data for a newer timestamp after incomplete data from an older timestamp

Changed

  • Improve performance of SeriesBlock generation from record batches:
    • Make fast path quicker for non-null Arrow arrays in converting to numpy arrays
    • Avoid unnecessary Arrow array type inference
    • Extract single time from batch instead of converting to numpy array first
    • Switch to more efficient bit manipulation to calculate Arrow array mask for numpy conversion
  • Improve performance of conversion from numpy masked arrays to Arrow arrays for nested types
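
The changelog does not show the bit manipulation used for the Arrow array mask. As one possible approach, the sketch below converts an Arrow-style validity bitmap (least-significant bit first, 1 meaning valid) into a numpy mask (True meaning masked) using only numpy. The function name and signature are hypothetical.

```python
import numpy as np


def validity_to_mask(bitmap, length, offset=0):
    """Convert an Arrow-style validity bitmap into a numpy boolean mask.

    Hypothetical helper: bitmap is a bytes-like object, length is the number
    of elements, and offset is the index of the first element in the bitmap.
    """
    # Arrow packs validity bits least-significant bit first within each byte.
    bits = np.unpackbits(np.frombuffer(bitmap, dtype=np.uint8), bitorder="little")
    valid = bits[offset:offset + length].astype(bool)
    # numpy masked arrays use True for masked (invalid) entries, so invert.
    return ~valid


# 0b00000101: elements 0 and 2 are valid, element 1 is null.
mask = validity_to_mask(b"\x05", 3)
print(mask)
```

Unpacking the whole bitmap in one vectorized call avoids a per-element Python loop, which is the kind of saving this entry is likely about.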

0.4.1 - 2025-07-10

Fixed

  • Fix edge case in muxer where multiple blocks with the same time could be returned

0.4.0 - 2025-06-25

Added

  • Add support for gaps in data, represented as masked arrays in SeriesBlock
  • Add property in Series that reports gaps in data
  • Allow printing channel as JSON from CLI in arrakis describe/find
  • Add --latency option to print buffer latency to stderr
  • Add 'expected latency' metadata in Channel
  • Allow creation of gaps within SeriesBlock to support server-side gap handling
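
For readers unfamiliar with masked arrays, the sketch below shows how gaps can be represented and located with numpy alone. It does not use SeriesBlock or Series; the helper name gap_indices is an assumption for illustration.

```python
import numpy as np


def gap_indices(samples):
    """Return the indices of masked (gap) samples in an array."""
    # getmaskarray always returns a full boolean array, even when nothing is masked.
    return np.flatnonzero(np.ma.getmaskarray(samples))


# Samples 2 and 3 are missing; their placeholder values are ignored by numpy.ma.
block = np.ma.masked_array(
    [1.0, 2.0, 0.0, 0.0, 5.0],
    mask=[False, False, True, True, False],
)
print(gap_indices(block))
```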

Fixed

  • Fix match parsing on drop in muxer
  • Fix issue where the muxer did not drop items when on_drop was set to 'warn'

0.3.0 - 2025-04-16

Added

  • Add option to specify URL in arrakis CLI
  • Add dtype alias to data_type in Channel
  • Add publish sub-command in Arrakis CLI to generate arbitrary streams to publish to the specified channels
  • Add schema validation to request descriptors for client and server-side validation

Fixed

  • Add min_rate/max_rate arguments if not specified in client, addressing a failure if specified as None
  • Fix excessive CPU usage when polling MultiEndpointStream
  • Coerce data types to strings within Client so they are JSON-serializable
  • Fix describe command in arrakis CLI to properly extract channel info for display

Changed

  • Use GPSTimeParseAction for time arguments/options in arrakis CLI, allowing arbitrary date/time strings
  • Redefine __eq__ for Channel, relaxing strict equality for optional fields
  • Update publication interface:
    • Take the publisher_id at initialization, not during register
    • The register step now retrieves the channel list and updates the partition info
    • Context manager now handles retrieving Kafka info from the server to allow publication
    • Publish method checks consistency of channels being published
  • Check channels when initializing publisher

0.2.0 - 2025-03-11

Added

  • Add publisher metadata to Channel
  • Allow multiple data types in find/count requests
  • Allow querying by publisher in find/count requests
  • Add from_json constructor in Channel
  • Add arrakis entry point

Fixed

  • Fix issue in parsing response in Publisher registration
  • Improve error handling and mitigate timeouts in MultiEndpointStream polling
  • Remove initial describe call within stream endpoint

Changed

  • Allow Channel to handle raw numpy dtypes
  • Expose domain property for Channel
  • Publisher now only requires a publisher_id for registration

0.1.0 - 2024-11-13

  • Initial release.