Issues tracker

Submit issue and bug reports here. For feature requests and enhancements, use our Roadmap instead. For other issues, or if you prefer to reach us privately, contact support.

Trending
  1. Incorrect display factors for GLBX.MDP3 on 2012-02-05 to 2012-02-11

    The GLBX.MDP3 data from 2012-02-05 to 2012-02-11 displays prices using the wrong display factors. As a result, prices are off by a factor that varies by instrument group.

    Renan Gemignani
    #data-quality#bug

    0

  2. Latency tails and performance issues

    There are a few latency hotspots that we're working on. In general, the median latency is acceptable but we'd like to confine the tails.

    BBO-1x and OHLCV-1x

    There have been reports of latency spikes around our subsampled schemas, sometimes stretching to a few seconds. BBO-1x is the greater concern: it serves more latency-sensitive use cases than OHLCV-1x and cannot be replicated from trades. These spikes may be due to the memory bandwidth burst when publishing 600k to 1.2M instruments on the dot at the same time.

    Specific times of day

    One of our users reported that ts_out - ts_recv spikes to 1-3 seconds on certain days, e.g. attached screenshot for Sep 18, during the Fed rate cut announcement.

    Mass cancels

    After stripping out a live gateway, we find that we're keeping to 5.1~6.5 us median, but 300-800 us 99.9th percentile. This appears to be due to latency spikes around mass cancels.

    dataset=GlbxMdp3 quantiles=[(0.0, 1565), (0.1, 2787), (0.25, 3597), (0.5, 5739), (0.9, 87679), (0.95, 214527), (0.99, 428031), (0.995..., (0.999, 658943)]
    dataset=GlbxMdp3 quantiles=[(0.0, 1615), (0.1, 2447), (0.25, 3187), (0.5, 5143), (0.9, 30079), (0.95, 68479), (0.99, 173567), (0.995, ..., (0.999, 312831)]
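
    The quantile values above appear to be in nanoseconds (the 0.5 entries, 5739 and 5143, line up with the quoted 5.1~6.5 us median). As a rough sketch of how a similar tail summary could be computed from per-record latency deltas such as ts_out - ts_recv, assuming both timestamps are integer nanoseconds: the helper name `latency_quantiles`, the nearest-rank method, and the sample records are illustrative assumptions, not how the figures above were produced.

```python
import math

def latency_quantiles(samples_ns, probs=(0.5, 0.9, 0.99, 0.999)):
    """Nearest-rank quantiles of latency samples given in nanoseconds."""
    if not samples_ns:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ns)
    n = len(ordered)
    result = []
    for p in probs:
        # Nearest-rank: the smallest sample with at least p*n samples at or below it.
        rank = max(1, math.ceil(p * n))
        result.append((p, ordered[rank - 1]))
    return result

# Hypothetical (ts_recv, ts_out) pairs in nanoseconds; not real data.
records = [(1_000, 6_200), (2_000, 7_100), (3_000, 8_900), (4_000, 304_000)]
deltas = [ts_out - ts_recv for ts_recv, ts_out in records]
summary = latency_quantiles(deltas)
```

    With only a handful of samples the upper quantiles all collapse onto the single slowest record, which is why tail figures like the 99.9th percentile need a large sample to be meaningful.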

    Tessa Hollinger
    #performance

    0

  3. GLBX.MDP3 Status schema unavailable for extended MDP2 history

    The Status schema is unavailable for extended history in the GLBX.MDP3 dataset. Data prior to 2017-05-21 will be missing. This is due to protocol differences between MDP3 and MDP2. This data is being regenerated, and will be made available once the regeneration and validation are completed. This is tracked internally as D-3931.

    Nicholas James Macholl
    #data-quality#bug

    0

  4. Incorrect strike prices for some options on CME MDP2 data

    Strike prices for multiple options in GLBX.MDP3 data prior to 2017-05-21 are incorrect. Only definitions data is affected.

    Renan Gemignani
    #data-quality#bug

    0

  5. Extend Nasdaq coverage to include last hour of extended trading, when it crosses into next UTC day

    During winter months, when Eastern Time is UTC-5, the last hour of the Nasdaq extended trading session (7 to 8pm ET) extends into the following UTC day. This is assigned D-2127 internally.
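
    To illustrate the day boundary, here is a minimal sketch that converts the 7 to 8pm ET window to UTC for one standard-time date and one daylight-time date. The function name `session_tail_utc` and the two dates are arbitrary examples, and the code assumes the IANA time zone database is available to Python's `zoneinfo`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

def session_tail_utc(year, month, day):
    """Return the 7pm-8pm ET window of the given ET date as UTC datetimes."""
    start = datetime(year, month, day, 19, 0, tzinfo=ET)
    end = datetime(year, month, day, 20, 0, tzinfo=ET)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)

# Example dates only: one in standard time (EST, UTC-5), one in daylight time (EDT, UTC-4).
winter_start, winter_end = session_tail_utc(2024, 1, 16)
summer_start, summer_end = session_tail_utc(2024, 7, 16)

print(winter_start.isoformat(), winter_end.isoformat())
print(summer_start.isoformat(), summer_end.isoformat())
```

    In standard time the whole hour lands on the next UTC date, while in daylight time it ends exactly at UTC midnight.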

    Jack C

    1

  6. Missing `ts_ref` in CME MDP2 after 2015-11-20

    Statistics after 2015-11-20 appear to always be missing ts_ref. This is tracked internally as D-3941.

    Carter Green
    #bug

    0

  7. Instrument definition issues on GLBX.MDP3 from 2012-02-05 to 2012-02-10

    Instrument definitions have misaligned or incorrect values for many instruments (about 1%) from 2012-02-05 to 2012-02-10. This is an artifact of incomplete data from our backfill source (CME DataMine): we had to infer and reconstruct the fields as best we could from the following week's data.

    Tessa Hollinger
    #data-quality

    0

  8. GLBX.MDP3 empty data for 2010-09-13, 2010-11-17, 2012-07-19, and 2017-03-13

    Legacy MDP 2 data is empty on 2010-09-13, 2010-11-17, 2012-07-19, and 2017-03-13. These were regular trading days and should have data. This is tracked internally as D-3994.

    Carter Green
    #data-quality#data-gap

    0

  9. Missing SettlPriceType normalization for MDP2 CME data

    CME data prior to May 2017 is missing normalization of SettlPriceType to the stat_flags field. This is tracked internally as D-3940.

    Carter Green
    #bug

    0

  10. Saturday test data shows up on CME live data

    Currently our historical data discards Saturday test data on CME. However, our live data gateways expose the test data because it's useful for Databento's own internal monitoring to ensure that our live gateways are running properly. We plan to discard this test data in live as well to ensure point-in-time behavior. We'll likely have to hide test data from customers. This is assigned D-2036 internally and will likely only be addressed in Q3.

    Tessa Hollinger
    #bug#data-quality

    0

  11. Leg information is not provided for composite instruments in CME and ICE

    Currently, the InstrumentDefMsg provided by our definition schema does not contain the underlying leg information. Adding this is on our roadmap here. This is assigned internal ticket D-2032.

    Renan Gemignani
    #symbology

    0

  12. MDOrderPriority is not provided for CME Globex MDP 3.0

    We currently leave out MDOrderPriority from our normalized MBO messages, for 2 reasons:

      1. This behavior is CME-specific. Most other venues adopt ITCH-based behavior, where ordering is based on timestamp instead. If we supported MDOrderPriority, it would bloat the data for all of the non-CME venues.
      2. ts_event serves the same purpose as MDOrderPriority for all instruments except for interest rate options and instruments where there's a LMM. We confirmed this with CME GCC and also from practical experience of comparing simulation against live order matching. We actually recommend that most users infer MDOrderPriority from ts_event, since it makes their order book implementation more reusable for other venues.

    Two proposed solutions were considered:

      1. Exposing raw message payloads and PCAPs.
      2. 'Supplementing' our normalized MBO data with venue-specific fields.

    We consider approach 2 to be inferior. If we tried incorporating other fields, we'd be replicating the venue's original structure in DBN with unnecessary indirection and copies and losing the benefit of normalization, so we might as well expose the raw message payload. For many venues, the raw data has nested message tree structures, and if we replicated it, either in Databento Binary Encoding or an unstructured encoding like JSON, we'd lose a lot of the benefits of our zero-copy binary encoding.

    The only limitation of approach 1 is that it's bandwidth-intensive, since most venues require you to receive their raw multicast data on their extranet over a physical cross-connect.

    At this time, we reject approach 2 and are working towards approach 1. This is tracked on our roadmap here. This is tracked internally as D-2130.
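
    As a hedged illustration of inferring priority from ts_event, the sketch below orders the resting orders at a single price level by timestamp. The `RestingOrder` type, the `queue_order` helper, the sample values, and the use of sequence as a tie-breaker are all assumptions made for the example, and per the caveat above this would not hold for interest rate options or instruments with a LMM.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RestingOrder:
    order_id: int
    ts_event: int   # venue timestamp in nanoseconds since epoch
    sequence: int   # message sequence number; tie-breaker only (an assumption)
    size: int

def queue_order(orders):
    """Sort the resting orders at one price level into inferred time priority."""
    return sorted(orders, key=lambda o: (o.ts_event, o.sequence))

# Hypothetical orders resting at the same price; not real data.
level = [
    RestingOrder(order_id=11, ts_event=1_700_000_000_000_000_500, sequence=42, size=3),
    RestingOrder(order_id=12, ts_event=1_700_000_000_000_000_100, sequence=40, size=1),
    RestingOrder(order_id=13, ts_event=1_700_000_000_000_000_100, sequence=41, size=5),
]
priority = [o.order_id for o in queue_order(level)]
```

    Because the sort key uses only fields that exist on timestamp-ordered venues as well, the same book logic can be reused outside CME.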

    Tessa Hollinger
    #http-api 🔗#raw-api 🔗

    1

  13. Records with unordered `ts_recv` in pre-2017 GLBX.MDP3 data

    We've identified a few cases where there are records with unordered ts_recv in GLBX.MDP3 data before 2017-05-21. We typically guarantee that ts_recv is non-decreasing. If records are received out of order, we adjust ts_recv to be non-decreasing and set the F_BAD_TS_RECV flag.

    An example case is on 2010-06-28 for instrument_id 503465:

    {"ts_recv":"2010-06-28T14:23:29.956000000Z","hd":{"ts_event":"2010-06-28T14:23:29.956000000Z","rtype":24,"publisher_id":1,"instrument_id":503465},"ts_ref":"2010-06-28T00:00:00.000000000Z","price":"-19.750000000","quantity":0,"sequence":322976,"ts_in_delta":0,"stat_type":1,"channel_id":0,"update_action":1,"stat_flags":0,"symbol":null}
    {"ts_recv":"2010-06-28T14:23:29.956000000Z","hd":{"ts_event":"2010-06-28T14:23:29.956000000Z","rtype":24,"publisher_id":1,"instrument_id":503465},"ts_ref":"2010-06-28T00:00:00.000000000Z","price":"-19.750000000","quantity":0,"sequence":322976,"ts_in_delta":0,"stat_type":1,"channel_id":0,"update_action":1,"stat_flags":0,"symbol":null}
    {"ts_recv":"2010-06-28T14:23:25.646000000Z","hd":{"ts_event":"2010-06-28T14:23:25.646000000Z","rtype":24,"publisher_id":1,"instrument_id":503465},"ts_ref":"2010-06-28T00:00:00.000000000Z","price":"-19.750000000","quantity":0,"sequence":322977,"ts_in_delta":0,"stat_type":1,"channel_id":0,"update_action":1,"stat_flags":0,"symbol":null}
    {"ts_recv":"2010-06-28T14:23:25.646000000Z","hd":{"ts_event":"2010-06-28T14:23:25.646000000Z","rtype":24,"publisher_id":1,"instrument_id":503465},"ts_ref":"2010-06-28T00:00:00.000000000Z","price":"-19.750000000","quantity":0,"sequence":322977,"ts_in_delta":0,"stat_type":1,"channel_id":0,"update_action":1,"stat_flags":0,"symbol":null}

    ts_recv went from 2010-06-28T14:23:29.956000000Z to 2010-06-28T14:23:25.646000000Z (backwards by about 4.3 seconds), even though sequence shows that the ordering of the records is correct. Each of these records is also duplicated, which is another error. Internal tracking: D-4612
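
    A minimal sketch of how a consumer might scan JSON-encoded records for both problems described here: a ts_recv that moves backwards, and a record that exactly repeats the previous one. The function name `find_ts_recv_anomalies` is hypothetical, and the sample lines are trimmed to two fields for brevity. The comparison relies on the assumption that every ts_recv uses the same fixed-width UTC ISO 8601 format, so plain string comparison matches chronological order.

```python
import json

def find_ts_recv_anomalies(lines):
    """Return (regressions, duplicates) as lists of 0-based record indices."""
    regressions, duplicates = [], []
    prev_raw, prev_ts = None, None
    for i, raw in enumerate(lines):
        ts = json.loads(raw)["ts_recv"]
        if prev_ts is not None:
            # Exact repeat of the previous line.
            if raw == prev_raw:
                duplicates.append(i)
            # Fixed-width ISO 8601 strings sort chronologically.
            if ts < prev_ts:
                regressions.append(i)
        prev_raw, prev_ts = raw, ts
    return regressions, duplicates

# Trimmed versions of the example records (only ts_recv and sequence kept).
sample = [
    '{"ts_recv":"2010-06-28T14:23:29.956000000Z","sequence":322976}',
    '{"ts_recv":"2010-06-28T14:23:29.956000000Z","sequence":322976}',
    '{"ts_recv":"2010-06-28T14:23:25.646000000Z","sequence":322977}',
    '{"ts_recv":"2010-06-28T14:23:25.646000000Z","sequence":322977}',
]
regressions, duplicates = find_ts_recv_anomalies(sample)
```

    On the trimmed sample this flags the third record as a regression and the second and fourth as duplicates.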

    Zach Banks
    #data-quality

    0

  14. GLBX.MDP3 OHLCV-1s delayed publishing from Live API

    We're investigating OHLCV-1s bars being sent late from the Live API in situations where no trades occur after the bar has closed. In such cases, bars are published about a second late. Internal tracking ID: D-4457

    Nicholas James Macholl

    0

  15. Latency dashboard displaying spurious 2 ns samples

    Our public latency dashboard is up and in experimental status. It appears that there's still a bug affecting the ingestion and clock sync on the client side, which causes some samples to be artificially placed at 2 ns in the end-to-end latency distribution.

    Tessa Hollinger
    #bug#portal 🖥

    0