Issues tracker

Submit issue and bug reports here. For feature requests and enhancements, use our Roadmap instead. For other issues, or if you prefer to reach us privately, contact support.

Trending
  1. Extend Nasdaq coverage to include last hour of extended trading, when it crosses into next UTC day

    During winter months, when Eastern Time is UTC-5, the last hour of the Nasdaq extended trading session (7 to 8pm ET) extends into the following UTC day. This is assigned D-2127 internally.
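The date arithmetic behind this can be checked with a short sketch. It uses fixed offsets (Eastern Standard Time is UTC-5, Eastern Daylight Time is UTC-4) rather than a time zone database, and the two calendar dates are arbitrary examples chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets keep the sketch free of any tz database dependency.
EST = timezone(timedelta(hours=-5))  # Eastern Standard Time
EDT = timezone(timedelta(hours=-4))  # Eastern Daylight Time


def last_hour_utc(year, month, day, tz):
    """Return the UTC start and end of the 7pm-8pm ET hour on a given ET date."""
    start = datetime(year, month, day, 19, 0, tzinfo=tz)
    end = datetime(year, month, day, 20, 0, tzinfo=tz)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


# Arbitrary example dates: one under EST, one under EDT.
winter_start, winter_end = last_hour_utc(2024, 1, 16, EST)
summer_start, summer_end = last_hour_utc(2024, 7, 16, EDT)

print(winter_start.isoformat(), winter_end.isoformat())
print(summer_start.isoformat(), summer_end.isoformat())
```

At UTC-5 the whole hour lands on the next UTC date (00:00 to 01:00), while at UTC-4 it ends exactly at 00:00 UTC.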

    Jack C

    0

  2. Saturday test data shows up on CME live data

    Currently our historical data discards Saturday test data on CME; however, our live data gateways expose the test data because it's useful for Databento's own internal monitoring to ensure that our live gateways are running properly. We plan to discard this test data in live as well to ensure point-in-time behavior. We'll likely have to hide test data from customers. This is assigned D-2036 internally and will likely only be addressed in Q3.
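Until this is addressed, a client could drop Saturday records on its own side. The following is only a sketch: it assumes records are plain dicts with a nanosecond `ts_event` field, and that test data can be identified purely by falling on a Saturday in UTC, which the issue does not confirm. The helper names are hypothetical.

```python
from datetime import datetime, timezone

SATURDAY = 5  # datetime.weekday(): Monday is 0, so Saturday is 5


def is_saturday_utc(ts_event_ns):
    """True if a nanosecond UNIX timestamp falls on a Saturday in UTC."""
    seconds = ts_event_ns // 1_000_000_000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).weekday() == SATURDAY


def drop_saturday_records(records):
    """Keep only records whose ts_event is not on a Saturday (UTC)."""
    return [r for r in records if not is_saturday_utc(r["ts_event"])]


def to_ns(dt):
    """Convert an aware datetime on a whole second to nanoseconds since the epoch."""
    return int(dt.timestamp()) * 1_000_000_000


# Illustrative records only: 2024-04-27 is a Saturday, 2024-04-29 a Monday.
records = [
    {"symbol": "ESM4", "ts_event": to_ns(datetime(2024, 4, 27, 14, 0, tzinfo=timezone.utc))},
    {"symbol": "ESM4", "ts_event": to_ns(datetime(2024, 4, 29, 14, 0, tzinfo=timezone.utc))},
]
kept = drop_saturday_records(records)
print(len(kept))
```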

    Tessa Hollinger
    #bug#data-quality

    0

  3. MDOrderPriority is not provided for CME Globex MDP 3.0

    We currently leave out MDOrderPriority from our normalized MBO messages, for 2 reasons:

      1. This behavior is CME-specific. Most other venues adopt ITCH-based behavior, where ordering is based on timestamp instead. If we supported MDOrderPriority, it would bloat the data for all of the non-CME venues.
      2. ts_event serves the same purpose as MDOrderPriority for all instruments except for interest rate options and instruments where there's an LMM. We confirmed this with CME GCC and also from practical experience of comparing simulation against live order matching.

    We actually recommend that most users infer MDOrderPriority from ts_event, since it makes their order book implementation more reusable for other venues.

    Two proposed solutions were considered:

      1. Exposing raw message payloads and PCAPs.
      2. 'Supplementing' our normalized MBO data with venue-specific fields.

    We consider approach 2 to be inferior: If we tried incorporating other fields, we'd be replicating the venue's original structure in DBN with unnecessary indirection and copies and losing the benefit of normalization, so we might as well expose the raw message payload. For many venues, the raw data has nested message tree structures, and if we replicated it, either in Databento Binary Encoding or an unstructured encoding like JSON, we'd lose a lot of the benefits of our zero-copy binary encoding.

    The only limitation of approach 1 is that it's bandwidth-intensive, as most venues require you to receive their raw multicast data on their extranet over a physical cross-connect.

    At this time, we reject approach 2 and are working towards approach 1. This is tracked on our roadmap here. This is tracked internally as D-2130.
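As a rough illustration of inferring queue priority from ts_event, the sketch below orders resting orders at each price level by timestamp. The dict layout, the sample orders, and the `build_queues` helper are assumptions for illustration, not part of any Databento library, and per the note above this would not hold for interest rate options or instruments with an LMM.

```python
from collections import defaultdict


def build_queues(adds):
    """Group resting orders by (side, price), ordered by ts_event (earliest first)."""
    levels = defaultdict(list)
    for order in adds:
        levels[(order["side"], order["price"])].append(order)
    for queue in levels.values():
        # list.sort() is stable, so ties on ts_event keep their arrival order.
        queue.sort(key=lambda o: o["ts_event"])
    return dict(levels)


# Hypothetical orders; ts_event is assumed to be the time priority was last set.
adds = [
    {"order_id": 3, "side": "B", "price": 5016.50, "ts_event": 1_700_000_000_000_000_300},
    {"order_id": 1, "side": "B", "price": 5016.50, "ts_event": 1_700_000_000_000_000_100},
    {"order_id": 2, "side": "B", "price": 5016.25, "ts_event": 1_700_000_000_000_000_200},
]
queues = build_queues(adds)
front_to_back = [o["order_id"] for o in queues[("B", 5016.50)]]
print(front_to_back)
```

Because the ordering key is just a timestamp, the same queue logic would carry over to ITCH-style venues unchanged.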

    Tessa Hollinger
    #http-api 🔗#raw-api 🔗

    1

  4. Historical GLBX.MDP3 data stale (>24 hours)

    This is typically available by 11:00 UTC the next day at the latest. It seems the dataset has not been appended to since the 2024-04-29 session.

    Reproduce:

        curl -G 'https://hist.databento.com/v0/metadata.get_dataset_range' -u <API KEY>: -d dataset=GLBX.MDP3

        {"start_date":"2017-05-21","end_date":"2024-04-29"}

    For an individual instrument:

        curl -X POST 'https://hist.databento.com/v0/timeseries.get_range' -u <API KEY>: \
          -d dataset=GLBX.MDP3 -d symbols=ESM4 -d schema=ohlcv-1d \
          -d start='2024-04-29' -d end='2024-05-01' \
          -d encoding=json -d pretty_px=true -d pretty_ts=true -d map_symbols=true \
          -d limit=1 | jq

        {
          "detail": {
            "case": "data_end_after_available_end",
            "message": "The dataset GLBX.MDP3 has data available up to '2024-04-30 00:00:00+00:00'. The end in the query ('2024-05-01 00:00:00+00:00') is after the available range. Try requesting with an earlier end.",
            "status_code": 422,
            "docs": "https://databento.com/docs/api-reference-historical/basics/datasets",
            "payload": {
              "dataset": "GLBX.MDP3",
              "start": "2024-04-29T00:00:00.000000000Z",
              "end": "2024-05-01T00:00:00.000000000Z",
              "available_start": "2017-05-21T00:00:00.000000000Z",
              "available_end": "2024-04-30T00:00:00.000000000Z"
            }
          }
        }
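While the dataset is stale, a client can read `available_end` from the 422 response and retry with that value. The sketch below only parses the error body quoted in this report (abbreviated to the fields it uses); `clamp_end` is a hypothetical helper, and no request is actually sent.

```python
import json

# Abbreviated copy of the error body from this report.
error_body = """
{
  "detail": {
    "case": "data_end_after_available_end",
    "status_code": 422,
    "payload": {
      "dataset": "GLBX.MDP3",
      "start": "2024-04-29T00:00:00.000000000Z",
      "end": "2024-05-01T00:00:00.000000000Z",
      "available_start": "2017-05-21T00:00:00.000000000Z",
      "available_end": "2024-04-30T00:00:00.000000000Z"
    }
  }
}
"""


def clamp_end(requested_end, error_json):
    """Return available_end if the error says the end was too late, else the original end."""
    detail = json.loads(error_json)["detail"]
    if detail.get("case") != "data_end_after_available_end":
        return requested_end
    return detail["payload"]["available_end"]


new_end = clamp_end("2024-05-01", error_body)
print(new_end)
```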

    Jason H

    2

  5. TBBO files showing bid_px_00 prices > ask_px_00

    Across all files in my TBBO batch, covering a range of futures markets, I am seeing an average of about 62 entries where the above condition applies. Here is an example of a single row:

        ts_event         2024-02-18 23:00:00+00:00
        rtype            1
        publisher_id     1
        instrument_id    17077
        action           T
        side             N
        depth            0
        price            5016.5
        size             44
        flags            0
        ts_in_delta      18276
        sequence         3879
        bid_px_00        5021.5
        ask_px_00        5008.0
        bid_sz_00        3
        ask_sz_00        20
        bid_ct_00        2
        ask_ct_00        1
        symbol           ESH4
        Name: 2024-02-18 23:00:00.030977523+00:00, dtype: object

    Is this expected behavior?
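One way to count such rows is a simple comparison of the top-of-book prices. This is a minimal sketch that assumes each row is available as a plain dict keyed by the TBBO field names; the sample row reuses the values reported above, and `is_crossed` and `count_crossed` are hypothetical helpers.

```python
def is_crossed(row):
    """True when the best bid is strictly above the best ask."""
    return row["bid_px_00"] > row["ask_px_00"]


def count_crossed(rows):
    """Count rows whose top-of-book is crossed."""
    return sum(1 for row in rows if is_crossed(row))


# Values taken from the example row in this report.
example_row = {
    "symbol": "ESH4",
    "action": "T",
    "side": "N",
    "price": 5016.5,
    "size": 44,
    "bid_px_00": 5021.5,
    "ask_px_00": 5008.0,
    "bid_sz_00": 3,
    "ask_sz_00": 20,
}

crossed_count = count_crossed([example_row])
print(crossed_count)
```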

    JWaldron
    #data-quality#invalid

    0

  6. Incorrect scaling of strike price in definition schema for some CME options

    Gold (OG.OPT) option strike prices are 100x too small and silver option (SO.OPT) strike prices are 10x too small. Corresponds with issue D-2171. This impacted the following options products (and respective security groups):

      Aluminum (AX): groups A7 and A8
      Copper (HXE, H[1-5][MTWRE]): groups 1U and 2U
      Gold (OG, OG[1-5], G[1-5][MTWR]): groups OG and 1Y
      Micro gold (OMG, [1-5][MWF]G): groups OM and OQ
      Palladium (PAO): groups P3 and P4
      Platinum (PO): groups P0 and P1
      Silver (SO, SO[1-5], [MTWR][1-5]S): groups SO and S1
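For data pulled while this issue is present, a client-side correction could look like the sketch below. It covers only the two factors stated above (100x for OG.OPT, 10x for SO.OPT); factors for the other listed products are not given here and are deliberately left out. The `corrected_strike` helper and the sample strike values are hypothetical, and the correction must not be applied to data that has already been fixed.

```python
from decimal import Decimal

# Only the two factors stated in this issue; other products are not covered.
STRIKE_SCALE = {
    "OG.OPT": Decimal(100),  # gold: strikes reported 100x too small
    "SO.OPT": Decimal(10),   # silver: strikes reported 10x too small
}


def corrected_strike(parent_symbol, strike_price):
    """Rescale a mis-scaled strike; symbols without a known factor pass through unchanged."""
    return strike_price * STRIKE_SCALE.get(parent_symbol, Decimal(1))


# Hypothetical mis-scaled values, for illustration only.
gold = corrected_strike("OG.OPT", Decimal("20.50"))
silver = corrected_strike("SO.OPT", Decimal("2.50"))
other = corrected_strike("ES.OPT", Decimal("5000"))
print(gold, silver, other)
```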

    Carter Green
    #data-quality#bug

    3

  7. Futures series not working?

    Trying to retrieve the following data set:

        params = dict(
            symbols='SR3.c.0',
            dataset='GLBX.MDP3',
            schema='ohlcv-1d',
            stype_in='continuous',
            start='2017-06-01',
            end='2024-02-29',
        )

    Receiving the following error:

        databento.common.error.BentoError: Error streaming response: Response ended prematurely

    James
    #raw-api 🔗#python 🔗#performance

    0

  8. Using standard Databento credentials for FTP subjects those credentials to sniffing

    This is related to: https://roadmap.databento.com/b/n0o5prm6/feature-ideas/support-more-secure-ftp This is assigned D-2126 internally and we plan to address this in Q2 2024.

    Tessa Hollinger
    #security

    0

  9. MBP-1/10 `side` field is only filled in for Trades

    Our MBP-1/MBP-10 schemas currently only provide side information on trades (to indicate the aggressing side).

    Renan G

    8

  10. CME Statistics: sequence field uses RptSeq instead of MsgSeqNum

    In the GLBX.MDP3 dataset (CME), the statistics schema populates the sequence number from the wrong CME field: it uses RptSeq instead of MsgSeqNum. This is assigned D-1651 internally.

    Zach Banks
    #bug#data-quality

    1

  11. Leg information is not provided for composite instruments in CME and ICE

    Currently, the InstrumentDefMsg provided by our definition schema does not contain the underlying leg information. This is assigned internal ticket D-2032.

    Renan Gemignani
    #symbology

    0

  12. Unable to include previous day (Saturday) in API and frontend request on Sunday

    Usually, you'd be able to request data from T-1 as soon as it becomes available. It appears that on Sunday (e.g. 11/19), T-1 is Saturday (11/18), and currently our API and frontend prevent you from getting 11/18 until 11/19 is released, causing an artificial wait of an extra day before you can include 11/18 in your API or frontend request. This is purely an ergonomic issue since Saturday would have no data, but it does create some pain in the integration experience.

    Tessa Hollinger
    #bug#http-api 🔗#portal 🖥

    1

  13. DBN files from split batch jobs have symbology for the whole time range

    If a batch job has a "split duration" set, then each DBN file only covers a subset of dates of the original job. (This holds true regardless of "split symbols" or "split size" settings.) Currently, the symbology metadata included in each DBN file from the batch job still covers the entire batch job's time range, not the subset of time that the DBN file is for (e.g. the 1 day or 1 week). With "split by date," each entry in the symbology metadata in the DBN file should only have 1 interval which should cover exactly 1 day. There should be no overlap of instrument_ids. /timeseries.get_range is not impacted by this, since it doesn't have the "split duration" concept.
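The expected behavior can be sketched as clipping each symbology interval to the date range of the individual file. The dict layout, the instrument IDs, and the `clip_intervals` helper below are assumptions for illustration and not the actual DBN metadata types; intervals are treated as half-open `[start_date, end_date)`.

```python
from datetime import date


def clip_intervals(intervals, file_start, file_end):
    """Clip half-open [start_date, end_date) intervals to the file's own range."""
    clipped = []
    for iv in intervals:
        start = max(iv["start_date"], file_start)
        end = min(iv["end_date"], file_end)
        if start < end:  # drop intervals that fall entirely outside the file
            clipped.append({**iv, "start_date": start, "end_date": end})
    return clipped


# Hypothetical mapping for one symbol across a week-long batch job.
job_intervals = [
    {"start_date": date(2024, 3, 1), "end_date": date(2024, 3, 5), "symbol": "1001"},
    {"start_date": date(2024, 3, 5), "end_date": date(2024, 3, 8), "symbol": "1002"},
]

# A file produced by a daily split covering only 2024-03-04.
file_intervals = clip_intervals(job_intervals, date(2024, 3, 4), date(2024, 3, 5))
print(file_intervals)
```

With a daily split, each symbol ends up with a single interval covering exactly one day, as described above.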

    Zach Banks
    #bug#symbology

    0

  14. Nasdaq TotalView ITCH data issues

    We have degraded data on: 2020-06-22, 2021-07-07, 2021-10-26, 2022-09-19. We'll be backfilling these. These issues are also reported via the metadata.get_dataset_condition endpoint. This is assigned D-456 internally.
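A client could use the condition report to skip or flag affected days before backtesting. The sketch below filters a list shaped like a condition response; the exact response layout (a list of entries with `date` and `condition` keys) is an assumption here, the four degraded dates come from this issue, and the single `available` entry is purely illustrative.

```python
# Assumed response shape; check the API reference for the actual fields.
conditions = [
    {"date": "2020-06-22", "condition": "degraded"},
    {"date": "2020-06-23", "condition": "available"},  # illustrative entry only
    {"date": "2021-07-07", "condition": "degraded"},
    {"date": "2021-10-26", "condition": "degraded"},
    {"date": "2022-09-19", "condition": "degraded"},
]


def dates_needing_attention(entries):
    """Return dates whose condition is anything other than 'available'."""
    return [e["date"] for e in entries if e["condition"] != "available"]


flagged = dates_needing_attention(conditions)
print(flagged)
```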

    Tessa Hollinger
    #data-quality#data-gap#http-api 🔗

    0

  15. Degraded data on GLBX.MDP3 for 2020-02-27, 2020-02-28, 2020-06-30, 2020-07-01

    Gaps have been reported at the start of 2020-07-01, but the underlying issue is with the captures for the previous day (2020-06-30), which stopped around 16:30Z (missing ~8 hrs).
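The size of the gap follows from simple arithmetic, assuming the capture was expected to run until the end of the UTC day and taking the approximate 16:30Z stop time at face value.

```python
from datetime import datetime, timezone

# Approximate stop time from this report; the end-of-day boundary is an assumption.
capture_stop = datetime(2020, 6, 30, 16, 30, tzinfo=timezone.utc)
day_end = datetime(2020, 7, 1, 0, 0, tzinfo=timezone.utc)

missing_hours = (day_end - capture_stop).total_seconds() / 3600
print(missing_hours)  # 7.5
```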

    Tessa Hollinger
    #data-quality#data-gap

    4