Submit issues and bug reports here. For feature requests and enhancements, use our Roadmap instead. For other issues, or if you prefer to reach us privately, contact support.
During winter months, when Eastern Time is UTC-5, the last hour of the Nasdaq extended trading session (7 to 8 pm ET) extends into the following UTC day. This is assigned D-2127 internally.
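The boundary crossing can be checked with Python's stdlib zoneinfo; a minimal sketch (the date is illustrative; in winter New York is on EST, UTC-5):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# 7 pm and 8 pm ET on a winter date (EST, UTC-5)
start = datetime(2024, 1, 16, 19, 0, tzinfo=ZoneInfo("America/New_York"))
end = datetime(2024, 1, 16, 20, 0, tzinfo=ZoneInfo("America/New_York"))

utc = ZoneInfo("UTC")
print(start.astimezone(utc))  # 2024-01-17 00:00:00+00:00 -- already the next UTC day
print(end.astimezone(utc))    # 2024-01-17 01:00:00+00:00
```

So the entire 7-8 pm ET hour falls on the next UTC date, which matters for any pipeline that partitions data by UTC day.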
0
Currently, our historical data discards Saturday test data on CME; however, our live data gateways expose the test data because it's useful for Databento's own internal monitoring to ensure that our live gateways are running properly. We plan to discard this test data in live as well to ensure point-in-time behavior. We'll likely have to hide test data from customers. This is assigned D-2036 internally and will likely only be addressed in Q3.
0
We currently leave MDOrderPriority out of our normalized MBO messages for two reasons:

- This behavior is CME-specific. Most other venues adopt ITCH-style behavior, where ordering is based on timestamp instead. If we supported MDOrderPriority, it would bloat the data for all of the non-CME venues.
- ts_event serves the same purpose as MDOrderPriority for all instruments except interest rate options and instruments where there's an LMM. We confirmed this with CME GCC and also from practical experience comparing simulation against live order matching. In fact, we recommend that most users infer MDOrderPriority from ts_event, since it makes their order book implementation more reusable across venues.

Two proposed solutions were considered:

1. Exposing raw message payloads and PCAPs.
2. "Supplementing" our normalized MBO data with venue-specific fields.

We consider approach 2 inferior: if we tried incorporating other fields, we'd be replicating the venue's original structure in DBN with unnecessary indirection and copies, losing the benefit of normalization, so we might as well expose the raw message payload. For many venues, the raw data has nested message tree structures; if we replicated these, either in Databento Binary Encoding or in an unstructured encoding like JSON, we'd lose many of the benefits of our zero-copy binary encoding. The only limitation of approach 1 is that it's bandwidth-intensive, as most venues require you to receive their raw multicast data on their extranet over a physical cross-connect.

At this time, we reject approach 2 and are working towards approach 1. This is tracked on our roadmap here. This is tracked internally as D-2130.
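The recommendation to infer priority from ts_event can be sketched as a simple FIFO queue per price level; this is a minimal, hypothetical book structure (field names follow the MBO schema, but the class itself is illustrative):

```python
from collections import defaultdict

class PriceLevel:
    """FIFO queue of resting orders at one price, ordered by arrival (ts_event)."""
    def __init__(self):
        self.orders = []  # list of (ts_event, order_id, size), kept sorted by ts_event

    def add(self, ts_event, order_id, size):
        # Earlier ts_event means higher queue priority. This stands in for
        # MDOrderPriority on CME and matches ITCH-style venues directly.
        self.orders.append((ts_event, order_id, size))
        self.orders.sort(key=lambda o: o[0])

book = defaultdict(PriceLevel)  # price -> PriceLevel
book[5016.5].add(ts_event=1708300800000000000, order_id=11, size=5)
book[5016.5].add(ts_event=1708300800000000500, order_id=12, size=3)

# The highest-priority order at 5016.5 is the earliest-arriving one:
print(book[5016.5].orders[0][1])  # 11
```

Because the ordering key is just the timestamp, the same book code works unchanged for non-CME venues.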
1
This is typically available by 11:00 UTC the next day at the latest. It seems the dataset has not been appended to since the 2024-04-29 session.

Reproduce:

curl -G 'https://hist.databento.com/v0/metadata.get_dataset_range' -u <API KEY>: -d dataset=GLBX.MDP3

{"start_date":"2017-05-21","end_date":"2024-04-29"}

For an individual instrument:

curl -X POST 'https://hist.databento.com/v0/timeseries.get_range' -u <API KEY>: \
  -d dataset=GLBX.MDP3 -d symbols=ESM4 -d schema=ohlcv-1d \
  -d start='2024-04-29' -d end='2024-05-01' \
  -d encoding=json -d pretty_px=true -d pretty_ts=true -d map_symbols=true \
  -d limit=1 | jq

{
  "detail": {
    "case": "data_end_after_available_end",
    "message": "The dataset GLBX.MDP3 has data available up to '2024-04-30 00:00:00+00:00'. The end in the query ('2024-05-01 00:00:00+00:00') is after the available range. Try requesting with an earlier end.",
    "status_code": 422,
    "docs": "https://databento.com/docs/api-reference-historical/basics/datasets",
    "payload": {
      "dataset": "GLBX.MDP3",
      "start": "2024-04-29T00:00:00.000000000Z",
      "end": "2024-05-01T00:00:00.000000000Z",
      "available_start": "2017-05-21T00:00:00.000000000Z",
      "available_end": "2024-04-30T00:00:00.000000000Z"
    }
  }
}
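When scripting requests, the 422 payload above can be used to clamp the query end to the available range before retrying; a minimal sketch (the helper name and retry flow are hypothetical, not part of the Databento API):

```python
import json

def clamp_end(params: dict, error_payload: dict) -> dict:
    """Return a copy of params with `end` clamped to the dataset's available_end."""
    detail = error_payload["detail"]
    if detail.get("case") == "data_end_after_available_end":
        fixed = dict(params)
        fixed["end"] = detail["payload"]["available_end"]
        return fixed
    return params

# Trimmed-down version of the 422 body shown above:
error = json.loads('''{"detail": {"case": "data_end_after_available_end",
  "payload": {"available_end": "2024-04-30T00:00:00.000000000Z"}}}''')
params = {"dataset": "GLBX.MDP3", "start": "2024-04-29", "end": "2024-05-01"}
print(clamp_end(params, error)["end"])  # 2024-04-30T00:00:00.000000000Z
```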
2
Across all files in my TBBO batch, spanning a range of futures markets, I am seeing an average of about 62 entries where the above condition applies. Here is an example of a single row:

ts_event        2024-02-18 23:00:00+00:00
rtype           1
publisher_id    1
instrument_id   17077
action          T
side            N
depth           0
price           5016.5
size            44
flags           0
ts_in_delta     18276
sequence        3879
bid_px_00       5021.5
ask_px_00       5008.0
bid_sz_00       3
ask_sz_00       20
bid_ct_00       2
ask_ct_00       1
symbol          ESH4
Name: 2024-02-18 23:00:00.030977523+00:00, dtype: object

Is this expected behavior?
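A sketch of how such rows can be flagged; the condition checked here (side 'N' with a crossed top of book, bid above ask) is my reading of the report, and the sample rows are illustrative:

```python
def is_suspect(row: dict) -> bool:
    # Trade with no aggressor side reported ('N'), while the accompanying
    # top-of-book is crossed (bid above ask) -- as in the sample row above.
    return row["side"] == "N" and row["bid_px_00"] > row["ask_px_00"]

rows = [
    {"symbol": "ESH4", "side": "N", "bid_px_00": 5021.5, "ask_px_00": 5008.0},
    {"symbol": "ESH4", "side": "B", "bid_px_00": 5016.25, "ask_px_00": 5016.5},
]
print(sum(is_suspect(r) for r in rows))  # 1 -- only the first row matches
```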
0
Gold (OG.OPT) option strike prices are 100x too small and silver option (SO.OPT) strike prices are 10x too small. Corresponds with issue D-2171. Impacted the following options products (and respective security groups):

- Aluminum (AX): groups A7 and A8
- Copper (HXE, H[1-5][MTWRE]): groups 1U and 2U
- Gold (OG, OG[1-5], G[1-5][MTWR]): groups OG and 1Y
- Micro gold (OMG, [1-5][MWF]G): groups OM and OQ
- Palladium (PAO): groups P3 and P4
- Platinum (PO): groups P0 and P1
- Silver (SO, SO[1-5], [MTWR][1-5]S): groups SO and S1
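For already-downloaded data, strikes can be rescaled locally until a backfill lands; a minimal sketch (the factors follow the report above; the function and the mapping by parent symbol are hypothetical):

```python
# Correction factors for under-scaled strikes, per the report:
# gold (OG.OPT) strikes are 100x too small, silver (SO.OPT) 10x too small.
STRIKE_FACTORS = {"OG.OPT": 100, "SO.OPT": 10}

def corrected_strike(parent: str, strike: float) -> float:
    """Rescale a strike price for affected products; pass through otherwise."""
    return strike * STRIKE_FACTORS.get(parent, 1)

print(corrected_strike("OG.OPT", 23.5))    # 2350.0
print(corrected_strike("ES.OPT", 5000.0))  # 5000.0 (unaffected product)
```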
3
Trying to retrieve the following dataset:

params = dict(
    symbols='SR3.c.0',
    dataset='GLBX.MDP3',
    schema='ohlcv-1d',
    stype_in='continuous',
    start='2017-06-01',
    end='2024-02-29',
)

Receiving the following error:

databento.common.error.BentoError: Error streaming response: Response ended prematurely
0
This is related to: https://roadmap.databento.com/b/n0o5prm6/feature-ideas/support-more-secure-ftp This is assigned D-2126 internally and we plan to address it in Q2 2024.
0
Our MBP-1/MBP-10 schemas currently only provide side information on trades (to indicate the aggressing side).
8
In the GLBX.MDP3 dataset (CME), the statistics schema uses the incorrect field from CME to populate the sequence number. This is assigned D-1651 internally.
1
Currently, the InstrumentDefMsg provided by our definition schema does not contain the underlying leg information. This is assigned internal ticket D-2032.
0
Usually, you'd be able to request data from T-1 as soon as it becomes available. However, on a Sunday (e.g. 11/19), T-1 is Saturday (11/18), and currently our API and frontend prevent you from getting 11/18 until 11/19 is released, causing an artificial wait of an extra day before you can include 11/18 in your API or frontend request. This is purely an ergonomic issue, since Saturday would have no data, but it does create some pain in the integration experience.
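The extra wait can be illustrated with simple date arithmetic; a sketch of intended vs. current behavior (function names are hypothetical):

```python
from datetime import date, timedelta

def intended_latest_requestable(today: date) -> date:
    """T-1: the previous calendar day should be requestable immediately."""
    return today - timedelta(days=1)

def current_latest_requestable(today: date) -> date:
    # Current behavior as reported: when T-1 is a Saturday, it only becomes
    # requestable once Sunday's (empty) session is released, so on Sunday
    # the latest requestable day is Friday.
    t1 = today - timedelta(days=1)
    if t1.weekday() == 5:  # Saturday
        return today - timedelta(days=2)
    return t1

sunday = date(2023, 11, 19)
print(intended_latest_requestable(sunday))  # 2023-11-18 (the Saturday)
print(current_latest_requestable(sunday))   # 2023-11-17 (Friday) -- one day behind
```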
1
If a batch job has a "split duration" set, then each DBN file only covers a subset of dates of the original job. (This holds true regardless of "split symbols" or "split size" settings.) Currently, the symbology metadata included in each DBN file from the batch job still covers the entire batch job's time range, not the subset of time that the DBN file is for (e.g. the 1 day or 1 week). With "split by date," each entry in the symbology metadata in the DBN file should only have 1 interval which should cover exactly 1 day. There should be no overlap of instrument_ids. /timeseries.get_range is not impacted by this, since it doesn't have the "split duration" concept.
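The intended behavior amounts to slicing each symbol's mapping intervals down to the file's own date range; a hypothetical helper (the interval tuples are illustrative, not the exact DBN metadata layout):

```python
from datetime import date

def slice_mappings(mappings, file_start: date, file_end: date):
    """Trim symbology intervals to [file_start, file_end) for one split file."""
    out = {}
    for symbol, intervals in mappings.items():
        kept = []
        for start, end, instrument_id in intervals:
            lo, hi = max(start, file_start), min(end, file_end)
            if lo < hi:  # keep only the overlap with this file's range
                kept.append((lo, hi, instrument_id))
        if kept:
            out[symbol] = kept
    return out

# Job covers two days; a 1-day split file should only carry the first day's interval.
job = {"ESM4": [(date(2024, 4, 29), date(2024, 5, 1), 17077)]}
print(slice_mappings(job, date(2024, 4, 29), date(2024, 4, 30)))
```

With "split by date," each resulting file would then hold exactly one 1-day interval per symbol, with no instrument_id overlap across files.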
0
We have degraded data on: 2020-06-22, 2021-07-07, 2021-10-26, and 2022-09-19. We'll be backfilling these. These issues are also reported via the metadata.get_dataset_condition endpoint. This is assigned D-456 internally.
0
Gaps have been reported at the start of 2020-07-01, but the underlying issue is with the captures for the previous day (2020-06-30), which stopped around 16:30Z (missing ~8 hours).
4