XeDocs - XENON Metadata management tool¶

XeDocs manages tracking versioned detector numbers, replacing CMT and ideally all hard-coded values. XeDocs both looks up data from its own online database, and uses straxen URL-style lookup to find other resources. To upload data to the XeDocs database, you must submit it as a PR to https://github.com/XENONnT/corrections

What does Xedocs give you¶

Data reading¶

Read data from multiple formats (e.g. mongodb, pandas) and locations with a simple unified interface.
Custom logic implemented on the document class, e.g. creating a tensorflow model from the data etc.
Multiple APIs for reading data, fun functional, ODM style, pandas and xarray.
Read data as objects, dataframes, dicts, json.

Writing data¶

Write data to multiple storage backends with the same interface
Custom per-collection rules for data insertion, deletion and updating.
Schema validation and type coercion so storage has uniform and consistent data.

Other¶

Custom panel widgets for graphical representation of data, web client
Auto-generated API server and client + openapi documentation
CLI for viewing and downloading data

Basic Usage¶

Explore the available schemas

import xedocs

>>> xedocs.list_schemas()
>>> ['detector_numbers',
    'fax_configs',
    'plugin_lineages',
    'context_lineages',
    'pmt_area_to_pes',
    'global_versions',
    'electron_drift_velocities',
    ...]

>>> xedocs.help('pmt_area_to_pes')

>>>
        Schema name: pmt_area_to_pes
        Index fields: ['version', 'time', 'detector', 'pmt']
        Column fields: ['created_date', 'comments', 'value']

Read/write data from the shared development database, this database is writable from the default analysis username/password

import xedocs

db = xedocs.development_db()

docs = db.pmt_area_to_pes.find_docs(version='v1', pmt=[1,2,3,5], time='2021-01-01T00:00:00', detector='tpc')
to_pes = [doc.value for doc in docs]

# passing a run_id will attempt to fetch the center time of that run from the runs db
doc = db.pmt_area_to_pes.find_one(version='v1', pmt=1, run_id=25000, detector='tpc')
to_pe = doc.value

Read from the straxen processing database, this database is read-only for the default analysis username/password

import xedocs

db = xedocs.straxen_db()

...

Read from the the corrections gitub repository, this database is read-only

import xedocs

db = xedocs.corrections_repo(branch="master")

...

If you cloned the corrections gitub repo to a local folder, this database can be read too

import xedocs

db = xedocs.local_folder(PATH_TO_REPO_FOLDER)

...

Read data from alternative data sources specified by path, e.g csv files which will be loaded by pandas.

from xedocs.schemas import DetectorNumber

g1_doc = DetectorNumber.find_one(datasource='/path/to/file.csv', version='v1', field='g1')
g1_value = g1_doc.value
g1_error = g1_doc.uncertainty

The path can also be a github URL or any other URL supported by fsspec.

from xedocs.schemas import DetectorNumber

g1_doc = DetectorNumber.find_one(
                         datasource='github://org:repo@/path/to/file.csv',
                         version='v1',
                         field='g1')

Supported data sources

MongoDB collections

TinyDB tables

JSON files

REST API clients

Please open an issue on rframe if you want support for an additional data format.

If you want a new datasource to be available from a schema class, you can register it to the class:

from xedocs.schemas import DetectorNumber

DetectorNumber.register_datasource('github://org:repo@/path/to/file.csv', name='github_repo')

# The source will now be available under the given name:

g1_doc = DetectorNumber.github_repo.find_one(version='v1', field='g1')

Documentation¶

Full documentation hosted by Readthedocs

Credits¶

This package was created with Cookiecutter and the briggySmalls/cookiecutter-pypackage project template.