ANKA is a synchrotron facility located at KIT. It consists of an accelerator ring and attached experimental stations (beamlines) that use the light produced by the accelerator. The experimental stations provide specialized setups and a broad range of applications for national and international users.

 

[Figure: Schematic representation of ANKA, with the experiment hutches shown in color.]

 

One of the highlights is Ultra Fast Tomography (see next pages), which uses the X-ray part of the light to provide high-resolution four-dimensional images (a series of attenuation volumes, with successive volumes corresponding to successive points in time). Ultra Fast Tomography is used to gain new insights into dynamic biological and physical processes.
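The four-dimensional layout described above can be pictured as a list of three-dimensional volumes, one per time step. The following is only a minimal sketch in plain Python; the names `Volume`, `make_constant_volume` and `voxel_time_course` are hypothetical and are not part of any ANKA software, and real tomography data would be held in array libraries or files rather than nested lists.

```python
from typing import List

# Assumption: one attenuation volume is indexed as volume[z][y][x].
Volume = List[List[List[float]]]


def make_constant_volume(value: float, depth: int, height: int, width: int) -> Volume:
    """Build a small test volume in which every voxel holds the same value."""
    return [[[value for _ in range(width)] for _ in range(height)] for _ in range(depth)]


def voxel_time_course(series: List[Volume], z: int, y: int, x: int) -> List[float]:
    """Return the attenuation of one voxel across all time steps of the series."""
    return [volume[z][y][x] for volume in series]


if __name__ == "__main__":
    # Three time steps of a 2 x 2 x 2 volume; the value encodes the time index.
    series = [make_constant_volume(float(t), 2, 2, 2) for t in range(3)]
    print(voxel_time_course(series, 1, 0, 1))
```

Reading one voxel along the time axis, as in `voxel_time_course`, is the basic access pattern for following a dynamic process at a fixed position in the sample.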

 

Some of the most important functional requirements for data management at ANKA are:

1. Automated data management and federated data access: Users do not need to be involved in the technical details of data management and data policies.

2. Flexibility: The continuous improvement of the experimental setups is an important part of ANKA research.

3. Large-scale data: ANKA produces multiple petabytes of data per year.

4. Long-lived data and metadata: Data that leads to publications needs to be archived for at least 10 years and therefore has to be annotated with metadata.

5. Heterogeneous data: The beamlines support many experiments and users.


A new concept for managing the data life cycle, including transparent workspaces, was required and is being implemented [ANKA1]. It is based on two connected data management infrastructure projects at KIT. The Large Scale Data Facility (LSDF) supports and enables scientific projects by providing reliable large-scale storage and processing infrastructures. The KIT Data Manager (KIT DM) extends the LSDF with a data service architecture offering additional data and metadata services that allow federated data ingest and access.

 

[Figure: Data in the experiment domain and beyond.]

 

In the ANKA experiments, both primary and derived data are produced. Primary data is created during measurements by the instrumentation and the control system, as parameterized by the user. Derived data is calculated from primary data using on-site computing infrastructures.

 

To deal with the heterogeneity of the data and to manage the data based on policies, data managed in the experiment domain is organized in datasets from the very beginning. A dataset is the basic unit of managed data and consists of one or more unique identifiers, data and metadata. Metadata is captured automatically from the instrumentation and control system where possible and needs to be extended by the user. A variety of tools support the administration of the experiment data life cycle. In the tomography beamlines, the standardized data format NeXus is used alongside proprietary formats, which allows the application of proprietary analysis tools.
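The dataset structure described above (identifiers, data, and metadata that is first captured automatically and then extended by the user) can be illustrated with a small data class. This is a hedged sketch only: the class `Dataset`, the helper `new_dataset`, the field names and the use of UUIDs as identifiers are all assumptions made for illustration and do not reflect the actual KIT DM data model. The rule that user entries never overwrite automatically captured entries is likewise a design assumption.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Dataset:
    """Hypothetical dataset: unique identifiers, data file references, metadata."""

    identifiers: List[str]
    data_files: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The text requires one or more unique identifiers per dataset.
        if not self.identifiers:
            raise ValueError("a dataset needs at least one unique identifier")

    def extend_metadata(self, user_metadata: Dict[str, str]) -> None:
        """Add user-supplied entries; captured entries are kept (assumption)."""
        for key, value in user_metadata.items():
            self.metadata.setdefault(key, value)


def new_dataset(data_files: Iterable[str], captured_metadata: Dict[str, str]) -> Dataset:
    """Create a dataset with a fresh identifier (UUIDs are an assumption)."""
    return Dataset(
        identifiers=[str(uuid.uuid4())],
        data_files=list(data_files),
        metadata=dict(captured_metadata),
    )


if __name__ == "__main__":
    # File name and metadata keys below are made-up example values.
    ds = new_dataset(["scan_0001.nxs"], {"exposure_ms": "5"})
    ds.extend_metadata({"sample": "example sample", "exposure_ms": "999"})
    print(ds.metadata)
```

Creating the identifier and the captured metadata together at construction time mirrors the idea that data is organized in datasets from the very beginning, rather than being annotated after the experiment.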

 

The results of a successful experiment are archived in the LSDF using the KIT DM. External users can access and process their experiment data remotely.


[ANKA1]     Stotzka, R.; Mexner, W.; dos Santos Rolo, T.; Pasic, H.; van Wezel, J.; Hartmann, V.; Jejkal, T.; Garcia, A.; Haas, D.; Streit, A., Large Scale Data Facility for Data Intensive Synchrotron Beamlines, Proceedings of the 13th International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS 2011), 2011, pp. 1216-1219


Contact:

KIT, IPE: Halil Pasic, Rainer Stotzka, Thomas Jejkal, Xiaoli Yang

KIT, ANKA: Wolfgang Mexner, David Haas

Copyright by SWM, KIT – Universität des Landes Baden-Württemberg und nationales Forschungszentrum in der Helmholtz-Gemeinschaft