In part, this is a summary of the discussion held on July 28th at the Gran
Sasso (D. Michael, A. Grillo, E. Scapparone, S. Parlati, A. D'Ambrosio).
In addition, I have added some other material and my own comments relevant
to the U.S. institutions. At the end, I pose questions to the U.S. institutions
which will help establish what actions we may take.
Abstract: Our current data load is approximately 4-5 times bigger than in past
MACRO running. Handling this much data will require some new investment in
both hardware and manpower. At the moment, sufficient hardware capacity exists
at the Gran Sasso to handle this data rate but not to make a complete
distribution of all of this data to MACRO institutions. DLT tapes are being
implemented as a solution at the Gran Sasso and for some Italian institutions.
Technical parameters for DLT tapes are presented. Options for the U.S.
institutions are discussed; it appears that DLT's (perhaps combined
with other approaches) may provide an attractive solution here as well.
Since data is already accumulating, we need to converge on a new solution
for distribution as soon as possible.
A Brief History of Tape Production for U.S. Institutions
--------------------------------------------------------
In order to have some perspective on our future plans, I give here a brief
synopsis of the tape production for U.S. institutions.
1. Back in the SM1 running days, tape distribution to U.S. institutions was
accomplished by sending one copy of an original tape to Caltech. Graduate
students at Caltech then produced copies of tapes for all of the U.S.
institutions. This task was already a considerable job for the students before
the quantity of data started to increase. Caltech did not have sufficient
infrastructure to deal with the tapes and the job was frankly inappropriate
to expect the Caltech students to do for the entire collaboration. The system
completely crumbled as soon as the data quantity increased beyond just SM1.
2. We implemented a new system based on production of copies of exabyte tapes
at the Gran Sasso. Each institution was expected to provide its own tapes. The
production mechanism for these tapes was originally set up by Scott Nutter and
Sandra Parlati with software written by Sandra Parlati, partly based on
codes written by P. Campana and A. Baldini.
Copies of tapes were produced from zebra files living on disks
on the MACRO uVAXes on the vaxgs cluster. The system relied on a single 10-tape
stacker exabyte drive. Ersilia Giusti handled the day-to-day operations of
labelling tapes, inserting them into the drive, starting the copy program,
checking for any error messages and distributing the final tapes to the U.S.
institutions. If *anything* went wrong, Ersilia relied on Sandra Parlati to
understand the problem and get things back on track. Sandra was the person
responsible for checking and maintaining the automatic procedures for
copying files from the acquisition computers, running
the jobs to produce zebra files and DST's, getting the initial TA90 tape
copies (the archive medium) and generally keeping everything on track. Last
December, this system came crashing to a halt when Sandra Parlati was no longer
employed to do this job and increasing demands on Ersilia effectively reduced
her effort on this task to zero. I think everyone in the U.S. is aware of how
bad the situation has been for the last 9 months... effectively, we have had
no system. We have now completely lost the services of Ersilia (the laboratory
has decided that it cannot spare that time for MACRO). For the last several
months, a minimal tape copying effort has been in place (by A. Grillo) which
produces a single exabyte tape for the U.S. institutions as well as one for
Italian institutions.
This old system was relatively fragile for several reasons. First, the
exabyte tapes didn't always write properly, making it necessary to repeat
copies on some tapes. This greatly increased the size of the job to be done.
Requiring that all institutions provide high-quality tapes was a big help
with that. However, because the system was relatively slow, any breakdown of
the stacking drive would cause the copying to fall behind and it was very
difficult to catch up. In addition, any breakdown anywhere in the production
chain would result in a data backup that was always difficult to push through
the tape bottleneck. With the appropriate personnel, the system did work but
its capacity was never very high.
The Current Situation for Data Production and Tape Copying
----------------------------------------------------------
With the installation of the new WFD system, the quantity of data being
produced by MACRO has increased by roughly a factor of 4. Without any dramatic
changes, the quantity of data which we must handle now is about 10 GB/week.
This data started coming into what was already a badly broken system (at least
for the U.S. groups) and created a *bit* of commotion. Fortunately,
collaborators have been rallying to the cause and working towards a new system
which is capable of handling the larger quantity of data. The task is not yet
completed and we have important decisions in front of us as to how we will
ultimately handle this new situation. There are implications both in hardware
and personnel which must be considered. However, it was agreed by all that
the existing hardware and personnel *will* allow us to begin data acquisition
and storage with the WFD data under present (or nearly present) circumstances.
Current Hardware
----------------
The following is the list of hardware and steps in the current production
system, given in the order that the data flows. Things which are listed as
***planned*** are definitely in the works but are not yet actually in place.
Technical parameters of DLT tapes and drives are described below for those
(like myself) not already familiar with this technology. Note that the system
described here is brand new, not what has actually been happening for any
period of time.
1. 16 GB disk space on the data acquisition computers. (Before last week, there
was only 8 GB.) Hence, this is good for about 10 days of data.
2. ***planned*** DLT stacking drive on vxmacb so that backup copies of raw
data files can be made in the tunnel. This should eliminate loss of data
as has occurred a couple of times in the past due to mistakes in the
copy and production.
3. An optical link system between the computers in the tunnel and external
laboratory. There is a ***plan*** to get backup drivers for this system
since a failure would cause a data backlog from which it is hard to
recover. This is no different from having spare electronics modules or a
backup computer system. Data is copied from the tunnel to the external
laboratory as soon as space and computer resources permit.
4. In the external lab, there are two VAX workstations with 18 GB of disk
capacity dedicated to MACRO data production. Here, the raw data files are
converted to zebra files. Since the new data is dominated by WFD data, there
is only a small expansion in the size of the files due to this process
(a couple of percent). The data are stored on disk until a sufficient
amount is collected to write tapes. The processing time for the data at
this step is about 0.5 (clock) days per 10 GB (one week) of data.
Hence, considerable additional processing could be implemented at this
point with no new hardware investment.
5. Once 8 GB (including WFD data) of data is amassed, it is (currently) copied
onto both DLT tapes (1 tape x 4 copies) and exabyte tapes
(4 tapes x 4 copies). The quantity of data to copy is calculated to fill
each of the tapes to about 80% of capacity.
The DLT copies are currently made by writing data over ethernet to a single
7-tape stacking DLT drive installed on vaxgsN. The rate of data transfer
to the DLT tapes in this configuration is 25 minutes/GB, or 3.5 hours per
tape (containing about 5.5 days of data). In the future, it is
***planned*** (perhaps 2 months from now)
that a 7-tape stacking DLT drive, dedicated to MACRO, will be installed
directly onto the MACRO vaxstation in which case it is expected that the
rate of data writing will be 7 minutes/GB or 56 minutes per tape. In this
case, the existing 7-tape stacker will provide a backup. Furthermore, it is
***planned*** (6+ months from now) that a DLT robot with 3-6 drives and
200-400 tape capacity will be installed at LNGS which will be generally
available for LNGS users. It is ***planned*** that a reference version of
MACRO DLT tapes will be kept available here at all times, loadable by users
located anywhere. It is also ***planned*** to purchase two "table-top"
single DLT drives for general use. The 4 tapes which are
currently being produced are a back-up copy, an archive copy which will be
kept available in the robot, and two copies for Italian institutions who
want the full WFD data.
The exabyte copies are produced by writing onto the single 10-tape stacking
exabyte drive which is directly attached via SCSI to one of MACRO's vax
workstations. In order to make 4 copies of each tape, two separate tape
loads must be executed. It takes about 3 hours per exabyte
tape, or 24 hours for each 8-tape (2-copy) job, which means (on average,
considering weekends, etc.) about 3 days for each production of 4 copies for
every 5 days of data if everything goes well. (These rates are collected in
a short arithmetic sketch following this list.) This has not traditionally
been the case and therefore we agreed that we should try to gain some
experience with this in the next month. Making more copies than this will
almost certainly require new hardware investment (e.g., an additional
stacking drive and SCSI extender) at the Gran Sasso.
6. In addition to the data written to tape, muon DST files are produced and
stored on disk at the Gran Sasso. Many Italians use only these files for
their analyses and therefore do not depend on tape distribution. We do not
know how long this will continue to be the case. It seems clear that some
tape distribution of DST files of this sort will eventually be implemented.
At the moment, no organized "data split" exists (although the current
DST production is a clear example of such a split) which can supply the
general needs of all collaborators. We may wish to consider implementing an
organized split with appropriate distribution mechanisms.
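
For those who want to check the throughput claims in items 1, 4 and 5 above,
here is a minimal back-of-envelope sketch (in Python, chosen only for
readability) using just the rates quoted in those items. The script and its
variable names are mine, purely for illustration; this is not actual LNGS
production software.

    # Back-of-envelope throughput check using only the rates quoted above.
    DATA_RATE_GB_PER_WEEK = 10.0       # current MACRO output, including WFDs
    DAILY_GB = DATA_RATE_GB_PER_WEEK / 7.0

    # Item 1: how long the 16 GB acquisition disks can buffer data.
    print("acquisition buffer: %.0f days" % (16.0 / DAILY_GB))        # ~11 days

    # Item 5: one production batch is 8 GB (80% of a 10 GB DLT tape).
    BATCH_GB = 8.0
    print("one batch = %.1f days of data" % (BATCH_GB / DAILY_GB))    # ~5.6 days

    # DLT over ethernet (25 min/GB) vs. a local drive (7 min/GB).
    print("DLT tape, ethernet: %.1f h" % (25.0 * BATCH_GB / 60.0))    # ~3.3 h
    print("DLT tape, local: %.0f min" % (7.0 * BATCH_GB))             # 56 min

    # Exabyte: 3 h/tape, 4 tapes x 4 copies = 16 tapes in two 8-tape loads.
    print("exabyte copying: %.0f h" % (3.0 * 16))                     # 48 h

The 48 hours of exabyte writing needed for every 5-6 days of data is what
makes that path so sensitive to drive downtime.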
Robustness of the Data Production Chain
---------------------------------------
MACRO has suffered several times in the past due to breakdowns in the data
production chain. In order to alleviate this problem in the future, all
hardware components in the chain will have backup systems available so that we
do not lose capabilities for very long due to hardware failure. With a larger
amount of data being produced, breakdowns in the chain will become increasingly
dangerous and painful. Probably by the end of this year, all backup systems
will be available.
Current Personnel Involved with Data Production
-----------------------------------------------
The data production is primarily being handled by the LNGS MACRO group. The
people currently involved are:
Aurelio Grillo: Boss/planning/administration
Eugenio Scapparone: planning/oversight/handling mechanics of production
Sandra Parlati: (Currently hired for one year with 3/4 Italian salary and
1/4 U.S. salary.)
Implementation of a new system/oversight/planning/
testing of hardware/handling mechanics of production
Assistants: Two students are currently assisting with the mechanics of
production. It is ***planned*** that they will be replaced with
a temporary 50% data assistant hired by the laboratory within
a couple of months.
Technical Features of DLT Drives and Tapes
------------------------------------------
"DLT tapes", said Chris Walter.
"What?", said Doug Michael.
"What?", said Charlie Peck.
So went the conversation in mid-July when Chris returned to Caltech from the
Gran Sasso and told us about the "new plan" to deal with data. Now, you may not
be surprised to hear "What?" from Doug and Charlie but in fact Chris "On the
Computer Cutting Edge" Walter also said "What?" when he first heard about this.
(Actually, his laptop *is* a couple of months old now.) Here, then, is what I
have learned from my Italian colleagues and some transparencies from a talk
at CERN about DLT tapes:
Capacity: 10 GB (but 20 GB versions are coming on the market)
Data transfer speed: 7 minutes/GB for 10 GB versions (20 GB's will be faster)
(Writing over Ethernet at the Gran Sasso has been measured
to slow this to around 30 minutes/GB.)
Features: Cassette construction (similar to TA90's) which permits easy
"robot" handling.
High-reliability (reported to be better than TA90's)
Works reliably in start/stop mode (non-streaming)
Single drives, 5-7 tape stackers or robot libraries available
Originally produced by DEC, now also available from two independent
producers (Quantum and Storage-Works)
Available for DEC systems or using a standard SCSI interface for
UNIX and other systems.
Usage: It appears that DLT's are being slated for significant future use
at CERN.
DLT's are now being used for a reference copy by CLEO.
"DLT's are being supported by the INFN computer committee." A. Grillo
Our computer system manager at Caltech (Ching Shih) believes that
DLT's are in fact well on their way to being the next "industry
standard" for storage of quantities of data in the MACRO range.
He points out that there are now several manufacturers and that new
products are arriving on the market at a rapid pace.
Cost: (Lire prices are catalog or actual quotes for delivery in Italy.
Dollar prices are conversions based on 1600 Lire/dollar and divided
by 1.19 to remove VAT.)
10 GB DLT tape............................................70KL $37
"table top" single DLT drive with SCSI interface.........7.5ML $3.9K
rack-mounted " " " " " "...............6ML $3.2K
"table top" 5-tape stacker DLT drive with SCSI............13ML $6.8K
"cabinet" 7-tape stacker DLT drive (DEC)..................21ML $11.0K
rack-mounted " " " " "....................20ML $10.5K
The following are some quotes recently obtained by the HEP
computer system manager at Caltech. Note that prices vary due
to both manufacturer and features of drives (speed being the main
feature). Higher-density drives are backwards compatible with
lower-density tapes.
10 GB DLT tape............................................. $37
15 GB DLT tape............................................. $40
10 GB SCSI "table top" single DLT drive.(TTI).............. $4.0K
15 GB SCSI "table top" single DLT drive.(TTI).............. $4.7K
20 GB SCSI "table top" single DLT drive.(TTI).............. $6.4K
10 GB SCSI "table top" 7 tape stacker (TTI)................ $11.2K
20 GB SCSI "table top" 7 tape stacker (TTI)................ $13.9K
15/30 GB SCSI "table top" single DLT drive (Quantum)....... $4.7K
20/40 GB SCSI "table top" single DLT drive (Quantum)....... $6.6K
15/30 GB SCSI "table top" 5 tape stacker (Quantum)......... $9.1K
15/30 GB SCSI 7 tape stacker (Quantum)..................... $11.1K
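
As a cross-check on the two price lists, here is a minimal sketch of the
Lire-to-dollar conversion described above (1600 Lire/dollar, divided by 1.19
to remove VAT). The function name and the choice of sample items are mine,
for illustration only.

    # Reproduce the dollar figures above from the Italian catalog prices:
    # convert at 1600 Lire/dollar, then divide by 1.19 to remove VAT.
    LIRE_PER_DOLLAR = 1600.0
    VAT_FACTOR = 1.19

    def lire_to_dollars(lire):
        """Convert a VAT-inclusive Lire price to a VAT-free dollar price."""
        return lire / LIRE_PER_DOLLAR / VAT_FACTOR

    for name, lire in [("10 GB DLT tape", 70.0e3),
                       ("table-top single DLT drive", 7.5e6),
                       ("cabinet 7-tape stacker, DEC", 21.0e6)]:
        print("%-30s $%.0f" % (name, lire_to_dollars(lire)))
    # -> $37, $3939 (~$3.9K) and $11029 (~$11.0K), matching the list above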
Future Options for Tape Distribution to U.S. Institutions
---------------------------------------------------------
Currently, U.S. MACRO institutions rely on exabyte tapes, the
standard muon DST's, or self-produced DST's of various types for data
distribution. Usually, self-produced DST's are produced in jobs at the end
of each run or from other jobs which run on the data on disk at LNGS and then
sent via the network to the producer's home institution. As far as I know,
all of the U.S. groups currently rely on exabyte tapes for long-term storage,
even for data which they may have copied via the network. Exabyte
tapes seem to be the current medium of choice for system backups at most
computer installations.
Starting now, we are entering a new era in dealing with data in MACRO. The
WFD data definitely will require some kind of adjustment as to how we deal
with our data distribution. There are several possible options and perhaps
some hybrid of the basic possibilities may make the most sense.
Our best estimate of the amount of data which we will be producing, unless we
take some measures which could limit our physics capabilities, is 10 GB/week.
This is based on our current data acquisition including WFD's. Studies are
underway to understand how much this might be reduced without compromising
physics. However, I don't expect that there will be any significant decrease in
the amount of data which is actually read out. If all of this data is copied
onto tape, it corresponds to 260 exabyte tapes per year or 65 DLT tapes per
year (higher capacity exabyte and DLT tapes may in fact be available for a
factor of 2 reduction in the above numbers). The cost per bit for exabyte and
DLT appears to be similar so there is apparently no big difference in the tape
cost which is about $2000 per year per complete copy. I don't think this cost
will be devastating for any of us.
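
To make the tape-count arithmetic explicit, here is a minimal sketch. The
per-tape fills (2 GB on exabyte, 8 GB on DLT, i.e. about 80% of capacity)
are inferred from the counts quoted above, and the $37 DLT tape price is
taken from the earlier cost list.

    # Annual tape counts implied by 10 GB/week, using ~80% tape fills.
    ANNUAL_GB = 10.0 * 52                  # ~520 GB/year

    EXABYTE_FILL_GB = 2.0                  # inferred from 260 tapes/year
    DLT_FILL_GB = 8.0                      # 80% of a 10 GB DLT tape

    print("exabyte tapes/year: %.0f" % (ANNUAL_GB / EXABYTE_FILL_GB))  # 260
    print("DLT tapes/year: %.0f" % (ANNUAL_GB / DLT_FILL_GB))          # 65

    # Media cost of one complete DLT copy at $37/tape.
    print("DLT media/year: $%.0f" % (37.0 * ANNUAL_GB / DLT_FILL_GB))
    # -> ~$2400/year, in line with the rough $2000/year figure above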
To me, the more serious concern is handling the data for analysis. At Caltech
(at least) we do not currently have the infrastructure to handle several
hundred to more than a thousand exabyte tapes for an analysis of a few years of
data. I expect that this is true of all of the U.S. institutions. Hence, we
will need some method of dealing with this data which will probably involve
both new hardware purchases and changes in the structure of data distribution.
Several scenarios that I can imagine follow:
1. Since most of us already have considerable investment in exabyte, we could
plan to continue to use exabyte tapes for distribution. In order to do this,
we could either:
A. Upgrade our tape-handling capacity both at the Gran Sasso and our
home institutions. At the Gran Sasso we would *at least* need one
and probably two new stacking exabyte drives plus a SCSI extender to
allow the additional drives. At our home institutions, we would
need stacking drives or many independent drives to permit automated
processing through many tapes at once. Finally, the mechanics of
dealing with this number of tapes will require some considerable
labor, currently not clearly available for the long term. The trend at
the laboratory is less rather than more logistical support for MACRO.
B. Process the data considerably prior to export from the Gran Sasso.
The full data would remain on the archival and reference data tapes
at the Gran Sasso and be accessible remotely at all times. Data exported
on exabyte tapes would be in one or more DST formats.
C. A mix of the above where institutions willing to invest money in the
hardware necessary to handle the tapes could receive full copies while
those unwilling to do so could receive the appropriate DST tapes.
Each of these options requires manpower to set up, and since option C requires
both jobs A and B to be done, it requires the most work. Depending on the
preferences of the various U.S. institutions, insistence on the flexibility of
option C will require clear commitment of manpower to accomplish the required
setup.
2. Since we need to make some investment (hardware, manpower or both) to deal
with the new data, making the investment in the direction of DLT tapes rather
than exabyte may be attractive. The laboratory is making a clear move in this
direction as are at least a couple of our Italian collaborating institutions.
I would like to research more thoroughly how widespread this technology
currently is and the trend towards (or away from) it. If U.S. groups decide to
use this technology, no equipment investment at the Gran Sasso will be required
beyond what the Italians already plan and the labor for tape handling both at
the Gran Sasso and home institutions will be reduced considerably. Whether or
not we have to pay our share of a data technician for dealing with the tapes,
it will certainly cost less if there are fewer to handle. If the job is
sufficiently small, we may not be asked to make a contribution at all.
DLT drives are somewhat more expensive than exabyte drives,
but the cost for stacking systems is very comparable (presumably the stacker
is what drives the cost). If DLT's are an expanding technology (and that is
the picture I have of it now) the price could easily drop to being comparable
to exabyte in a short time. Perhaps exabyte costs could also drop, but I expect
this would only happen either to try to extend a dying technology or because
it gets picked up as the backup medium of choice for the GB+ PC hard-disk
market. Given reports that I have heard of new multi-hundred-MB floppy disks
with relatively fast access times and $250 drives, I doubt that will happen.
It seems plausible to me that the investment required for U.S. institutions
to analyze the full data at their home institutions will be comparable between
DLT's and exabytes. If that is the case, DLT's would seem to be the more
sensible solution.
In the case that U.S. institutions adopt DLT's as the distribution medium for
full data, it could still be possible to provide reduced data sets on exabyte
tapes. The simplest approach would be to just strip WFD data off for writing
to exabyte. Alternatives which require manpower to setup could include some
processed version of WFD data (time and charge for instance) which is written
onto the exabyte versions. Even for this solution, new hardware investment
will be required at the Gran Sasso since the current system has no backup
and has been breaking semi-regularly.
3. Other technologies... all of these would require new hardware and manpower
investment. I think that we had better be pretty certain that these have
clear advantages prior to choosing to go in one of these directions. A possible
"sufficient advantage" could be that the chosen medium appears to be the
"commercial standard" of the future rather than DLT's or that we already
have sufficient across-the-board investment in these in the U.S.
institutions that our total new hardware investment could be relatively
small.
A. High-density exabyte: It is my understanding that units which hold up
   to 10 GB per tape are available. Our system manager believes that, due to
   their single-manufacturer internals, these are unlikely to remain
   competitive with DLT... other opinions? Even though some of us may already
   have some of these drives available at home, we would need to invest in new
   drives at the Gran Sasso. My understanding is that a 10 GB exabyte drive
   is about a factor of three slower than a 10 GB DLT but is also currently
   about a factor of 1.5-2 less expensive for the drive.
B. 4 mm DAT (Perhaps not a crazy possibility, but what capacity already
   exists? None at the Gran Sasso. U.S. institutions?)
C. Writable optical disks (Still prohibitively expensive?)
D. other?
Suggested Decision Path
-----------------------
We need to make some important decisions about data distribution to the
U.S. institutions. Any solution is going to involve investment in both hardware
and manpower. We currently are operating under a temporary condition which
clearly cannot be sustained as *the* long-term approach. The faster we come
to a decision about how to manage the data, the better off we will be. Each
institution needs to answer (at least) the following questions so we can
understand what the spectrum of interests and needs will be. (Answers will be
made available to the other U.S. collaborators.)
1. Do you want a full copy (all waveform data) of the data? (Note that it
will not come for free under any circumstance.)
If you answer "no" to question 1:
2a: Are you satisfied with a data copy with the WFD data simply stripped
off?
3a: If you want pre-processing on WFD data, what kind of processing do you
    want? (Remember that full WFD's will always be available at LNGS.)
4a: Are you interested in some kind of data split which would be performed
    at the Gran Sasso, such that you receive only the split data in which
    you are interested? If so, what kind of split data do you want?
If you answer "yes" to question 1:
4b: Are you also interested in some kind of data split which would be
    performed at the Gran Sasso, such that you receive the split data in
    which you are interested? If so, what kind of split data do you want?
    (Such a split may include processed WFD data as well.)
------
5. What technology option(s) do you prefer for handling the data which you
   want to receive?
6. A sophisticated data split or pre-processed WFD data at the Gran Sasso is
   going to take considerable time and effort to set up. If you wish to
   see this kind of solution, what manpower contribution can you foresee from
   your institution to help do this work?