Last update: 20-May-2006 CY.
First version: 12-Jan-2003 CY.
The GTT environment software provides the multi-threaded framework within which GTT algorithms run and supports histogram and statistics monitoring. This note is not complete and describes only the most immediate concepts, but it should be useful for understanding the environment from the programmer's viewpoint. The code has benefited from the ideas highlighted in the single-threaded version which it replaces.
The contents are:
Text rendition is used to highlight items which have special significance:
Overview - tasks, peers and credits
A GTT environment is a multi-threaded program which receives event data from different component tracking detectors (on global first level trigger accepts), processes the data to find tracks, vertices, etc., and sends the results to the second level trigger. At a later time an accept or reject trigger decision is received from the global second level trigger, and the environment sends the associated event data and trigger result banks to the event builder (on accept) or sends nothing (on reject), after which the event is marked for deletion.
The environment interacts with two source and two sink network peers:
As more than one environment runs concurrently, it is necessary to implement strict rules defining which event is sent to which environment by the data sources. This has been implemented using an ordered list of credits. Each environment is assigned, on run startup, a unique credit (token) which is sent to the fromgslt task when the environment is available for processing an event. The fromgslt task immediately broadcasts the received credit to the data sources so that these hold an ordered list of the credits (environments) available. On receiving data from an event, the credit at the head of the list is removed and the data sent to the associated environment.
Environments process events one at a time: the credit is retransmitted to the fromgslt process only when a new event can be handled.
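The ordered credit list held by each data source can be pictured as a simple first-in, first-out queue. The following is a minimal sketch under stated assumptions, not the environment's actual code: the type CreditList, the functions credit_list_init, credit_list_push and credit_list_pop, and the limit MAX_ENVIRONMENTS are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENVIRONMENTS 16   /* hypothetical upper limit on environments */

/* Ordered list of credits held by a data source (hypothetical layout). */
typedef struct {
    int    credit[MAX_ENVIRONMENTS];
    size_t head;    /* index of the oldest credit */
    size_t count;   /* number of credits currently queued */
} CreditList;

void credit_list_init(CreditList *list)
{
    assert(list != NULL);
    list->head = 0;
    list->count = 0;
}

/* A credit broadcast has arrived: append it at the tail of the list.
 * Returns 0 on success, -1 if the list is full. */
int credit_list_push(CreditList *list, int credit)
{
    assert(list != NULL);
    if (list->count == MAX_ENVIRONMENTS)
        return -1;
    list->credit[(list->head + list->count) % MAX_ENVIRONMENTS] = credit;
    list->count++;
    return 0;
}

/* Event data has arrived: remove the credit at the head of the list.
 * The returned credit selects the environment which receives the event.
 * Returns -1 if no environment is currently available. */
int credit_list_pop(CreditList *list)
{
    int credit;

    assert(list != NULL);
    if (list->count == 0)
        return -1;
    credit = list->credit[list->head];
    list->head = (list->head + 1) % MAX_ENVIRONMENTS;
    list->count--;
    return credit;
}
```

Because an environment returns its credit only when it can handle a new event, a credit can never be queued twice at the same time, so a fixed-size ring buffer of this kind would be sufficient.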
The GTT environment contains the following threads:
Note that all threads are started once during the lifetime of the environment; they are not started for each new event.
The pseudocode fragment below illustrates main's event handling loop.
/*
* Event loop
*/
while (1)
{
/*
* Algorithm, GetData threads are already waiting for the next event
*/
if (alg_mask != 0x0)
{
gttStartAlgo(alg_mask);
/*
* Wait for first header to define event start time, essential before timed wait
* for algorithm results !
*/
gttOrWait(src_mask, &result);
/*
* Timed wait on algorithms, remove credit if timeout (crash, etc.)
*/
if (gttAndWaitTimed(res_mask, 15000, 0x0) < 0) {
gttError("Algorithms not finished !");
Crashed = 1;
KillGttCredit(result.Gflt);
pause();
}
}
else
{
gttPrint("DummyAlgo called");
DummyAlgo(&result);
}
/*
* Stop the watchdog
*/
gttStopWatchDog();
/*
* Send SLT result to GSLT - this may already have been done by the watchdog
* SendAlgoResult then waits for completion of all algorithms and sources before
* creating and filling the gtt client buffer
*/
SendAlgoResult();
/*
* The GFLT number of the event is required for sending the credit.
* IT MUST BE extracted before clearing the current event.
*/
gttAndWait(GET_DATA_MVD0|GET_DATA_MVD1|GET_DATA_MVD2, &result);
/*
* Clear the current event by waiting for all data source and algorithm threads
*/
gttClearGetData();
/*
* Send the credit, must wait for a data source
*/
SendGttCredit(result.Gflt);
}
The principal features of the event loop are that a single event is processed at each passage through the loop and that the GetData and Algo threads process data from only one event before flagging completion and waiting to be re-triggered (cleared).
At the beginning of the loop the GetData threads are already trying to read event data. The Algo threads are then started and must wait for the data they require. This is accomplished by synchronizing to the completion of event readout of the GetData threads as explained in the next section. The completion of all the calculations is synchronized in a similar way and breaks Main's wait.
Watchdog operation is then cancelled to prevent a later time-limit expiration GTT trigger decision being sent to the GSLT.
The GTT trigger decision derived from the Algo results, provided a time-limit decision had not previously been sent, is then sent to the GSLT.
The current event is then cleared by: waiting for all GetData and Algo threads to complete, re-triggering GetData threads and enabling Algo starts.
The GTT processing credit is then sent.
Note that the GSLTresult thread is completely asynchronous to the event loop.
Do not move SendGttCredit in front of gttClearGetData, as this will block the following event if a GetData or Algo thread from the current event cannot complete. The time-limit will not remedy this situation as it is specific to the current event. It is better to lose a processing environment than to block the GSLT, although the GSLT will still be blocked if none of the environments return their credit.
Synchronization mechanisms available are either functional (function calls which wait for completion of a thread's event operation) or callbacks (functions called when a thread has reached a given stage). These are described below.
OR and AND wait calls are provided for synchronizing to the completion of GetData and Algo event threads. The corresponding pseudocode is:
{
GTTWaitResult result;
gttAndWait (GET_DATA_MVD0|GET_DATA_MVD1|GET_DATA_MVD2|GET_DATA_CTD, &result);
printf ("returned from gttAndWait all MVD and CTD data for next event available\n");
gttOrWait (BARREL_ALGO|FORWARD_ALGO, &result);
printf ("returned from gttOrWait one or more Algorithms completed\n");
}
Threads to be waited on are identified by thread specific enumeration tags (GET_DATA_MVD0, GET_DATA_MVD1, GET_DATA_MVD2, GET_DATA_CTD, GET_DATA_STT, BARREL_ALGO, and FORWARD_ALGO) OR'd into a request mask.
To prevent indefinite blocking both functions internally AND the supplied request mask with the mask of configured GetData and Algo threads before performing the wait.
The functions return an integer containing the mask of event thread tags which satisfied the request. The return will be zero, and the function returns immediately, if the input mask was zero or contained no configured GetData or Algo threads. The AND function may return a mask not equal to the request due to the internal AND operation.
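The mask rules above can be written down as pure functions. This is only a sketch of the stated semantics: the tag values and the names effective_request, wait_returns_immediately, and_wait_satisfied and or_wait_satisfied are hypothetical, and the real functions block until the condition holds rather than testing it once.

```c
/* Hypothetical single-bit thread tags; the real values are defined by the
 * environment headers and may differ. */
enum {
    TAG_MVD0 = 0x01, TAG_MVD1 = 0x02, TAG_MVD2 = 0x04,
    TAG_CTD  = 0x08, TAG_STT  = 0x10,
    TAG_BARREL = 0x20, TAG_FORWARD = 0x40
};

/* The request is first reduced to the threads which are configured. */
unsigned effective_request(unsigned request, unsigned configured)
{
    return request & configured;
}

/* An empty effective request returns at once with a zero mask. */
int wait_returns_immediately(unsigned request, unsigned configured)
{
    return effective_request(request, configured) == 0u;
}

/* AND wait: satisfied only when every requested, configured thread is done.
 * Returns the satisfying mask, or 0 while the wait would still block. */
unsigned and_wait_satisfied(unsigned request, unsigned configured, unsigned done)
{
    unsigned wanted = effective_request(request, configured);
    return ((done & wanted) == wanted) ? wanted : 0u;
}

/* OR wait: satisfied as soon as any requested, configured thread is done.
 * Returns the threads which satisfied the request, or 0 while blocking. */
unsigned or_wait_satisfied(unsigned request, unsigned configured, unsigned done)
{
    return done & effective_request(request, configured);
}
```

For example, with the STT thread not configured, an AND request for CTD and STT is satisfied by CTD alone, which is why the returned mask may differ from the request.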
The result structure contains GetData or GetAlgo data corresponding to the lowest thread (see next section) which satisfied the wait.
This section describes the callback functions available and illustrates their use with pseudocode.
The FirstHeader callback is called, once per event, by the first GetData thread to read an event message header, which is the earliest time at which a new event is known about. The FirstHeader function might be used to enable the WatchDog, define the initial value of global variables, etc., e.g.
void FirstHeader (GTTThread *Worker)
{
// printf ("FirstHeader called for Source %x,\n",Worker->Source);
first_src = Worker->Source;
// this is one place to put it, but maybe somewhere else is better
gttStartWatchDog ();
}
The ProcessData callback is called, more than once per event, by each GetData thread when it completes reading data. The callback can be used to perform additional data processing, algorithm preparation pointer assignment, etc., e.g.
void ProcessData (GTTThread *Worker)
{
// printf ("ProcessData called for Source %x\n",Worker->Source);
// byte swap as data is network byte ordered on LynxOS XXX changes if Intel sourced XXX
apb_ntohl(Worker->Buffer);
// assign data pointers for CTD and other data source specific actions
switch (Worker->Source)
{
case GET_DATA_MVD0:
pd_mvd0 = Worker->Buffer;
mvd_bank_ptr[0] = (bank_def *)Worker->Buffer;
break;
case GET_DATA_MVD1:
pd_mvd1 = Worker->Buffer;
mvd_bank_ptr[1] = (bank_def *)Worker->Buffer;
break;
case GET_DATA_MVD2:
pd_mvd2 = Worker->Buffer;
mvd_bank_ptr[2] = (bank_def *)Worker->Buffer;
break;
case GET_DATA_CTD:
pd_ctd = Worker->Buffer;
break;
default:
break;
}
}
Barrel and forward algorithm callbacks
The algorithm routines BarrelAlgo and ForwardAlgo are callbacks, called once per event, into which the corresponding calculation code should be placed. This hides the implementation of the Algo thread from the algorithm calculation. This callback needs to call a synchronization function to ensure that the input data is complete, eg.
void *BarrelAlgo (void)
{
GTTWaitResult result;
void *Pointer = 0x0;
gttAndWait (GET_DATA_MVD0|GET_DATA_MVD1|GET_DATA_MVD2|GET_DATA_CTD, &result);
// printf ("returned from BarrelAlgo gttAndWait event %d\n", result.Gflt);
// printf("store start time of algorithm\n");
gttMarkTimeValue (GTT_VALUE|BARREL_VALUE|START_TIME_VALUE);
// printf ("Assign BarrelAlgo client buffer Gflt %d\n", result.Gflt);
if ((pd_dec=gttApbBufferPointer(barrel_cli, result.Gflt, &dsrc)) != 0x0) {
*pd_dec=0x8;
*(pd_dec+1)=GTT_ID;
Pointer = (void *)pd_dec;
} else {
printf("apb_find_buffer returned NULL (for the GSLT dec)\n");
ExitHandler(GTT_ERR_APB_ERR);
}
AlgoDecode ();
AlgoSetup ();
AlgoRithum (result.Gflt);
AlgoPlot ();
AlgoClear ();
// printf("store stop time of algorithm\n");
gttMarkTimeValue (GTT_VALUE|BARREL_VALUE|STOP_TIME_VALUE);
return Pointer;
}
void *ForwardAlgo (void)
{
GTTWaitResult result;
void *Pointer = 0x0;
int *pd;
gttAndWait (GET_DATA_MVD0|GET_DATA_MVD1|GET_DATA_MVD2|GET_DATA_STT, &result);
// printf ("returned from ForwardAlgo gttAndWait event %d\n",result.Gflt);
// printf("store start time of algorithm\n");
gttMarkTimeValue (GTT_VALUE|FORWARD_VALUE|START_TIME_VALUE);
// printf ("Assign ForwardAlgo client buffer Gflt %d\n", result.Gflt);
if ((pd=gttApbBufferPointer(forward_cli, result.Gflt, &dsrc)) != 0x0) {
*pd=0x8;
*(pd+1)=GTT_ID;
Pointer = (void *)pd;
} else {
printf("apb_find_buffer returned NULL (for the GSLT dec)\n");
ExitHandler(GTT_ERR_APB_ERR);
}
// printf("store stop time of algorithm\n");
gttMarkTimeValue (GTT_VALUE|FORWARD_VALUE|STOP_TIME_VALUE);
return Pointer;
}
The ClearEvent callback function can be used to perform algorithm operations such as histogramming which can be deferred until after the trigger decision is sent. ClearEvent is called, once per event, from gttClearGetData after synchronizing to all GetData and Algo threads but before clearing the event buffer list and re-triggering and enabling GetData and Algo threads. It is guaranteed that the GetData event data and synchronization thread result structures, including the Algo return pointer data, are valid.
void ClearEvent (void)
{
int Value0, Value1;
printf ("ClearEvent called\n");
gttGetValue(GTT_VALUE|FIRST_TIME_VALUE, &Value0);
if (gttGetValue(GTT_VALUE|TIME_LIMIT_VALUE, &Value1)) plot_fill (11, (float)(Value1 - Value0)/1000., 0., 1.);
if (gttGetValue(GTT_VALUE|ALGO_TIME_VALUE, &Value1)) plot_fill (12, (float)(Value1 - Value0)/1000., 0., 1.);
if (gttGetValue(GTT_VALUE|RESULT_TIME_VALUE, &Value1)) plot_fill (13, (float)(Value1 - Value0)/1000., 0., 1.);
if (gttGetValue(GTT_VALUE|TOTAL_TIME_VALUE, &Value1)) plot_fill (14, (float)(Value1 - Value0)/1000., 0., 1.);
}
Gslt accept and reject callbacks
The GsltAccept and GsltReject callback functions are called once per event, not synchronously with the current event, when the GSLT trigger result is received. Which function is called depends on whether the trigger result is accept or reject. The information available within these functions is limited to:
void GsltAccept(int Gflt)
{
if (pd_gslt_barrel != 0x0)
{
// AlgoSLTPlot(); // must unpack the result pointer data
}
if (pd_gslt_forward != 0x0)
{
}
}
void GsltReject(int Gflt)
{
if (pd_gslt_barrel != 0x0)
{
// AlgoSLTPlot(); // must unpack the result pointer data
}
if (pd_gslt_forward != 0x0)
{
}
}
Passing results between threads
This section describes how results from GetData and Algo threads event processing are made available to other threads.
As results are available only when the thread is synchronizable (current event processing is complete) it is natural to provide access using the synchronization functions described above. On completion these functions fill out the result structure pointed to by the second parameter (if not NULL). The structure is currently defined to be:
struct GTTWaitResult {
int Source; /* Tag of thread eg. GET_DATA_MVD0, BARREL_ALGO, etc. */
int Count; /* Thread sequential allocation number */
int Gflt; /* GFLT number of event */
int Mode;
int Type;
int Time;
int User1;
int User2;
int Bytes;
void *Pointer; /* Pointer to Algorithm result structure */
};
typedef struct GTTWaitResult GTTWaitResult;
If the function request mask specifies more than one thread which is synchronizable, the results from the thread with the lowest Source number are returned. Note that if the function returns zero then Source is set to 0.
The algorithm result Pointer is returned by the algorithm callback function. Note that the Algo threads fill only the Source, Gflt and Pointer values; all others are GetData specific.
The environment uses a number of global variables, which are defined in the following table.
| Name | Type | Content | Scope | Write thread |
|---|---|---|---|---|
| src_mask | int | mask of active data thread identifiers | always | Main |
| algo_mask | int | mask of active algo thread identifiers | always | Main |
| first_src | int | mask of first data thread ready | FirstHeader - ClearEvent | Data |
| ready_src | int | mask of data threads ready | FirstHeader - ClearEvent | Data |
| cutoff_src | int | mask of data threads with cutoff | FirstHeader - ClearEvent | Data |
| Flag3_barrel | int | Flag3 barrel bits for GTBEVT | FirstHeader - ClearEvent | Barrel |
| Flag3_result | int | Flag3 result bits for GTSB and GTBEVT | FirstHeader - ClearEvent | Main or Timer |
| Crashed | int | Algorithms did not finish | always | Main |
Care should be taken when using globals. The content of a global used within a single thread is simple to understand: its value changes as in a conventional single-threaded program. The content of globals shared between threads is difficult to define as a function of time, and usually some form of synchronization (mutex, callback, thread state known not to change the variable, etc.) is required. The Flag3_result global is filled when data is sent to the GTT by the main thread, gttSendAlgoResult, or when the timer expires, gttSendTimeLimitResult, and a mutex is used to ensure that it is filled on a first-come basis.
A simple way of defining the initial per-event content of a global is to set its value in the FirstHeader callback.
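The first-come rule for a shared global can be sketched with a POSIX mutex and a "filled" marker. This is an illustration only, assuming nothing about how the environment really protects Flag3_result: the names flag3_fill_once, flag3_reset and flag3_read, and the variables behind them, are hypothetical.

```c
#include <pthread.h>

static pthread_mutex_t flag3_lock = PTHREAD_MUTEX_INITIALIZER;
static int flag3_value;    /* stand-in for a shared global such as Flag3_result */
static int flag3_filled;   /* non-zero once a thread has filled it for this event */

/* Fill the shared word unless another thread got there first.
 * Returns 1 if this call stored the value, 0 if it arrived too late. */
int flag3_fill_once(int bits)
{
    int stored = 0;

    pthread_mutex_lock(&flag3_lock);
    if (!flag3_filled) {
        flag3_value = bits;
        flag3_filled = 1;
        stored = 1;
    }
    pthread_mutex_unlock(&flag3_lock);
    return stored;
}

/* Re-arm for the next event, for instance from a per-event
 * initialisation point such as a first-header callback. */
void flag3_reset(void)
{
    pthread_mutex_lock(&flag3_lock);
    flag3_value = 0;
    flag3_filled = 0;
    pthread_mutex_unlock(&flag3_lock);
}

/* Read the current value under the same lock. */
int flag3_read(void)
{
    int value;

    pthread_mutex_lock(&flag3_lock);
    value = flag3_value;
    pthread_mutex_unlock(&flag3_lock);
    return value;
}
```

Whichever of the two writers (main or timer) calls flag3_fill_once first wins, and the later call leaves the stored value untouched.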
The purpose of the time-limit is to enforce a defined maximum time delay on sending the trigger result, and was introduced to reduce the effect of rare pathological events whose algorithm processing time is large on the trigger latency.
The time-limit is implemented by the WatchDog thread, which is enabled in the FirstHeader callback and cancelled after the algorithm processing is finished. If the time-limit expires before cancellation, a pass trigger decision is sent as described in the next section.
During processing algorithms should periodically test the status of the time-limit
if (gttAbortAlgo() != 0x0)
{
printf ("Algorithm processing ending as Abort flag set\n");
}
and stop processing the event quickly creating the result banks
with flags showing that processing was terminated prematurely - if this was the
case.
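A periodic test of this kind might be placed inside the main loop of an algorithm, as in the sketch below. It is self-contained for illustration: abort_requested is a stand-in for gttAbortAlgo driven by a plain variable, and simulate_time_limit, AlgoSummary, process_items and ABORT_CHECK_INTERVAL are hypothetical names, not part of the environment.

```c
#include <stddef.h>

/* Stand-in for gttAbortAlgo(): non-zero once the time-limit has expired.
 * Here it is driven by a plain variable so that the sketch runs alone. */
static int abort_flag;

int abort_requested(void)
{
    return abort_flag;
}

/* Hypothetical helper used only to exercise the sketch. */
void simulate_time_limit(int expired)
{
    abort_flag = expired;
}

typedef struct {
    size_t processed;   /* work items completed */
    int    truncated;   /* non-zero if processing was ended prematurely */
} AlgoSummary;

#define ABORT_CHECK_INTERVAL 64   /* hypothetical polling granularity */

/* Process n_items work items, testing the abort flag every
 * ABORT_CHECK_INTERVAL items and stopping early if it is set. */
AlgoSummary process_items(size_t n_items)
{
    AlgoSummary summary = {0, 0};
    size_t i;

    for (i = 0; i < n_items; i++) {
        if (i % ABORT_CHECK_INTERVAL == 0 && abort_requested() != 0) {
            summary.truncated = 1;   /* flag to be copied into the result bank */
            break;
        }
        /* per-item work would go here */
        summary.processed++;
    }
    return summary;
}
```

The polling interval is a trade-off: testing every item costs time in the common case, while testing too rarely delays the exit after the time-limit has expired.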
GetData threads do not test the status of the time-limit. If data for an event does not arrive the test of event data ready inside GSLTresult will fail as described in the receiving GSLT trigger decision and error handling sections.
On expiration of the time-limit the Watchdog thread makes the callback, at most once per event,
SendGttResult (0, &result);
to send the result banks described in the next section.
The trigger result sent to the GSLT can be initiated by time-limit expiration (WatchDog) or the completion of algorithm processing (Main) by calling a thread specific jacket routine (SendTimeLimitResult and SendAlgoResult, respectively). These routines perform any synchronization/definition work required before sending the result to the GSLT via SendGttResult. Note that SendGttResult sends a result only once per event.
The result sent to the GSLT always consists of the full set of trigger result banks, currently: GTSB (barrel) and GTSF (forward). Always sending the maximum set of banks simplifies their handling at the GSLT and is useful at the GTT since it is less likely that the maximum 256 word result payload be exceeded. If an algorithm is disabled or finishes after the time-limit the bank contains flags identifying what happened - banks with this content are called no-shows. If the dummy algorithm function is called for non EVT trigger type no-show banks are sent. What banks are sent when is listed below:
The contents of the GTSB and GTSF banks are listed below.
GTSB
= ( NVert = INTE : 'number of vertices',
PrVert(3) = REAL : 'primary vertex position x,y,z',
PVertWdth(3) = REAL : 'primary vertex width x,y,z',
Ntrax = INTE : 'number of tracks',
NAxtrax = INTE : 'number of axial-only tracks',
NVtxtrax = INTE : 'number of tracks used in vertex',
Nwts = INTE : 'number of z weights',
Nvtxwts = INTE : 'number of z weights on the vertex',
NCTDtrax = INTE : 'number of tracks without MVD hits',
Flag1 = INTE : 'barrel algorithm bits',
Flag2 = INTE : 'background bits',
Flag3 = INTE : 'environment bits, time-limit etc.',
Flag4 = INTE : 'spare bits',
PT(2) = REAL : 'pt of highest two pt tracks')
STATIC
:'Summary of barrel algorithm processing';
GTSF
= (
)
:'Summary of forward algorithm processing';
GTSB and GTSF are not defined in the online DDL.
Receiving GSLT trigger decisions
On receiving a GSLT trigger decision the following actions are performed by GSLTresult:
Sending data to the Event Builder
The GsltResult thread, currently gttGsltTrigger(), handles the GSLT trigger decision, sending any result banks to the EVB.
MVD cluster data, read by the GET_DATA_MVD0, GET_DATA_MVD1 and GET_DATA_MVD2 GetData threads, Algorithm results pointed to by result.Pointer, and gtt environment information are sent to the EVB on GSLT accept decisions.
The gtt environment is the last data to be allocated and filled. Delaying the allocation and protecting it with a mutex prevents GSLTresult sending the current event before it is complete.
The environment sends a single bank of integer Index-Value pairs stored in a table which is automatically cleared by gttClearGetData after writing the contents to the gtt environment data buffer. Values are added to the bank by:
// store the time offset value (GET_DATA_MVD2 read started w.r.t. clearing the event) in micro-secs with the specified Index
gttMarkTimeValue (GTT_VALUE|GET_DATA_MVD2|START_TIME_VALUE);
// store the GET_DATA_MVD2 GetData bytes read value with the specified index
gttAddIntValue (GTT_VALUE|GET_DATA_MVD2|BYTES, result.Bytes);
The values are retrieved and operated on in the ClearEvent callback and used
in filling diagnostic histograms:
//printf ("ClearEvent called\n");
// use time of arrival of first header as the start value
gttGetValue(GTT_VALUE|FIRST_TIME_VALUE, &Value0);
// plot the time delay before sending the trigger result
if (gttGetValue(GTT_VALUE|RESULT_TIME_VALUE, &Value1)) plot_fill (11, (float)(Value1 - Value0)/1000., 0., 1.);
The GTT_VALUE bit mask specifies that this Index is set by the environment.
When the gtt environment bank is created only indices with the GTT_VALUE set are used.
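The Index-Value mechanism can be pictured as a small table with an owner bit in each index. The sketch below is an assumption-laden illustration, not the environment's implementation: ValueTable, table_clear, table_add, table_get, table_to_bank and the OWNER_GTT and OWNER_BARREL bit values are hypothetical, and overwriting a repeated index is a choice made here for simplicity.

```c
#include <stddef.h>

#define OWNER_GTT    0x10000   /* hypothetical stand-in for GTT_VALUE */
#define OWNER_BARREL 0x20000   /* hypothetical stand-in for BARREL_VALUE */
#define TABLE_SIZE   64        /* hypothetical capacity */

typedef struct { int index; int value; } IndexValue;

typedef struct {
    IndexValue entry[TABLE_SIZE];
    size_t     used;
} ValueTable;

/* Empty the table, as would be done when the event is cleared. */
void table_clear(ValueTable *t)
{
    t->used = 0;
}

/* Store a value under an index, overwriting an existing entry.
 * Returns 0 on success, -1 if the table is full. */
int table_add(ValueTable *t, int index, int value)
{
    size_t i;

    for (i = 0; i < t->used; i++) {
        if (t->entry[i].index == index) {
            t->entry[i].value = value;
            return 0;
        }
    }
    if (t->used == TABLE_SIZE)
        return -1;
    t->entry[t->used].index = index;
    t->entry[t->used].value = value;
    t->used++;
    return 0;
}

/* Returns 1 and fills *value if the index is present, otherwise 0. */
int table_get(const ValueTable *t, int index, int *value)
{
    size_t i;

    for (i = 0; i < t->used; i++) {
        if (t->entry[i].index == index) {
            *value = t->entry[i].value;
            return 1;
        }
    }
    return 0;
}

/* Copy the pairs whose index carries the owner bit into a flat bank of
 * (index, value) words. Returns the number of words written. */
size_t table_to_bank(const ValueTable *t, int owner, int *bank, size_t max_words)
{
    size_t i, n = 0;

    for (i = 0; i < t->used && n + 2 <= max_words; i++) {
        if ((t->entry[i].index & owner) != 0) {
            bank[n++] = t->entry[i].index;
            bank[n++] = t->entry[i].value;
        }
    }
    return n;
}
```

Filtering on the owner bit when the bank is built is what allows environment and algorithm values to share one table while ending up in separate banks.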
If algorithms want to use this mechanism they should use the BARREL_VALUE and FORWARD_VALUE bit masks when storing values and append their banks at the end of their result.Pointer data with a valid name (BANK_ID), i.e.
*pd = 0x8;
*(pd+1) = GTT_ID;
*pd += gttBankValue (BANK_ID, BARREL_VALUE, &pd[2]);
The list of currently used GTT_VALUE indices is tabulated below (all times are micro-seconds since last clear).
| Index | Value units | Purpose |
|---|---|---|
| GTT_VALUE\|FIRST_TIME_VALUE | micro-secs | read first header time |
| GTT_VALUE\|RESULT_TIME_VALUE | micro-secs | send trigger time |
| GTT_VALUE\|TIME_LIMIT_VALUE | micro-secs | time-limit expired |
| GTT_VALUE\|ALGO_TIME_VALUE | micro-secs | algorithm (all) result time |
| GTT_VALUE\|TOTAL_TIME_VALUE | micro-secs | ClearEvent entry time |
| GTT_VALUE\|Worker->Source\|START_TIME_VALUE | micro-secs | GetData header arrival or Algo called |
| GTT_VALUE\|Worker->Source\|STOP_TIME_VALUE | micro-secs | GetData data ready or Algo returned |
| GTT_VALUE\|Worker->Source\|DATA_BYTES | integer | GetData data bytes read |
| GTT_VALUE\|Worker->Source\|DATA_BYTES_PRECUT | integer | GetData precut data bytes |
| GTT_VALUE\|EVENT_NUMBER | integer | Event number |
| GTT_VALUE\|CREDIT_NUMBER | integer | Credit number |
| GTT_VALUE\|Worker->Source\|HOST_ADDR | integer | GetData IP address |
| GTT_VALUE\|Worker->Source\|HOST_PORT | integer | GetData IP port |
| GTT_VALUE\|ALGO_RESULT | integer | Algo trigger result |
| GTT_VALUE\|TIME_RESULT | integer | Time-limit trigger result |
This section describes how errors associated with event and trigger processing are handled. It does not describe the error handling performed within the Algorithms.
GTT environments process events one-at-a-time, which is guaranteed by the credit handling rules. If a GetData thread receives data with a different event number than that received by another GetData thread then the thread prints an error message to stderr and exits. This error condition is fatal and cannot be recovered.
If the processing of an event does not finish because a GetData or algorithm thread does not complete (due to a crash, a receive data protocol error, etc.) the second level trigger is not blocked because a time-limit trigger result will be sent by the Watchdog thread. A subsequent block of the MVD/GTT DAQ due to fromgslt not receiving the credit is prevented by the event processing loop of the Main thread timing out after 15 seconds whilst waiting for the algorithms to finish. This timeout causes the credit to be sent to the fromgslt process, but marked so that it is not rebroadcast to the GetData threads, which removes the environment from the GTT system - the environment's event handling pauses until the end of the run. The global second level trigger decision handling for the crashed event is performed, but no data or result banks are sent to the event-builder - only an empty GTENV bank is sent.
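A timed wait of the kind used by the Main thread can be built on pthread_cond_timedwait. The sketch below shows the general technique only and is not the body of gttAndWaitTimed: the names mark_done, clear_done and and_wait_timed and the variables behind them are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cond = PTHREAD_COND_INITIALIZER;
static unsigned        done_mask;   /* bits set by workers as they complete */

/* Called by a worker thread when it has finished the current event. */
void mark_done(unsigned tag)
{
    pthread_mutex_lock(&done_lock);
    done_mask |= tag;
    pthread_cond_broadcast(&done_cond);
    pthread_mutex_unlock(&done_lock);
}

/* Forget all completions, as when the current event is cleared. */
void clear_done(void)
{
    pthread_mutex_lock(&done_lock);
    done_mask = 0u;
    pthread_mutex_unlock(&done_lock);
}

/* Wait until every bit in 'wanted' is set or timeout_ms has elapsed.
 * Returns 0 on success and -1 on timeout. */
int and_wait_timed(unsigned wanted, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* Default condition variables measure the deadline on CLOCK_REALTIME. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&done_lock);
    /* Loop to absorb spurious wake-ups; stop on timeout or any error. */
    while ((done_mask & wanted) != wanted && rc == 0)
        rc = pthread_cond_timedwait(&done_cond, &done_lock, &deadline);
    rc = ((done_mask & wanted) == wanted) ? 0 : -1;
    pthread_mutex_unlock(&done_lock);
    return rc;
}
```

On a negative return the caller would take the recovery path described above: mark the credit as dead, then pause event handling until the end of the run.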
Signal handling or how to kill the environment
The GTT environment process is always started by a daemon process (MVD run control or simulation), which allocates a unique process group identifier (pgrp) equal to the pid of the first thread started. The daemon uses the pgrp identifier when it kills the process at the end of a run, which ensures that all user-thread processes are signalled.
If the pid, ppid and pgrp of the environment are displayed using ps:
$ ps -eo pid,ppid,pgrp,pmem,pcpu,user,cmd --sort pgrp
PID PPID PGRP %MEM %CPU USER CMD
1990 1884 1990 6.0 0.0 youngman gtt_env.exe
2001 1990 1990 6.0 0.0 youngman gtt_env.exe
2002 2001 1990 6.0 0.0 youngman gtt_env.exe
2003 2001 1990 6.0 0.0 youngman gtt_env.exe
2004 2001 1990 6.0 0.0 youngman gtt_env.exe
2005 2001 1990 6.0 0.0 youngman gtt_env.exe
2006 2001 1990 6.0 0.0 youngman gtt_env.exe
2007 2001 1990 6.0 0.0 youngman gtt_env.exe
2008 2001 1990 6.0 0.0 youngman gtt_env.exe
2009 2001 1990 6.0 0.0 youngman gtt_env.exe
2010 2001 1990 6.0 0.0 youngman gtt_env.exe
The correct way for a user process to kill (SIGTERM=15) the above environment is:
kill -15 -1990
All environment process threads block all unix signals except the shutdown thread which waits on sigwait until a signal arrives after which the shutdown thread carries out all operations required (e.g. buffer status printout, etc.) before calling exit(caught signal#). All non shutdown threads call pthread_exit() if they detect an error condition which prevents further operation. Note that all threaded operations (blocking, pthread_exit, synchronization, etc.) are performed by the environment. Algorithm threads should contain no thread function calls.
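The pattern of blocking signals everywhere and collecting them in one thread with sigwait can be sketched as follows. This is a generic POSIX illustration under stated assumptions, not the environment's shutdown code: block_all_signals and wait_for_shutdown_signal are hypothetical names, and waiting on only SIGTERM, SIGINT and SIGHUP is a choice made for the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Block every blockable signal in the calling thread. Threads created
 * afterwards inherit this mask, so call it before starting any worker. */
int block_all_signals(void)
{
    sigset_t all;

    sigfillset(&all);
    return pthread_sigmask(SIG_BLOCK, &all, NULL);
}

/* Body of a shutdown thread: sleep in sigwait until a termination
 * signal arrives. Returns the signal number, or -1 on error. */
int wait_for_shutdown_signal(void)
{
    sigset_t wanted;
    int signo = 0;

    sigemptyset(&wanted);
    sigaddset(&wanted, SIGTERM);
    sigaddset(&wanted, SIGINT);
    sigaddset(&wanted, SIGHUP);
    if (sigwait(&wanted, &signo) != 0)
        return -1;
    return signo;
}
```

After wait_for_shutdown_signal returns, the shutdown thread would carry out its cleanup and call exit with the caught signal number, as described above. Because the signals stay blocked in every other thread, no worker is ever interrupted by a handler.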
In the unlikely event that the environment has been modified incorrectly and all threads do not die cleanly, when signalled, the MVD run control executes an environment killer script which SIGKILL's (-9) any environment threads still alive after 15 seconds.
The BarrelAlgo authors use gettimeofday() timer information to investigate the performance of the algorithm. The gettimeofday function is almost certainly thread-safe, but the gttGetTime function definitely is:
struct timeval Time;
gttGetTime (&Time);
The event data buffer functions of Alessandro are used, protected by mutexes
which prevent the dsrc structure being changed during use.
Thread monitoring histograms are booked in the PlotBook function and filled in ClearEvent. Two standard layouts are provided gtt_data_source_summary.lay (summarizes GetData data sizes and data ready delays w.r.t. FirstHeader) and gtt_timing_summary.lay (Send trigger result and algorithm ready delays w.r.t. FirstHeader).
Reentrancy of plot filling is an issue. If all plots being filled are in a single directory then the filling is reentrant. It is not reentrant if the directory is being changed by different threads.
Rules to be observed when calling plotting functions:
Plots are forwarded from shared memory using standard plot_server.exe jobs.
Statistics concerning the environment and algorithm are contained in integer and real number arrays; the definitions of the array contents are contained in gtt_codes.h. Sending statistics allows run summary and online monitoring to be performed.
The current statistics definition is:
enum GTT_ENV_STAT_I {
GTT_ENV_IN = 0, // Events where processing started
GTT_ENV_OUT, // Events where processing finished
GTT_ENV_EXIT, // Exit status
GTT_ENV_GFLT, // GFLT number of last event processed
GTT_ENV_SOURCES, // Bit mask of data sources in run
GTT_ENV_RECEIVED, // Bit mask of data sources processed
GTT_ENV_MODE, // Mode of operation
GTT_ENV_PRESCALE, // Prescale factor used
GTT_ENV_CREDIT, // Credit number
GTT_ENV_TIME, // Events with time-limit trigger result
GTT_ENV_ALGO, // Events with algo trigger result
GTT_ENV_I_LEN // length of array (this must be the last entry of the enum)
};
typedef enum GTT_ENV_STAT_I GTT_ENV_STAT_I;
enum GTT_ENV_STAT_F {
GTT_ENV_COURSE_MEAN_LAT = 0, // mean processing latency (ms)
GTT_ENV_COURSE_MAX_LAT, // max processing latency (ms)
GTT_ENV_COURSE_CUR_LAT, // processing latency of last event (ms)
GTT_ENV_CTD_MEAN_SIZE, // mean size CTD data (Bytes)
GTT_ENV_CTD_MAX_SIZE, // max size CTD data (Bytes)
GTT_ENV_CTD_CUR_SIZE, // CTD data size of last event (Bytes)
GTT_ENV_FMVD_MEAN_SIZE, // as above but for FMVD
GTT_ENV_FMVD_MAX_SIZE, // as above but for FMVD
GTT_ENV_FMVD_CUR_SIZE, // as above but for FMVD
GTT_ENV_BMVD_MEAN_SIZE, // as above but for BMVD
GTT_ENV_BMVD_MAX_SIZE, // as above but for BMVD
GTT_ENV_BMVD_CUR_SIZE, // as above but for BMVD
GTT_ENV_STT_MEAN_SIZE, // as above but for STT
GTT_ENV_STT_MAX_SIZE, // as above but for STT
GTT_ENV_STT_CUR_SIZE, // as above but for STT
GTT_ENV_F_LEN // length of array (this must be the last entry of the enum)
};
typedef enum GTT_ENV_STAT_F GTT_ENV_STAT_F;
Statistics are forwarded from shared memory using standard gtt_stat_server.exe jobs.
Thread configuration and startup is performed in StartJob and is sized according to which data sources are expected. Line order and position (at the end) should not be changed.
The time-limit period (in milli-seconds) and function to call are defined in the gttSetWatchDog function call. If the function name is replaced by 0x0 (NULL) the call is not made on time-limit expiration although other time-limit dependent features like gttAbortAlgo behave correctly. The observed expiration period cannot be set lower than 10ms and setting larger values results in observed periods at the next 10ms boundary (setting 40ms gives 50ms, etc.).
gttInitThread (10,GetData);
gttSetBuffer (&dsrc);
if (n_mvd > 0)
gttOpenThread (mvd_sck[0],GET_DATA_MVD0,(void *)gttGetData,(void *)ProcessData);
if (n_mvd > 1)
gttOpenThread (mvd_sck[1],GET_DATA_MVD1,(void *)gttGetData,(void *)ProcessData);
if (n_mvd > 2)
gttOpenThread (mvd_sck[2],GET_DATA_MVD2,(void *)gttGetData,(void *)ProcessData);
if (ctd.nr != 0x0)
gttOpenThread (ctd_sck,GET_DATA_CTD,(void *)gttGetData,(void *)ProcessData);
if (stt.nr != 0x0)
gttOpenThread (stt_sck,GET_DATA_STT,(void *)gttGetData,(void *)ProcessData);
gttOpenThread (gslt_gtt_res_sck,GSLT_RESULT,(void *)gttGsltTrigger,NULL);
gttOpenThread (0,WATCHDOG,(void *)gttWatchDog,NULL);
gttSetWatchDog (30, (void *)SendTimeLimitResult); // send time-limit result after Nms
gttSetFirstHeader ((void *)FirstHeader); // call FirstHeader() on receiving first data header message
gttSetClearEvent ((void *)ClearEvent); // call ClearEvent() on gttClearGetData
if ((ctd.nr != 0x0) && ((GTT_MODE_ALG&gtt_mode) != 0x0) && ((GTT_MODE_NOBARREL&gtt_mode) == 0x0) && (n_mvd == 3)) {
gttTell ("BARREL_ALGO enabled");
AlgoInit();
gttOpenThread (0,BARREL_ALGO,(void *)gttCallAlgo,BarrelAlgo);
}
if ((stt.nr != 0x0) && ((GTT_MODE_ALG&gtt_mode) != 0x0) && ((GTT_MODE_NOFORWARD&gtt_mode) == 0x0)) {
gttTell ("FORWARD_ALGO enabled");
FAlgoInit();
gttOpenThread (0,FORWARD_ALGO,(void *)gttCallAlgo,ForwardAlgo);
}
gttTell ("Thread statistics");
gttPrintThread ();
The environment can be configured to dump events, in xdr format, from the event data associated with the various threads. The following files were filled with data from the GetData threads.
mvddaq@bgtt10:~> ls -l /events/run
total 592
-rw-rw-rw- 1 mvddaq users 176200 2003-12-08 15:29 ctd.xdr
-rw-rw-rw- 1 mvddaq users 268664 2003-12-08 15:29 mvd0.xdr
-rw-rw-rw- 1 mvddaq users 102904 2003-12-08 15:29 mvd1.xdr
-rw-rw-rw- 1 mvddaq users 47496 2003-12-08 15:29 mvd2.xdr
By using a directory on the host bgtt machine unnecessary nfs-network activity is avoided.
Dump activity is steered by:
export GTTDATADUMPS="5 10 100 1 99 /events/run/"
dumps
A simple script has been written to allow management of farm environments. This should be run from the mvddaq account on mvddaq using the syntax:
manager.script username blank-separated-host-list pathname-of-script/executable
At login a number of host lists are defined as environmentals:
| Environmental | Scope |
|---|---|
| SSHNODES | all linux MVD online nodes |
| RSHNODES | all VME MVD online nodes |
| ENVNODES | all GTT farm nodes running environments |
| GTTNODES | all GTT farm nodes including non environment control node |
| DAQNODES | all DAQ VME MVD nodes |
As an example the script can be used to show the date on all the GTT nodes:
manager.script username "$ENVNODES" "date"
or check the status of daemons by running a user script on two nodes:
manager.script mvddaq "bgtt01 bgtt11" "$BINDIR/check_daemons.script"
ssh-target bgtt01
This host bgtt01
Restart script: /mvd_code/vers/mvd_suse/bin/start_bgtt01.script
Daemon PID 19573
ssh-target bgtt11
This host bgtt11
Restart script: /mvd_code/vers/mvd_suse/bin/start_bgtt11.script
Daemon PID 884
The environmental $BINDIR is defined by the login script to point to the standard MVD executable directory. Currently defined user scripts are:
| Filename | Action |
|---|---|
| check_daemons.script | print pid of daemons and check start_NODE.script presence |
| stop_daemons.script | stop daemons and their children |
| start_daemons.script | start daemons using start_NODE.script |
| restart_daemons.script | stops and starts daemons |
| delete_empty_event_dirs.script | deletes empty event dump directories |
| summary_event_dirs.script | summarize event dump directories |
| clean_event_dirs.script N | deletes event dump dirs older than N days |
| append_event_dirs.script R1 R2 Append/ | appends event dump
directory files into append directory. The user must ensure that the same number of files are appended to, otherwise and event#/data mismatch will occur. manager.script mvddaq "$ENVNODES" "$BINDIR/append_event_dirs.script 48741 48743 /mvd_code/vers/mvd_suse/log/append" |
| append_checked_dirs.script R1 R2 Append/ 'list of source xdr files' | appends event dump directory files into the append directory if the number of events in each file is the same and greater than 0. Example: manager.script mvddaq "$ENVNODES" "$BINDIR/append_checked_dirs.script 56318 56331 /mvd_code/vers/mvd_suse/log/append 'ctd.xdr ctdz.xdr mvd0.xdr mvd1.xdr mvd2.xdr'" |
Example of using append_checked_dirs.script:
mvddaq@mvddaq:/mvd_code/vers/mvd_suse/log/append> manager.script mvddaq "$ENVNODES" "$BINDIR/append_checked_dirs.script 56318 56331 /mvd_code/vers/mvd_suse/log/append 'ctd.xdr ctdz.xdr mvd0.xdr mvd1.xdr mvd2.xdr'"
...
ssh-target bgtt02
This host bgtt02
Target /events/56318 ctd.xdr[] ctdz.xdr[] mvd0.xdr[2] 2 != - skip
Target /events/56321 ctd.xdr[2] ctdz.xdr[2] mvd0.xdr[2] mvd1.xdr[2] mvd2.xdr[2]
Target /events/56322 ctd.xdr[33] ctdz.xdr[33] mvd0.xdr[33] mvd1.xdr[33] mvd2.xdr[33]
Target /events/56323 ctd.xdr[20] ctdz.xdr[20] mvd0.xdr[20] mvd1.xdr[20] mvd2.xdr[20]
Target /events/56324 ctd.xdr[21] ctdz.xdr[21] mvd0.xdr[21] mvd1.xdr[21] mvd2.xdr[21]
Target /events/56325 ctd.xdr[75] ctdz.xdr[75] mvd0.xdr[75] mvd1.xdr[75] mvd2.xdr[75]
Target /events/56328 ctd.xdr[] ctdz.xdr[] mvd0.xdr[2] 2 != - skip
Target /events/56329 ctd.xdr[] ctdz.xdr[] mvd0.xdr[2] 2 != - skip
Target /events/56330 ctd.xdr[20] ctdz.xdr[20] mvd0.xdr[20] mvd1.xdr[20] mvd2.xdr[20]
Target /events/56331 ctd.xdr[102] ctdz.xdr[102] mvd0.xdr[102] mvd1.xdr[102] mvd2.xdr[102]
Directory count present/missing 7/4.
Total 35/273 files/events appended.
ssh-target bgtt03
...
This host bgtt12
Target /events/56318 ctd.xdr[] ctdz.xdr[] mvd0.xdr[2] 2 != - skip
Target /events/56321 ctd.xdr[1] ctdz.xdr[1] mvd0.xdr[1] mvd1.xdr[1] mvd2.xdr[1]
Target /events/56322 ctd.xdr[33] ctdz.xdr[33] mvd0.xdr[33] mvd1.xdr[33] mvd2.xdr[33]
Target /events/56323 ctd.xdr[20] ctdz.xdr[20] mvd0.xdr[20] mvd1.xdr[20] mvd2.xdr[20]
Target /events/56324 ctd.xdr[21] ctdz.xdr[21] mvd0.xdr[21] mvd1.xdr[21] mvd2.xdr[21]
Target /events/56325 ctd.xdr[75] ctdz.xdr[75] mvd0.xdr[75] mvd1.xdr[75] mvd2.xdr[75]
Target /events/56328 ctd.xdr[] ctdz.xdr[] mvd0.xdr[2] 2 != - skip
Target /events/56329 ctd.xdr[] ctdz.xdr[] mvd0.xdr[2] 2 != - skip
Target /events/56330 ctd.xdr[20] ctdz.xdr[20] mvd0.xdr[20] mvd1.xdr[20] mvd2.xdr[20]
Target /events/56331 ctd.xdr[102] ctdz.xdr[102] mvd0.xdr[102] mvd1.xdr[102] mvd2.xdr[102]
Directory count present/missing 7/4.
Total 35/272 files/events appended.
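The rule applied by append_checked_dirs.script, as described in the table above, is that a run directory is appended only when every listed file holds the same number of events and that number is greater than 0. The sketch below illustrates just that comparison, under the assumption that the per-file event counts have already been obtained by some other means. The helper name counts_match is hypothetical and is not taken from the real script.

```shell
# Hypothetical helper: succeed (exit status 0) only when all counts passed
# as arguments are identical and greater than 0. An empty string stands for
# a file whose count is missing, as in the "ctd.xdr[]" entries above.
counts_match() {
    first=$1
    # Reject a missing, non-numeric or zero first count.
    [ -n "$first" ] && [ "$first" -gt 0 ] 2>/dev/null || return 1
    # Every remaining count must equal the first one.
    for n in "$@"; do
        [ "$n" = "$first" ] || return 1
    done
    return 0
}

counts_match 33 33 33 33 33 && echo "append"   # like /events/56322
counts_match "" "" 2        || echo "skip"     # like /events/56318
```

Checking the counts before appending avoids the event#/data mismatch that append_event_dirs.script leaves to the user to prevent.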
Using TI to look at banks offline (here GTENV) requires access to a raw (or other) data file:
ssh youngman@zenith207
ls -l /osm/pnfs/zeus/z/raw/04/D040524/
/zeus/bin/dccp /osm/pnfs/zeus/z/raw/04/D040524/RAW.D040524.T192942.R049809A tmp.tmp
/zeus/bin/ti
TI> gaf/open i name=tmp.tmp,filfor=EXCH
TI> n i (5-6 times to skip BOR and TST trigger events)
TI> acc i
TI> tab/li (list all tables and their lengths)
TI> option/set/format GTENV Index i10 (%10d integer format)
TI> option/set/format GTENV Value i10
TI> tab/pri all GTENV
|-------------------------------------------|
| Table: GTENV ADAMO/TAP |
| Count: 49 |
| Page ( 1, 1) |
| Printed along: ID [MINC,MAXC] |
|-------------------------------------------|
|ID |Index >>Hex<< |Value >>Unit<<|
|----|---------- |---------- |
| 1| 269418496 0x100f0000| 21 us | TIME_RESULT
| 2| 268501248 0x10010100| 62 us | START_TIME_VALUE FORWARD_ALGO
| 3| 268501120 0x10010080| 81 us | START_TIME_VALUE BARREL_ALGO
| 4| 268500996 0x10010004| 32537 us | START_TIME_VALUE GET_DATA_MVD2
| 5| 269221892 0x100c0004|********** 32 bit | HOST_ADDR GET_DATA_MVD2
| 6| 269287428 0x100d0004| 40547 int | HOST_PORT GET_DATA_MVD2
| 7| 268632064 0x10030000| 32537 us | FIRST_TIME_VALUE
| 8| 269090816 0x100a0000| 261413 int | EVENT_NUMBER
| 9| 269156352 0x100b0000| 6 int | CREDIT_NUMBER
| 10| 268959748 0x10080004| 704 int | DATA_BYTES GET_DATA_MVD2 'wheel data bytes received'
| 11| 268447748 0x10003004| 704 int | DATA_BYTES_PRECUT GET_DATA_MVD2 'wheel data bytes precut'
| 12| 268566532 0x10020004| 32725 us | STOP_TIME_VALUE GET_DATA_MVD2
| 13| 268500994 0x10010002| 32738 us | START_TIME_VALUE GET_DATA_MVD1
| 14| 269221890 0x100c0002|********** 32 bit | HOST_ADDR GET_DATA_MVD1
| 15| 269287426 0x100d0002| 40546 int | HOST_PORT GET_DATA_MVD1
| 16| 268959746 0x10080002| 1024 int | DATA_BYTES GET_DATA_MVD1 'lower barrel data bytes received'
| 17| 268447746 0x10003002| 1072 int | DATA_BYTES_PRECUT GET_DATA_MVD1 'lower barrel data bytes precut'
| 18| 268500993 0x10010001| 32776 us | START_TIME_VALUE GET_DATA_MVD0
| 19| 269221889 0x100c0001|********** 32 bit | HOST_ADDR GET_DATA_MVD0
| 20| 269287425 0x100d0001| 40545 int | HOST_PORT GET_DATA_MVD0
| 21| 268959745 0x10080001| 112 int | DATA_BYTES GET_DATA_MVD0 'upper barrel data bytes received'
| 22| 268447745 0x10003001| 2992 int | DATA_BYTES_PRECUT GET_DATA_MVD0 'upper barrel data bytes precut'
| 23| 268566529 0x10020001| 32791 us | STOP_TIME_VALUE GET_DATA_MVD0
| 24| 268566530 0x10020002| 33038 us | STOP_TIME_VALUE GET_DATA_MVD1
| 25| 268501008 0x10010010| 36815 us | START_TIME_VALUE GET_DATA_STT
| 26| 269221904 0x100c0010|********** 32 bit | HOST_ADDR GET_DATA_STT
| 27| 269287440 0x100d0010| 40544 int | HOST_PORT GET_DATA_STT
| 28| 268959760 0x10080010| 1736 int | DATA_BYTES GET_DATA_STT 'STT data bytes received'
| 29| 268447760 0x10003010| 1624 int | DATA_BYTES_PRECUT GET_DATA_STT 'STT data bytes precut'
| 30| 268566544 0x10020010| 37046 us | STOP_TIME_VALUE GET_DATA_STT
| 31| 269549568 0x10110000| 37061 us | FORWARD_VALUE START_TIME_VALUE
| 32| 269615104 0x10120000| 37210 us | FORWARD_VALUE STOP_TIME_VALUE
| 33| 268566784 0x10020100| 37211 us | STOP_TIME_VALUE FORWARD_ALGO
| 34| 268501000 0x10010008| 38740 us | START_TIME_VALUE GET_DATA_CTD
| 35| 269221896 0x100c0008|********** 32 bit | HOST_ADDR GET_DATA_CTD
| 36| 269287432 0x100d0008| 40543 int | HOST_PORT GET_DATA_CTD
| 37| 268959752 0x10080008| 648 int | DATA_BYTES GET_DATA_CTD 'CTD data bytes received'
| 38| 268447752 0x10003008| 536 int | DATA_BYTES_PRECUT GET_DATA_CTD 'CTD data bytes precut'
| 39| 268566536 0x10020008| 38843 us | STOP_TIME_VALUE GET_DATA_CTD
| 40| 285278208 0x11010000| 38856 us | BARREL_VALUE START_TIME_VALUE
| 41| 285347840 0x11021000| 38888 us | BARREL_VALUE STOP_TIME_VALUE DECODED
| 42| 285351936 0x11022000| 38895 us | BARREL_VALUE STOP_TIME_VALUE PREPROCESS
| 43| 285343744 0x11020000| 38980 us | BARREL_VALUE STOP_TIME_VALUE
| 44| 268566656 0x10020080| 38981 us | STOP_TIME_VALUE BARREL_ALGO
| 45| 269025280 0x10090000| 39002 us | ALGO_TIME_VALUE
| 46| 268894208 0x10070000| 39010 us | RESULT_TIME_VALUE
| 47| 269352960 0x100e0000| 1 32 bit | ALGO_RESULT
0x102E0000| background word | BACKGROUND_REAL (floating point !!)
0x104E0000| zvertex word | ZVERTEX_REAL (floating point !!)
0x105E0000| PTsum word | PTSUM_REAL (floating point !!)
| 48| 270401536 0x101e0000| 3161090 32 bit | BARREL_FLAG3
| 49| 269352961 0x100e0001| 3161090 32 bit | ENV_FLAG3
|-------------------------------------------|
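The Index column in the listing is printed as a decimal integer (because of the i10 format) next to its hex form. When only the decimal value is to hand, it can be converted back with a small helper. This is a sketch only: the function names are hypothetical, and the split into two 16-bit halves is merely a reading aid inferred from the listing, not a statement about the official bit layout of the index word.

```shell
# Print a decimal GTENV index in the hex form shown in the >>Hex<< column.
index_to_hex() {
    printf '0x%08x\n' "$1"
}

# Split an index into its upper and lower 16-bit halves for easier reading.
# ASSUMPTION: the 16/16 split is only a viewing aid inferred from the listing.
index_halves() {
    printf '0x%04x 0x%04x\n' $(( $1 >> 16 )) $(( $1 & 0xFFFF ))
}

index_to_hex 268500996    # row 4 (START_TIME_VALUE GET_DATA_MVD2): 0x10010004
index_halves 268500996    # 0x1001 0x0004
```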