Unable to run interactive drunc session with example config from np04daq account #269

Open
bieryAtFnal opened this issue Oct 16, 2024 · 4 comments

Comments

@bieryAtFnal
Contributor

When I try to do this, for example on np04-srv-003, I see errors like ApplicationLookupUnsuccessful: Could not resolve the URI for 'root-controller_control' in the connectivity service, got response []

Here are instructions for reproducing the tests that I ran...

# as user np04daq...

DATE_PREFIX=`date '+%d%b'`
TIME_SUFFIX=`date '+%H%M'`

source /cvmfs/dunedaq.opensciencegrid.org/setup_dunedaq.sh
setup_dbt latest_v5
dbt-create -n NFD_DEV_241016_A9 ${DATE_PREFIX}FDDev_${TIME_SUFFIX}
cd ${DATE_PREFIX}FDDev_${TIME_SUFFIX}/sourcecode

git clone https://github.com/DUNE-DAQ/daqsystemtest.git -b plasorak/no-thread-pinning
cd ..

dbt-workarea-env
dbt-build -j 20
dbt-workarea-env

mkdir rundir
cd rundir

source ~/bin/web_proxy.sh -u

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config boot wait 5 conf wait 3 start 101 enable-triggers wait 10 disable-triggers drain-dataflow stop-trigger-sources stop scrap terminate

drunc-unified-shell ssh-standalone config/daqsystemtest/example-configs.data.xml local-1x1-config boot wait 5 conf wait 3 start 102 enable-triggers wait 10 disable-triggers drain-dataflow stop-trigger-sources stop scrap terminate
@plasorak
Collaborator

Starting drunc with the log level set to debug yields error messages like:

                      STDOUT:
                    bash: line 1: /nfs/home/np04daq/NFD_DEV_241016_A9_plasorak/log_np04daq_local-1x1-config_hsi-01.log: cannot overwrite existing file


                      STDERR:
                    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
                    @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
                    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
                    IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
                    Someone could be eavesdropping on you right now (man-in-the-middle attack)!
                    It is also possible that a host key has just been changed.
                    The fingerprint for the ED25519 key sent by the remote host is
                    SHA256:MfI8N1nKxOfCgh1s2XcPpSvKblPN7V+WB5NbrWi8Afk.
                    Please contact your system administrator.
                    Add correct host key in /nfs/home/np04daq/.ssh/known_hosts to get rid of this message.
                    Offending ED25519 key in /nfs/home/np04daq/.ssh/known_hosts:133

I'm afraid this isn't a problem with the run control.

@bieryAtFnal
Contributor Author

I haven't been able to observe the HOST IDENTIFICATION HAS CHANGED message, but I have seen the following:

                      STDOUT:
                    bash: line 1: /nfs/home/np04daq/.biery/dunedaq/16OctFDDev_2222/rundir/log_np04daq_local-1x1-config_df-01.log: cannot overwrite existing file


                      STDERR:
                    Address ::1 maps to localhost, but this does not map back to the address.
                    Address ::1 maps to localhost, but this does not map back to the address.
                    Connection to localhost closed.

The presence of the "cannot overwrite existing file" message in both of our screen captures got me wondering whether removing existing log files from the current working directory would help. It did!

I get reliable operation if I delete the log files between runs of drunc-unified-shell.
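
The cleanup step could be scripted along the following lines. This is only a minimal sketch: the function name `clean_drunc_logs` is hypothetical, and the `log_*.log` pattern is an assumption inferred from the file names in the error messages above.

```shell
# Hypothetical helper: remove stale per-application log files from a run
# directory so that a shell with noclobber set can create them again.
clean_drunc_logs() {
    rundir="${1:-.}"
    # Pattern assumed from names like log_np04daq_local-1x1-config_df-01.log.
    # Only the top level of the directory is searched; each removed file is printed.
    find "$rundir" -maxdepth 1 -type f -name 'log_*.log' -print -delete
}
```

Calling `clean_drunc_logs .` from rundir before each drunc-unified-shell invocation would then take the place of deleting the files by hand.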

It might be interesting to temporarily remove the "set -o noclobber" line from the np04daq account .bashrc to see if that helps, but I'm not willing to try that without coordinating with other people.
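
For anyone unfamiliar with noclobber, the snippet below reproduces the behaviour in isolation. It is a sketch that only touches a temporary directory; the file name `log_example.log` is made up for illustration and is not a drunc file.

```shell
# Show how noclobber turns a plain '>' onto an existing file into an error.
demo_dir=$(mktemp -d)
logfile="$demo_dir/log_example.log"   # hypothetical file name

set -o noclobber
echo "first run" > "$logfile"         # succeeds: the file does not exist yet

# A second plain '>' is refused because the file now exists.
# The subshell keeps the shell's error message out of the demo output.
if ! (echo "second run" > "$logfile") 2>/dev/null; then
    echo "plain '>' refused to overwrite $logfile"
fi

echo "second run" >| "$logfile"       # '>|' explicitly overrides noclobber
cat "$logfile"

set +o noclobber                      # restore the default behaviour
```

The second plain `>` is the case that matches the "cannot overwrite existing file" message above, which is consistent with the log redirection failing whenever a log file of the same name is left over from an earlier run.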

I tried running an fddaq-v4.4.8 system from the np04daq account on np04-srv-003, and I did not have a problem with a second set of log files overwriting the first. Not sure what nanorc would have been doing differently...

@plasorak
Collaborator

A way to fix the issue is to add --no-override-logs to the boot command. That will generate logs that are timestamped.
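
Applied to the reproduction command from the original report, that might look roughly like the sketch below. Placing `--no-override-logs` immediately after `boot` is an assumption based on the comment above and should be checked against the help output of the boot command in your drunc version. The snippet only assembles and prints the command line; it does not run drunc.

```shell
# Sketch only: the position of the flag directly after 'boot' is an assumption.
boot_opts="--no-override-logs"
config="config/daqsystemtest/example-configs.data.xml"
session="local-1x1-config"

drunc_cmd="drunc-unified-shell ssh-standalone ${config} ${session} boot ${boot_opts} wait 5 conf wait 3 start 101 enable-triggers wait 10 disable-triggers drain-dataflow stop-trigger-sources stop scrap terminate"

# Print the assembled command so it can be inspected before running it by hand.
echo "$drunc_cmd"
```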

@plasorak
Collaborator

> I tried running an fddaq-v4.4.8 system from the np04daq account on np04-srv-003, and I did not have a problem with a second set of log files overwriting the first. Not sure what nanorc would have been doing differently...

It depends on whether you are using nanorc or nano04rc. I would expect to see the same behaviour with the former, but with the latter, logs go to /logs and get a timestamp (which is another thing we need to fix in drunc).


2 participants