Looping Shell Script Recipe

The following procedure creates and runs a script that repeatedly executes a command, appends its output to a file, and sleeps between iterations:

  1. If SELinux is in use, the script must be placed in an allowed location. First, check the SELinux status:
    sestatus | grep "Current mode"
    Example output:
    $ sestatus | grep "Current mode"
    Current mode:                   enforcing
    1. If the sestatus command does not exist or Current mode is set to anything other than enforcing, then you can place the new script anywhere; for example:
      cd /tmp/
    2. If SELinux is set to enforcing, then place the new script into an allowed directory (this may require root permission); for example, a commonly used directory is:
      cd /usr/local/bin/
  2. Create a file named diagscript.sh with the following contents. Replace the command after the CHANGEME comment and, if necessary, change the sleep time and the output location (/tmp/ by default):
    #!/bin/sh
    # Exit if a command fails (note: this also ends the loop if the looped command returns non-zero)
    set -e
    # Change output directory if needed
    outputfile="/tmp/diag_$(hostname)_$(date +"%Y%m%d_%H%M%S").txt"
    while true; do
      echo "[$(date)] Executing iteration" >> "${outputfile}" 2>&1
    
      # CHANGEME: Change the `ps -elfyww` command to your desired command (but make sure to still write to ${outputfile}):
      ps -elfyww >> "${outputfile}" 2>&1
    
      # Change the sleep time (in seconds) as needed:
      sleep 30
    done
  3. Manually confirm the looped command exists and works (as some distributions support different flags). Using the example command above:
    ps -elfyww
  4. Make the script executable:
    chmod +x diagscript.sh
  5. If SELinux is set to enforcing, change the security context of the script based on a well-known executable; for example:
    chcon --reference=/usr/bin/cat diagscript.sh
  6. Check if you're running in a systemd environment:
    systemctl status
    1. If systemctl status succeeds (running in a systemd environment), start the script with systemd-run so that it's not interrupted if the user that started the script logs out:
      systemd-run ./diagscript.sh
      Example output:
      Running as unit: run-r973b3f9d949b4d60a6b9a5fc842b84d1.service; invocation ID: 800d52ff3cb541659df667804ae7b149
      Confirm the script is running by checking systemctl status for the dynamically generated unit name; it should show active (running). For example:
      # systemctl status run-r973b3f9d949b4d60a6b9a5fc842b84d1.service
         [...]  
         Active: active (running) since Tue 2025-01-14 08:41:28 CST; 54s ago
         Main PID: 2686 (diagscript.sh)
    2. If systemctl does not exist or systemctl status fails (not running in a systemd environment), start the script with nohup so that it's not interrupted if the user that started the script logs out, and send it to the background with &:
      nohup ./diagscript.sh &
  7. If you'd like to watch the script output, you may tail it (replacing /tmp/ if you used a different directory):
    tail -f /tmp/diag*txt
  8. While the script runs, monitor disk usage to ensure the script output does not exhaust disk space.
  9. When you are ready to stop the script:
    1. If the script was started with systemd-run, stop it using the dynamically generated service name that was printed when it started (or find the name in the systemctl status output); for example:
      systemctl stop run-r973b3f9d949b4d60a6b9a5fc842b84d1.service
    2. If the script was started with nohup, find its process ID (for example, in ps output) and kill it, replacing ${PID} with that process ID:
      kill ${PID}
      Alternatively, if available, use pkill with the script name:
      pkill -f diagscript.sh
  10. Upload the diag*txt output files and any other relevant logs.
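
The disk monitoring in step 8 can be done by hand with du and df. As a minimal sketch, the helper below reports the total size of the diag output files and how full their filesystem is. The function name check_diag_disk, its arguments, and the 90% default threshold are assumptions made for illustration and are not part of the original recipe; the du -c and -h options are assumed to be available (they are on GNU, BSD, and BusyBox du).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original recipe): report the space used
# by the diag output files and warn when their filesystem is nearly full.
# Usage: check_diag_disk [output_dir] [max_percent]
check_diag_disk() {
  dir="${1:-/tmp}"
  max_percent="${2:-90}"   # assumed threshold; tune for your system

  # Grand total size of the diag files (the last line of du -c output).
  du -ch "${dir}"/diag*txt 2>/dev/null | tail -n 1

  # Percentage used on the filesystem holding ${dir}:
  # df -P prints one line per filesystem, with the percentage in column 5.
  used="$(df -P "${dir}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')"
  echo "Filesystem holding ${dir} is ${used}% full"

  if [ "${used}" -ge "${max_percent}" ]; then
    echo "WARNING: usage is at or above ${max_percent}%; consider stopping the script" >&2
    return 1
  fi
  return 0
}
```

For example, after sourcing the function, check_diag_disk /tmp 80 returns a non-zero status once the filesystem holding /tmp is 80% full or more, which makes it easy to run periodically or to combine with the stop commands in step 9.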