ATHENA Toolbox

ATHENA (Automatically Tracking Hands Expertly with No Annotations) is a Python-based toolbox designed to process multi-camera video recordings, extract 2D and 3D body and hand landmarks using MediaPipe, and perform triangulation and refinement of these landmarks. The toolbox provides a user-friendly GUI for selecting videos and configuring processing options.

GitHub repository: https://github.com/neural-control-and-computation-lab/athena

Features

Installation

Prerequisites

Installation Steps

Use your package manager of choice to create an environment with Python 3.12. For example, using conda:

conda create -n athena python=3.12

Activate the environment:

conda activate athena

Then install the package:

pip install athena-tracking

Or to get the latest version directly from GitHub:

pip install git+https://github.com/neural-control-and-computation-lab/athena.git
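
To confirm from within Python that the installation succeeded, a minimal sketch like the following may help. It uses only the standard library's importlib.metadata and assumes the distribution is registered under the name athena-tracking, as in the pip command above. The helper name installed_version is hypothetical and not part of ATHENA.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str = "athena-tracking"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("athena-tracking is not installed in this environment")
    else:
        print(f"athena-tracking {found} is installed")
```

Run it inside the activated environment; if it reports that the package is missing, the environment is probably not the one the package was installed into.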

Usage

1. Organize Your Videos

Place your synchronized video recordings in a main folder, structured as follows:

main_folder/
├── videos/
│   ├── recording1/
│   │   ├── cam0.avi
│   │   ├── cam1.avi
│   │   └── ...
│   ├── recording2/
│   │   ├── cam0.avi
│   │   ├── cam1.avi
│   │   └── ...
└── calibration/
    ├── cam0.yaml
    ├── cam1.yaml
    └── ...
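
Before launching the GUI, the layout above can be checked with a short script. The sketch below is not part of ATHENA; the function name find_missing_calibrations is hypothetical, and it assumes that videos use the .avi extension and that each video pairs with the calibration file of the same stem (cam0.avi with cam0.yaml), as in the tree shown.

```python
from pathlib import Path


def find_missing_calibrations(main_folder):
    """Map each recording name to the camera names that lack a calibration file.

    Assumes the layout shown above and that cam0.avi pairs with cam0.yaml.
    """
    main = Path(main_folder)
    videos_dir = main / "videos"
    calibration_dir = main / "calibration"
    for required in (videos_dir, calibration_dir):
        if not required.is_dir():
            raise FileNotFoundError(f"expected directory not found: {required}")

    # Camera names that have a calibration file, e.g. {"cam0", "cam1"}.
    calibrated = {p.stem for p in calibration_dir.glob("*.yaml")}

    missing = {}
    for recording in sorted(p for p in videos_dir.iterdir() if p.is_dir()):
        cameras = {p.stem for p in recording.glob("*.avi")}
        missing[recording.name] = sorted(cameras - calibrated)
    return missing
```

An empty list for every recording means each video has a matching calibration file.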

For recording and calibrating your cameras, we highly recommend the JARVIS Toolbox.

2. Ensure Calibration Files are Correct

3. Running the Toolbox

Launch the GUI:

athena

  1. Select Main Folder and Recordings
    • Click on the “Select Folder” button to choose your main folder containing the videos and calibration directories.
    • A new window will appear, allowing you to select specific recordings to process. Select the desired recordings and click “Select”.
  2. Configure Processing Options
    • General Settings:
      • Fraction of Frames to Process: Slide to select the fraction of frames you want to process (from 0 to 1). Default is 1.0 (process all frames).
      • Number of Parallel Processes: Choose the number of processes for parallel processing. Default is the number of available CPU cores.
      • GPU Processing: Check this option to enable GPU acceleration (requires compatible NVIDIA GPU).
    • MediaPipe Processing:
      • Run Mediapipe: Check this option to run MediaPipe for extracting 2D landmarks.
      • Save Images: Save images with landmarks drawn (can consume significant storage).
      • Save Video: Save videos with landmarks overlaid (requires “Save Images” to be enabled).
      • Minimum Hand Detection & Tracking Confidence: Adjust the confidence threshold for hand detection and tracking (default is 0.9).
      • Minimum Pose Detection & Tracking Confidence: Adjust the confidence threshold for pose detection and tracking (default is 0.9).
    • Run Triangulation and Refinement:
      • Run Triangulation and Refinement: Check this option to triangulate the 2D landmarks into 3D and filter the result.
      • All points: low-freq cutoff (Hz): Set the low-frequency cutoff for smoothing of all points (default is 10 Hz).
      • Save 3D Images: Save images of the 3D landmarks.
      • Save 3D Video: Save videos of the 3D landmarks.
      • Save Refined 2D Images: Save images of the refined 2D landmarks after triangulation.
      • Save Refined 2D Video: Save videos of the refined 2D landmarks after triangulation.
  3. Start Processing
    • Click the “GO” button to start processing.
    • A progress window will appear, showing the processing progress and average FPS.
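
The options and defaults listed above can be summarized in code. The sketch below is purely illustrative and is not ATHENA's API: the class ProcessingOptions and all of its field names are hypothetical. Defaults marked "documented" come from the list above; defaults marked "assumed" (the checkboxes) are guesses, since the list does not state them.

```python
import os
from dataclasses import dataclass, field


@dataclass
class ProcessingOptions:
    """Hypothetical container mirroring the GUI options; not ATHENA's API."""

    fraction_frames: float = 1.0      # documented default: process all frames
    n_processes: int = field(default_factory=lambda: os.cpu_count() or 1)
    use_gpu: bool = False             # assumed default
    save_images: bool = False         # assumed default
    save_video: bool = False          # assumed default
    hand_confidence: float = 0.9      # documented default
    pose_confidence: float = 0.9      # documented default
    lowpass_cutoff_hz: float = 10.0   # documented default

    def problems(self):
        """Return a list of human-readable problems; empty means consistent."""
        found = []
        if not 0.0 <= self.fraction_frames <= 1.0:
            found.append("fraction_frames must be between 0 and 1")
        if self.n_processes < 1:
            found.append("n_processes must be at least 1")
        for name in ("hand_confidence", "pose_confidence"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                found.append(f"{name} must be between 0 and 1")
        if self.lowpass_cutoff_hz <= 0:
            found.append("lowpass_cutoff_hz must be positive")
        # "Save Video" requires "Save Images" according to the option list.
        if self.save_video and not self.save_images:
            found.append("save_video requires save_images")
        return found
```

For example, ProcessingOptions(save_video=True).problems() reports that saving videos requires saving images, which mirrors the dependency between the two checkboxes.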

4. Output Folder Structure

After processing, the toolbox will create additional directories within your main folder:

main_folder/
├── images/                # Contains images with landmarks drawn
├── imagesrefined/         # Contains refined images after triangulation
├── landmarks/             # Contains 2D and 3D landmark data
├── videos_processed/      # Contains processed videos with landmarks overlaid
└── ...                    # Original videos and calibration files
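
To check which of these directories were created and how much they contain, a small script along the following lines can be used. It is a sketch, not part of ATHENA: the function name summarize_outputs is hypothetical, and it assumes only the four directory names shown above.

```python
from pathlib import Path

# Output directory names as listed in the tree above.
OUTPUT_DIRS = ("images", "imagesrefined", "landmarks", "videos_processed")


def summarize_outputs(main_folder):
    """Return {directory name: file count}, with None for absent directories."""
    main = Path(main_folder)
    summary = {}
    for name in OUTPUT_DIRS:
        directory = main / name
        if directory.is_dir():
            # Count files at any depth below the output directory.
            summary[name] = sum(1 for p in directory.rglob("*") if p.is_file())
        else:
            summary[name] = None
    return summary
```

A value of None indicates that the corresponding step was not run or that its save option was disabled.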

To combine all output videos of a recording (2D and 3D) into a single video, run the ‘montage’ script and select the recording folder in the popup window:

python -m athena.montage

Troubleshooting

Contributing

Contributions are welcome! If you encounter issues or have suggestions for improvements, please create an issue or submit a pull request on GitHub.

Frequently Asked Questions (FAQ)

Q: Do I need a GPU to run the ATHENA Toolbox?
A: No. A GPU is not required and generally does not speed up processing, which is already highly parallelized. If you have a compatible NVIDIA GPU and still want to use it, enable GPU processing in the options and make sure that your GPU drivers and CUDA toolkit are correctly installed.

Q: Can I process videos from only one camera?
A: Yes, you can process videos from a single camera to extract 2D landmarks. However, triangulation to 3D landmarks requires synchronized videos from at least two cameras and their corresponding calibration files.
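
The following sketch illustrates why at least two views are needed. It is a generic least-squares ray intersection written in pure Python, not the method ATHENA uses, and the function names are hypothetical. It assumes that each camera's 2D detection has already been converted, using the calibration, into a ray defined by the camera center and a viewing direction in a shared world frame.

```python
import math


def _solve3(a, b):
    """Solve the 3x3 system a @ x = b by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("rays do not constrain a unique 3D point")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def triangulate_rays(centers, directions):
    """Return the 3D point closest (in least squares) to all given rays."""
    if len(centers) < 2:
        # A single ray leaves the depth along the ray undetermined.
        raise ValueError("at least two cameras are required")
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for center, direction in zip(centers, directions):
        norm = math.sqrt(sum(x * x for x in direction))
        u = [x / norm for x in direction]
        for i in range(3):
            for j in range(3):
                # Entry (i, j) of the projector (I - u u^T) orthogonal to the ray.
                w = (1.0 if i == j else 0.0) - u[i] * u[j]
                a[i][j] += w
                b[i] += w * center[j]
    return _solve3(a, b)
```

With one camera the accumulated matrix has rank 2, so the depth along the ray cannot be recovered; a second, non-parallel ray makes the system solvable.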

Q: How do I obtain the calibration files?
A: Calibration files are generated using camera calibration techniques, often involving capturing images of a known pattern (like a chessboard) from different angles. You can use OpenCV’s calibration tools or other software to create these .yaml files. ATHENA accepts calibration files created by JARVIS or Anipose without any modification.

Q: The processing is slow. How can I speed it up?
A: Increase the number of parallel processes if your CPU has spare cores, or process a smaller fraction of frames to reduce computation time. GPU processing can be enabled on a compatible NVIDIA GPU, but it does not generally speed up processing.

Q: Where can I find the output data after processing?
A: Processed data, images, and videos are saved in the images/, imagesrefined/, landmarks/, and videos_processed/ directories within your main folder.