Development and Tests of a 3D Fish-Tracking Videometry System for an Experimental Flume

To design effective and efficient fish passage facilities at hydropower plants, knowledge of the swimming behaviour of fish is essential. Therefore, living wild fish were investigated at different fish guidance structures in an experimental flume, in a test section of 11 m length and 2.5 m width at water depths of about 0.6 m. Besides the analysis of time data and manual recordings of the fish behaviour, video recordings of the fish movements allow a more detailed analysis of fish behaviour in different hydraulic situations. Thus, a videometry system was installed consisting of eleven synchronous cameras with overlapping views, lined up under dry conditions outside the flume. A 3D tracking algorithm was developed and implemented to analyse the video data. The core of the code is a motion-based multiple object tracking method, in which several objects can be tracked in 2D pixel-frame coordinates at the same time. After undistorting and stereo-calibrating the cameras, the 2D tracks are transferred to a 3D metric space according to their epipolar geometry. In this paper, video data from a single experimental run of 15 min with three fish with lengths of 100–150 mm are analysed as an example. The path-time diagram gives a distinct 'big picture' of the fish movement, which helps to identify preferred and disliked regions. However, due to the imperfect camera setup, a 3D view in the near field of the cameras and an automated separation of individual tracks in a group of fish remain challenging.


Introduction
Knowledge of fish behaviour in the near field of fish passage facilities is important for both the evaluation of their effectiveness and their optimization. However, up to now, fish observations gained by open flume experiments have been mostly qualitative, since manual recording of fish movements or manual video assessment is extremely time-consuming, and statistically evaluable data sets of fish behaviour are rare.
Alternative approaches to record fish movements are transponder techniques, as often applied in field experiments, or videometry, which is especially suitable for laboratory experiments under clear water conditions. The latter has already been used in laboratory flumes for both upstream (e.g., [1]) and downstream (e.g., [2]) fish passage experiments in order to obtain 2D fish-tracks on scales of several meters and minutes. 3D fish-tracks have also been computed from experimental videometry under laboratory conditions (e.g., [3,4]) and in the field ([5]). However, these systems applied a complex stereo camera setup that focussed on volumes with edge lengths on the centimetre scale. Those experimental setups are not able to adequately record the behaviour in the spatial dimensions relevant for fish in the near field of fish passage facilities.
Since April 2016, flume experiments with fish have been conducted at BAW (German Federal Waterways Engineering and Research Institute, Karlsruhe) to investigate different configurations of fishway elements. It is a joint project of BAW and BfG (German Federal Institute of Hydrology, Koblenz), with partial support for the fish-tracking system from the Laboratory of Hydraulics, Hydrology and Glaciology (VAW, Zürich). In this study we develop a first approach to obtain long-term fish-tracks in 3D on meter scales by a 3D fish-tracking velocimetry system installed in an experimental flume. The associated fish-tracking software comprises camera calibration and the derivation of 3D fish-tracks, based on the computing environment of MATLAB 2017a.

Experimental Setup
For the experiments presented in the following, the effective test section of the flume had a length of 11 m and a width of 2.5 m at water depths of ~0.6 m. During each experimental run the fish behaviour was documented by manual time measurements (stopwatch pressed when certain target lines were reached by the fish) and clipboard records made by biologists of BfG. More details can be found in [6] and [7].
The 'eyes' of the videometry system consisted of eleven cameras arranged in series on the flow-right side outside the flume (Fig. 1), spaced 1.02 m apart. This simple and dry installation provided quick access to the cameras during the initial measurement campaign in 2016, with a low risk of water damage. Each area scan camera of type acA2000-50gmNIR (Basler) was equipped with a 185° fisheye lens of type FE185C086HA-1 (Fujifilm). A GigE Vision 2.0 network with the Precision Time Protocol (PTP) IEEE1588 provided synchronous measurements with frame rates kept almost constant at ~10 fps.

Camera Calibration
Camera calibration works in three steps: find the intrinsic and extrinsic parameters for each of the eleven cameras, then calibrate ten stereo camera systems according to the overlapping views of camera pairs #1-2, #2-3, … #10-11, and finally perform a rigid transformation of all stereo camera pairs to a global flume coordinate system.
The calibration of each of the eleven cameras was based on 14×6 crossing points detected on a checkerboard with a square size of 46 mm (Fig. 2). Image frames with checkerboard recordings at different angles and distances to the cameras were automatically preselected from the video. Checkerboard crossing points were obtained by standard image processing techniques applied to these preselected video frames, with their edges enhanced by an edge-aware, fast local Laplacian filter. The recorded images are distorted in a complex manner due to both the fisheye lens and the refraction at the interfaces air-glass (i.e. the flume wall window) and glass-water. Thus, a distortion correction by the simple mathematical frame camera model as implemented in MATLAB failed. OpenCV 3.1 libraries were included in the calibration software and used instead of standard MATLAB functions. They use a rational frame camera model, which gave satisfactory rectification results apart from the regions near the image boundaries with their extreme distortion effects. Stereo calibration was performed on the basis of corresponding checkerboard crossing point pairs of each two camera views, as detected and undistorted during the single camera calibration. The selection of synchronously recorded image frame pairs was automated as well. The standard MATLAB frame camera model was applied to the rectified images to estimate the parameters for a stereo calibration. As an example, Fig. 3 visualizes the obtained extrinsic parameters of a stereo camera system in its local coordinate system of [X, Y, Z] (mm). The precision of the calibration suffers from an imperfect undistortion of the single images. Therefore, the two cameras presented in Fig. 3 do not appear perfectly aligned. However, for the current data analysis, the obtained precision level was considered to be acceptable.
Fig. 3. Exemplary plot of a stereo-calibrated camera pair in its local metric coordinate system with its origin (0, 0, 0) at the focal point of camera #1 (in blue). The rectangular areas show the positions and orientations of the calibration checkerboard used in different image frames.
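To illustrate what a rational frame camera model does, the following minimal sketch applies the commonly documented rational radial plus tangential distortion to normalized image coordinates and inverts it by fixed-point iteration. The function names and all coefficient values are assumptions chosen for illustration; they are not the calibrated parameters of the flume cameras, and refraction at the glass wall is not modelled separately.

```python
def _distortion_terms(x, y, k, p):
    """Return the radial factor and tangential offsets at normalized point (x, y).

    k = (k1, ..., k6) are rational radial coefficients, p = (p1, p2) tangential ones.
    """
    k1, k2, k3, k4, k5, k6 = k
    p1, p2 = p
    r2 = x * x + y * y
    radial = (1 + k1 * r2 + k2 * r2**2 + k3 * r2**3) / (
        1 + k4 * r2 + k5 * r2**2 + k6 * r2**3
    )
    dx = 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    dy = p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return radial, dx, dy


def distort(x, y, k, p):
    """Map ideal normalized coordinates to distorted normalized coordinates."""
    radial, dx, dy = _distortion_terms(x, y, k, p)
    return x * radial + dx, y * radial + dy


def undistort(xd, yd, k, p, iterations=20):
    """Invert the model by fixed-point iteration (adequate for mild distortion only)."""
    x, y = xd, yd
    for _ in range(iterations):
        radial, dx, dy = _distortion_terms(x, y, k, p)
        x = (xd - dx) / radial
        y = (yd - dy) / radial
    return x, y


# Hypothetical coefficients, for illustration only.
K = (-0.2, 0.05, 0.0, 0.1, 0.0, 0.0)
P = (0.001, -0.0005)
xd, yd = distort(0.3, -0.2, K, P)
x_back, y_back = undistort(xd, yd, K, P)
```

Because only detected points (checkerboard crossings or, later, track positions) need to be undistorted, a point-wise inversion of this kind is much cheaper than rectifying every full image frame.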
In the last step of the calibration, each of the arbitrary local stereo camera coordinate systems has to be shifted and rotated by a rigid transformation to a global flume coordinate system. Here, x points in the streamwise flow direction, z points vertically towards the water surface with z = 0 at the flume bottom, and y is defined accordingly by the right-hand rule. To this end the flume locations of eleven fixed reference points were measured, and their coordinates were manually digitized pairwise on the still images of the stereo camera pairs. Together with the known locations of the cameras, this data set formed the basis for the rigid transformation. A backward computation of several calibration videos showed that reference points in stereo camera pairs are matched with a mean error of about 45 ± 30 mm.
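The following minimal sketch shows how a rigid transformation maps points from a local stereo coordinate system to flume coordinates, and how a mean ± standard deviation matching error can be computed from reference points. The rotation about a single axis, the translation vector and the point coordinates are hypothetical values chosen for illustration; how the rotation and translation were actually estimated from the reference points is not detailed here.

```python
import math
import statistics


def rotation_z(angle_rad):
    """3x3 rotation matrix about the z axis (a real setup needs a general 3D rotation)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def to_flume(point, R, t):
    """Apply the rigid transformation p_flume = R * p_local + t."""
    return tuple(
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )


def residual_stats(local_points, flume_points, R, t):
    """Mean and (population) standard deviation of the point-to-point errors."""
    errors = [
        math.dist(to_flume(p, R, t), q)
        for p, q in zip(local_points, flume_points)
    ]
    return statistics.mean(errors), statistics.pstdev(errors)


# Hypothetical transformation and reference points, all in mm.
R = rotation_z(math.pi / 2)
t = (1000.0, 0.0, 600.0)
local_refs = [(100.0, 0.0, 0.0), (0.0, 100.0, 0.0)]
measured_refs = [(1030.0, 100.0, 600.0), (960.0, 0.0, 600.0)]
mean_err, std_err = residual_stats(local_refs, measured_refs, R, t)
```

A rigid transformation only rotates and translates, so distances within each local stereo system are preserved; any residual error therefore reflects calibration and digitizing inaccuracies rather than the transformation itself.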

3D Fish-Tracking
Detection of moving objects and subsequent motion-based tracking are classical challenges of videometry. Motion-based object tracking mainly consists of two parts: detecting moving objects in each frame and then associating the detections corresponding to the same object over time.
In the current study, all moving objects including fish are detected on each frame by using the MOG2 background subtraction algorithm of OpenCV 3.1, which is based on a Gaussian mixture model. Morphological operations are applied to the resulting binary foreground mask to eliminate noise. Finally, a binary large object (blob) analysis detects groups of connected pixels, which are likely to correspond to moving fish.
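The blob analysis step can be sketched as a connected-component search on the binary foreground mask. The code below is a simplified stand-in written in plain Python, not the OpenCV or MATLAB routine used in the study: the function name, the 8-connectivity and the minimum-area filter (which here replaces the morphological noise removal) are assumptions for illustration.

```python
from collections import deque


def find_blobs(mask, min_area=2):
    """Return area and centroid (x, y) of each 8-connected foreground region.

    mask is a list of rows containing 0 (background) or 1 (foreground).
    Regions smaller than min_area pixels are discarded as noise.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) >= min_area:
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                blobs.append({"area": len(pixels), "centroid": (cx, cy)})
    return blobs


# Tiny synthetic mask: one 2x2 blob and one isolated noise pixel.
example_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
example_blobs = find_blobs(example_mask)
```

The blob centroids are the 2D detections that are passed on to the tracking stage.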
In MATLAB, the association of detections to the same blob is based solely on motion. A Kalman filter is used to predict the object's location in each frame, and to determine the likelihood of each detection being assigned to each track. Track maintenance is an important aspect of determining the swimming path of a fish. In any given frame, some detections may be assigned to tracks, while other detections and tracks may remain unassigned. The assigned tracks are updated using the corresponding detections, and the unassigned tracks are marked invisible. An unassigned detection begins a new track. Each track keeps count of the number of consecutive frames in which it remained unassigned. If the count exceeds a threshold of five time steps, it is assumed that the object, which might be a fish, has left the field of view, and the track is no longer considered in the subsequent time steps.
The intermediate results of motion-based tracking are tracks in a distorted and uncalibrated 2D image frame coordinate system. Fig. 4 gives an example, where noise objects caused by reflections at the glass window and three fish have been detected. After undistorting and stereo-calibrating the cameras, the 2D tracks are transferred to a 3D metric space according to their epipolar geometry (see [8]), i.e., based on the camera parameters derived during the calibration. In this way, the required computation time is reduced, in contrast to a procedure in which each image frame would be undistorted first and then processed by the tracking algorithm.
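One simple way to obtain a 3D position from two corresponding 2D detections is midpoint triangulation of the two viewing rays, sketched below. This is an illustrative assumption, not necessarily the triangulation method used in the study; the camera centres, ray directions and the target point are hypothetical, with only the 1.02 m camera spacing taken from the experimental setup. Refraction at the glass wall is ignored.

```python
import math


def _dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2.

    c1, c2: camera centres; d1, d2: viewing ray directions (need not be unit length).
    Returns the 3D midpoint and the remaining gap between the two rays.
    """
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing rays are (nearly) parallel")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    midpoint = tuple((u + v) / 2 for u, v in zip(p1, p2))
    return midpoint, math.dist(p1, p2)


# Two cameras 1020 mm apart, both looking at a hypothetical fish position (mm).
cam1 = (0.0, 0.0, 0.0)
cam2 = (1020.0, 0.0, 0.0)
target = (500.0, 300.0, 2000.0)
ray1 = tuple(tc - cc for tc, cc in zip(target, cam1))
ray2 = tuple(tc - cc for tc, cc in zip(target, cam2))
point, gap = triangulate_midpoint(cam1, ray1, cam2, ray2)
```

With real data the two rays do not intersect exactly, and the returned gap gives a rough per-point indication of how well the two views agree.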
Figs. 5-7 show exemplary views of the 3D fish-tracks obtained from a single experimental run of 15 min, in which the behaviour of three fish with lengths of 100-150 mm was tested.
The top view given in Fig. 5 indicates that the fish were mainly present in the rear area of the flume (as seen from the cameras). The fish mainly oriented themselves along the back wall of the flume, which was a horizontal bar screen in this case. This gives a first indication that the videometry system is capable of identifying zones that are preferred by the fish. However, a main shortcoming of the chosen camera installation becomes obvious: due to a lack of overlapping stereo camera views near the glass wall, a computation of 3D tracks is impossible there. That means that the videometry system, in the configuration used here, is blind in terms of a 3D view within the near field of 500-750 mm distance to the cameras.
The side view in Fig. 6 shows that the fish were swimming close to the ground. Finally, the path-time diagram plotted in Fig. 7 gives a distinct 'big picture' of the movement of the group of three fish. However, due to the imperfect camera setup, a fully automated separation of individual tracks remains challenging and, for the camera setup used, only the midpoint of a group of fish can be detected.
To sum up, the presented exemplary analysis shows that, despite some shortcomings, the developed 3D fish-tracking videometry system has the potential to identify regions that are preferred or disliked by fish.

Concluding Remarks
This paper reports on the development and first application tests of a 3D fish-tracking velocimetry system installed in an experimental flume, where the behaviour of wild fish at different fishway construction elements can be investigated. The computed tracks as depicted by a path-time diagram give a distinct 'big picture' of the fish movement, which helps to identify preferred and disliked regions with a high level of detail in time and space. The presented exemplary analysis shows that the developed 3D fish-tracking videometry system has, despite some shortcomings, the potential to obtain long-term fish-tracks in 3D in spatial dimensions of several meters.
For the initial measurements, a dry installation of the cameras lined up in series outside the flume was chosen. The main reasons were the easy accessibility of the cameras and their optics as well as a low risk of water damage to the cameras during the experiments. On the negative side, however, strong refraction effects occurred at the boundaries of air, glass, and water. In turn, this required a complex process of image rectification, with results remaining inaccurate in the boundary areas of the undistorted image frames. In addition, this installation ultimately resulted in larger volumes without overlapping stereo camera views within the near field of 500-750 mm distance to the cameras, clearly caused by refraction effects at the glass wall.
Furthermore, the lateral positioning of the cameras caused the apparent pixel size of a fish to vary widely: a fish appears small in the far field at distances of 2.5 m and larger in the near field of the camera at distances of ~500 mm (see also: https://www.youtube.com/watch?v=iZhEcRrMA-M). Both situations make a precise object detection of fish, including a filter assuming an almost constant pixel-area size, more challenging. However, the experiments indicated that the fish stayed close to the ground most of the time, or at least in the lower third of the water body of ~0.6 m. Thus, a setup with cameras slightly submerged from above and pointing vertically towards the ground may overcome these shortcomings, especially if a glass dome is used in front of the fisheye lenses instead of a plain window. As long as the behaviour of the fish and the flow are not significantly affected by such a camera arrangement, it should markedly improve the results of 3D fish tracking.

Fig. 2. Distortion-corrected image frame with detected checkerboard corner points. For a perfect rectification, straight lines in the real world should become straight lines in the image. However, near the image boundaries with their extreme distortion, the rectified lines remain curved.

Fig. 4. Exemplary sequence of two images with a time distance of 8 frames, where three fish and noise objects (lower right corner) are detected. Left: raw image with tracked and confirmed blobs highlighted in yellow; right: binary image. (a) frame #i, with i = number of this frame, (b) frame #i+8.

Fig. 5. Top view of the 3D fish-tracks, where color-coded points refer to single fish tracks. Bold points = 3D positions of fish, blue circles = positions of the cameras, black arrow = main flow direction. The horizontal bar screen (Fig. 1) physically prevents fish from swimming into the area marked by the dashed line.

Fig. 7. Path-time diagram. Additional small light gray points give estimates from 2D tracks without overlapping camera views. For the remaining legend, see Fig. 5.