Paper in Workshop: The 6th International Workshop and Prize Challenge on Agriculture-Vision: Challenges & Opportunities for Computer Vision in Agriculture in conjunction with IEEE CVPR 2025
Maize ear sensing for on-farm yield predictions
Pedro Cisdeli · Gustavo Santiago · German Mandrini · Ignacio Ciampitti
In maize (Zea mays L.), early yield prediction is of great interest to breeders and agronomists, yet few studies have leveraged field-scale imaging data to perform such predictions. This study provides a complete data pipeline that uses single-view depth and RGB image data to extract morphological traits of maize ears (length, width, and volume) and forecast yield non-destructively. Imagery was acquired with the depth camera of a smartphone and used to train the YOLOv12n-seg model to segment the maize ears. The segmentation masks were then used to remove the background from the point clouds, which underwent a further filtering step to remove the remaining noise. The length and width of each ear were extracted using a geometric computational approach, while the volume was predicted using a deep learning model developed for this project, the Ear Volume Network (EVNet). Lastly, a yield prediction was obtained by using the ear morphological traits as inputs to gradient-boosting decision trees. The segmentation task performed well under adverse environmental conditions such as occlusion, noise, and variable lighting. EVNet accurately predicted ear volumes under ideal scenarios (RMSE = 28.91 ml). Likewise, the yield forecast step demonstrated high accuracy in both ideal (RMSE = 13.90 g) and real (RMSE = 24.12 g) scenarios. These results highlight the potential of smartphone-based depth sensing combined with deep learning to predict yield under field conditions.
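To make the masking, filtering, and length/width steps more concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not specify the noise filter or the geometric method, so this sketch assumes a median-distance outlier filter and a PCA-based extent measurement. The function names (`mask_point_cloud`, `remove_outliers`, `ear_length_width`) and the threshold `k` are hypothetical.

```python
import numpy as np

def mask_point_cloud(points, mask):
    """Keep the XYZ points whose pixel lies inside the segmentation mask.

    points: (H, W, 3) array with one XYZ point per pixel (assumed layout).
    mask:   (H, W) boolean array, True on ear pixels.
    """
    pts = points[mask]
    # Drop pixels with missing depth (NaN or inf).
    return pts[np.isfinite(pts).all(axis=1)]

def remove_outliers(pts, k=3.0):
    """Assumed noise filter: drop points whose distance from the cloud's
    median position deviates from the typical distance by more than k
    robust standard deviations (median absolute deviation based)."""
    d = np.linalg.norm(pts - np.median(pts, axis=0), axis=1)
    dev = np.abs(d - np.median(d))
    mad = np.median(dev)
    if mad == 0:
        return pts
    return pts[dev <= k * 1.4826 * mad]

def ear_length_width(pts):
    """Assumed geometric step: project the points onto their principal
    axes and report the extent along the first two axes."""
    centered = pts - pts.mean(axis=0)
    # Rows of vt are the principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt.T
    extents = proj.max(axis=0) - proj.min(axis=0)
    return float(extents[0]), float(extents[1])
```

In use, the three functions would be chained: `ear_length_width(remove_outliers(mask_point_cloud(points, mask)))`. The resulting length and width, together with a volume estimate, would form the feature vector passed to the yield model. Using principal axes makes the measurement independent of how the ear is oriented in the image, which is one plausible reason to prefer it over an axis-aligned bounding box.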