The ARS AI Center of Excellence (AI-COE) funded five AI Innovation Fund proposals in FY2022. The program was again very competitive, with many more proposals submitted (>30) than we could support. Information about the funded projects is provided below.
BeeMachine 2.0: development of machine learning approaches to rapidly identify bee museum specimens from digital images
- PI and Co-PIs: Michael Branstetter, Brian Spiesman, Terry Griswold, Jonathan Koch
- Amount of award: $52,555
- Abstract: Bees have great value as pollinators of crops, natural plant communities, and backyard gardens. Identifying the factors that affect bee diversity is therefore important for ensuring pollination services and food security. The global decline of bees, especially in agricultural regions, underscores the need for conservation research, including survey and monitoring efforts. A major impediment to studying bees, however, is that they are extremely diverse, with >4,000 North American species, and they are challenging to identify, especially by non-specialists. One potential solution to the identification impediment is the automated identification of bee species from digital images using machine learning and computer vision approaches. Such methods have recently been tested using field images of U.S. bumble bee species. The project was successful, and the results were deployed in a web application called “BeeMachine.” Although promising, extending the approach to all bee species will be a major undertaking requiring the generation of many more digital images of expertly identified specimens. We propose a collaboration between BeeMachine, the U.S. National Pollinating Insects Collection, and SCINet to improve the ability of deep learning models to automate bee identification from images. Focusing on three bee genera of agricultural importance (Bombus, Osmia, Megachile), we will generate digital images of 15,000 museum specimens representing 300+ species, and we will train a convolutional neural network using SCINet resources. We will use a subset of the images to validate the approach and check the accuracy of the model. We will also compare image-based identification to DNA-based methods by sequencing the barcode gene for a subset of specimens. This research will move the bee research community closer to an all-bee classifier and expand bee-identification tools. All images will be archived and results will be integrated online in the BeeMachine web app.
Geospatial Artificial Intelligence (GeoAI) for spatiotemporal modeling of pest insects
- PI and Co-PIs: John Humphreys, Guofeng Cao, Bob Srygley, Dave Branson
- Amount of award: $100,000
- Abstract: Emerging GeoAI and deep-learning methods hold significant promise to revolutionize how agricultural problems are addressed; however, the potential to adapt these methods to insect pest detection and management remains largely untapped and is often confounded by the statistical complexity of analyzing enormous quantities of multi-scale, multi-source data (Big Data) that exhibit complex spatiotemporal structure. We propose to leverage the SCINet high-performance computing infrastructure to aid in the development of new uncertainty-aware, GeoAI-based tools that employ neural network algorithms, deep learning methods, and related machine learning tools. These new tools will be used to characterize and model the complex spatiotemporal patterns connected to pest grasshopper outbreaks across the Western US. Grasshoppers are the most impactful rangeland insect pests in the US, cause rangeland forage losses of about $1.67 billion annually, and routinely inflict major crop damage. Grasshopper outbreaks in 2021 occurred at near record levels, caused serious economic injury in many states, and prompted massive public interest in agricultural pest management. For 2022, more than 80 million acres in the West are anticipated to be at economic risk, with millions of dollars already requested for treatments and chemical control on federal lands. The combined use of multispectral imagery, derived environmental data sets, and our novel uncertainty-aware approach will enable the distinctive signature of grasshopper habitat to be identified while concurrently quantifying grasshopper density variation at the local scale to improve pest outbreak forecasts.
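One common device for making a neural network "uncertainty-aware" is Monte Carlo dropout: keep dropout active at prediction time and treat the spread of repeated stochastic predictions as a rough uncertainty estimate. The toy network and inputs below are stand-ins for illustration only, not the proposal's actual GeoAI models.

```python
# Minimal Monte Carlo dropout sketch (illustrative; not the project's method).
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Dropout(0.5), nn.Linear(32, 1))
net.train()  # keep dropout stochastic at prediction time

x = torch.randn(1, 8)  # stand-in for environmental covariates at one site
samples = torch.stack([net(x) for _ in range(100)])  # repeated stochastic passes
mean = samples.mean().item()  # point prediction
std = samples.std().item()    # spread used as an uncertainty proxy
```

A forecast map built this way can flag locations where the model's prediction is unreliable, which matters when treatment decisions cover tens of millions of acres.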
Modernizing Dietary Assessment: Adapting Deep Learning to Predict Ingredients from Food Photos
- PI and Co-PIs: Danielle Lemay, Hamed Pirsiavash
- Amount of award: $99,157
- Abstract: Current dietary assessment methods are extremely burdensome for participants and contain substantial errors and biases. Artificial intelligence methods can be used to identify food from photo-based food diaries, which would enable convenient and real-time dietary data capture. Specifically, identification of individual foods and ingredients in photos is necessary for accurate dietary analysis and mapping to food and nutrient databases. Existing deep learning algorithms that predict recipes and ingredients from photos are promising but are trained on data that do not reflect eating patterns and food consumed “in the wild.” These methods need to be evaluated and adapted to predict ingredients from food photo diaries for dietary assessment. We propose to evaluate, compare, and adapt existing deep learning algorithms for prediction of ingredients from real-world food photos from our SNAPMe human nutrition study, which contains food photos paired with traditional food recalls (text-based) collected from healthy human participants. To improve prediction of food diaries from human studies, we will then fine-tune the most promising model on data with a variety of food types and food representations, including multi-cultural foods, core (single ingredient) foods, and food photos collected from human study participant smartphone cameras. This project will provide an algorithm for ingredient-level prediction from food photo diaries used for dietary assessment, and an evaluation of that algorithm on real-world benchmark data.
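Ingredient prediction differs from ordinary image classification in that a photo can contain many ingredients at once, so it is typically framed as multi-label prediction: independent sigmoid outputs over an ingredient vocabulary rather than a single softmax class. The sketch below illustrates that framing with an invented toy vocabulary and a placeholder feature vector standing in for CNN image features.

```python
# Hypothetical multi-label ingredient head (toy vocabulary, not SNAPMe data).
import torch
import torch.nn as nn

INGREDIENT_VOCAB = ["tomato", "basil", "mozzarella", "olive oil"]  # toy example

head = nn.Linear(512, len(INGREDIENT_VOCAB))  # one sigmoid output per ingredient
features = torch.zeros(1, 512)  # stands in for image features from a CNN backbone
probs = torch.sigmoid(head(features))

# Report every ingredient whose probability clears a threshold.
predicted = [name for name, p in zip(INGREDIENT_VOCAB, probs[0]) if p > 0.5]
```

The predicted ingredient list can then be mapped to food and nutrient databases, which is the step the abstract identifies as necessary for accurate dietary analysis.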
Microbial Water Quality Determinations using Machine Learning: Application and Algorithm Comparison
- PI and Co-PIs: Yakov Pachepsky, Matthew Stocker
- Amount of award: $100,000
- Abstract: During irrigation events, dangerous microorganisms can be transported to crops and soils with contaminated waters. To assess the potential presence of pathogenic microorganisms in irrigation waters, concentrations of the fecal indicator bacterium Escherichia coli (E. coli) are routinely measured and compared to regulatory standards. However, methods of enumerating E. coli are labor-intensive and costly, and the long analysis time means results may not be representative of current conditions. For these reasons, researchers often elect to develop predictive models for E. coli concentrations based on the levels of physicochemical, hydrological, and meteorological variables, which can be rapidly and easily measured. However, due to the complexity of aquatic ecosystems and nonlinearity in the relationships between predictor variables and E. coli in surface waters, traditional statistical approaches often struggle to effectively characterize the dependency of E. coli concentrations on the measured predictor variables. Recently, machine learning (ML) algorithms have been shown to provide timely and accurate predictions of the microbial quality of recreational and drinking water sources. Still, little work has been done to apply ML to irrigation water quality. The objectives of this proposal are to 1) comprehensively evaluate over 100 ML algorithms for the prediction of E. coli in irrigation waters and 2) determine the most important environmental variables governing the presence of E. coli in irrigation waters. We propose analyzing one of the largest microbial water quality databases in existence, which was generated in the USDA-ARS Environmental Microbial and Food Safety Laboratory and contains over 95,000 unique measurements of E. coli and environmental variables collected at multiple field sites from 2016 to 2021. This research is expected to create powerful ML models that will be used to expedite and improve microbial water quality determinations and ultimately food safety in the United States and abroad.
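The algorithm-comparison workflow the objectives describe can be sketched in a few lines: score many candidate models on the same predictor matrix with cross-validation and rank them. The feature names and data below are synthetic stand-ins (not the EMFSL database), and only three candidates are shown rather than the 100+ the project will evaluate.

```python
# Hedged sketch of comparing candidate regressors via cross-validation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))              # stand-ins for e.g. turbidity, temperature, pH, flow
y = 2.0 * X[:, 0] + rng.normal(size=200)   # stand-in for (log) E. coli concentrations

candidates = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gbm": GradientBoostingRegressor(random_state=0),
}
# Mean R^2 across 5 folds for each candidate; higher is better.
scores = {name: cross_val_score(est, X, y, cv=5).mean() for name, est in candidates.items()}
best = max(scores, key=scores.get)
```

The second objective, identifying the most important environmental variables, would follow the same pattern using the winning model's feature-importance or permutation-importance scores.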
Putting AI technology into the hands of farmers: Developing an app to make intelligent decisions using deep learning
- PI and Co-PIs: Zhanyou Xu, Jo Heuschele, Zhou Zhang
- Amount of award: $100,000
- Abstract: Alfalfa is the third most valuable field crop after corn and soybean in the U.S., valued at about $10 billion annually. Climate, soil, and genetics affect alfalfa’s digestibility and biomass yield differently, creating a paradox of high yield but low digestibility or high digestibility but low yield. Finding the optimal harvest window for desired yield and digestibility is one of the most critical decisions alfalfa farmers must make four to seven times every year. Low digestibility in alfalfa limits dry matter intake and energy availability in ruminant (dairy and beef) production systems, while low biomass yield reduces farmers’ profitability. Measuring forage digestibility is labor- and resource-intensive, often with long turnaround times to obtain data. Computer vision-based deep learning analysis provides the opportunity to quickly predict digestibility and yield to help farmers and researchers make intelligent decisions. The goal of this proposal is to develop a mobile application decision tool (a cell phone app) using three convolutional neural network (CNN) architectures, GoogLeNet, VGG, and ResNet, implemented with Google’s TensorFlow deep learning framework. The app will have the capacity to 1) estimate alfalfa digestibility and yield in real time; 2) measure the stems and plants per square foot to forecast yield potential and nitrogen credits to support decisions on stand termination; and 3) predict the harvest window for yield and quality goals, taking into account historical weather patterns and weather forecasts. Ultimately, the app will help farmers make AI-based decisions for increased economic returns for alfalfa production. The expected product of this proposal, a beta testing version 0.5, will be available for ARS internal testing in 2022. The farmer version 1.0 of the app will be launched as a free download for use by the end of 2023 in the Apple Store and 2024 in the Google Play Store.
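The yield/digestibility tradeoff behind the harvest-window decision can be illustrated with a toy rule: yield rises with maturity while digestibility falls, so the app must pick the day that best balances the two. All numbers and the quality floor below are invented for illustration; the actual app would feed CNN-predicted values into a more sophisticated decision model.

```python
# Toy harvest-window rule (invented numbers; not the project's algorithm).
def harvest_day(days, yields, digestibilities, min_digestibility=0.60):
    """Return the day with the highest predicted yield whose predicted
    digestibility still meets the quality floor, or None if none qualifies."""
    best = None
    for d, y, q in zip(days, yields, digestibilities):
        if q >= min_digestibility and (best is None or y > yields[days.index(best)]):
            best = d
    return best

days = [20, 25, 30, 35]                # days since last cutting
yields = [1.8, 2.4, 3.0, 3.4]          # tons/acre, rising with maturity
digest = [0.72, 0.66, 0.61, 0.55]      # fraction digestible, falling with maturity

print(harvest_day(days, yields, digest))  # -> 30
```

Raising the quality floor (e.g. for a dairy operation needing highly digestible forage) shifts the recommended harvest earlier at the cost of yield, which is exactly the tradeoff the abstract describes.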