Announcement on the release of the 2024 annual project guide for the major research plan for next-generation artificial intelligence methods that are explainable and generalizable

Program ID:

202403210003

Internal Submission Deadline:

April 16, 2024

Submission Deadline:

April 22, 2024

Description

2024 Project Guide for the Major Research Plan on Explainable and Generalizable Next-Generation Artificial Intelligence Methods

  The major research plan for next-generation artificial intelligence methods that are explainable and generalizable is oriented to China's major strategic needs in the development of artificial intelligence. Centered on the basic scientific issues of artificial intelligence, it aims to develop new systems of artificial intelligence methods, promote basic research and talent cultivation in artificial intelligence in China, and support China's leading position in the new round of international scientific and technological competition.

  1. Scientific objectives

  This major research plan addresses basic scientific issues in artificial intelligence methods represented by deep learning, such as poor robustness, poor interpretability, and strong dependence on data. It aims to explore the basic principles of machine learning, develop next-generation artificial intelligence methods that are explainable and generalizable, and promote innovative applications of artificial intelligence methods in science.

  2. Core scientific issues

  This major research plan addresses the basic scientific issues of explainable and generalizable next-generation artificial intelligence methods and focuses on the following three core scientific issues.

  (1) Basic principles of deep learning.

  Investigate in depth the dependence of deep learning models on hyperparameters, understand the working principles behind deep learning, and establish the approximation theory of deep learning methods, the theory of generalization error analysis, and the convergence theory of optimization algorithms.

  (2) Next-generation artificial intelligence methods that are explainable and generalizable.

  Combine rules and learning to establish new artificial intelligence methods that are highly accurate, interpretable, and generalizable and that do not rely on large amounts of labeled data. Develop the databases and model training platforms required for next-generation artificial intelligence methods, and improve the infrastructure driven by next-generation artificial intelligence methods.

  (3) Application of next-generation artificial intelligence methods in scientific fields.

  Develop new physical models and algorithms, build open source scientific databases, knowledge bases, physical model libraries and algorithm libraries, and promote the exemplary application of new artificial intelligence methods in solving complex problems in the scientific field.

  3. Funded research directions in 2024

  (1) Cultivation projects.

  Guided by the overall scientific goals and focused on the scientific issues above, cultivation projects will fund applications that are strongly exploratory, have novel topics, and rest on a good preliminary research foundation. The research directions are as follows:

  1. New architectures of neural networks and new pre-training or self-supervised learning methods.

  Develop new, more efficient neural network architectures and pre-training or self-supervised learning methods for data such as images, videos, graphs, and flow fields, and verify them on real data sets.

  2. Representation theory and generalization theory of deep learning.

  Study the theory of generalization error analysis, robustness, and stability for convolutional neural networks (and other networks with symmetry), graph neural networks, recurrent neural networks, low-precision neural networks, dynamic neural networks, generative diffusion models, and other models, and verify it on real data sets; study the theoretical basis of unsupervised representation learning, the pre-training and fine-tuning paradigm, and other methods; develop new generalization analysis methods; and use them to guide the design of deep learning models and algorithms.

  3. Theoretical basis of deep learning training algorithms.

  Study the structure of the neural network loss landscape and the characteristics of training algorithms, including but not limited to: the distribution of critical points and their embedded structures; the connectivity of minimum points; the edge of stability and loss spike phenomena; the implicit regularization, stability, and convergence of algorithms; the dependence of the training process on hyperparameters; the memory disaster problem of neural networks; and training time complexity analysis. Develop training methods with faster convergence and lower time complexity for models such as convolutional networks, Transformer networks, diffusion models, and mixture-of-experts models.

  4. Basic issues of large models.

  Study basic issues of multi-task, multi-data large models, including but not limited to the representation theory and generalization theory of large models, the stability of large model training, and phenomena such as scaling laws and emergence; study the basic properties of new (structured) state-space models, including whether they suffer from the curse of memory; understand the expressive and generalization capabilities of the Transformer model, in-context learning, the effectiveness of chain-of-thought reasoning, and the extrapolation capabilities of models (such as length generalization).

  5. Differential equations and machine learning.

  Study probabilistic machine learning methods for solving forward and inverse problems of differential equations and for operator approximation; develop frameworks for physical field generation, simulation, and completion based on generative diffusion probabilistic models; design new machine learning models and network structures based on differential equations, accelerate model inference, and analyze the training process of neural networks.

  6. New methods of graph neural networks.

  Use mathematical theories such as random walks, polynomial approximation, harmonic analysis, and particle equations to solve problems in deep graph neural networks such as over-smoothing, over-squashing, and the handling of mismatched graphs and dynamic graphs; design effective, scalable, and interpretable graph representation learning methods for important application scenarios such as drug design, recommendation systems, and multi-agent network collaborative control.

  7. Security issues of artificial intelligence.

  For mainstream machine learning problems, develop privacy-preserving collaborative training and prediction methods; develop analysis, attack, defense, and repair methods for adversarial samples, data poisoning, backdoors, and similar threats; study how machine learning frameworks can interfere with, damage, and control models; develop privacy computing methods with controllable accuracy, and methods for evaluating and rating the fairness and reliability of data and models (including large models).

  8. Artificial intelligence methods in the field of scientific computing.

  For the electronic many-body problem, establish artificial intelligence methods for numerical solution of the Schrödinger equation, first-principles calculation, free energy calculation, coarse-grained molecular dynamics, and related tasks, and explore the application of artificial intelligence methods in research on batteries, electrocatalysis, alloys, photovoltaics, and other systems.

  For typical cross-scale and dynamic problems in physics, chemistry, materials, biology, combustion, and other fields, develop methods that integrate physical models with artificial intelligence; explore methods for mining the physical relationships implicit in the variables of complex systems and for the mathematical expression of structure-activity relationships; and establish general cross-scale, artificial-intelligence-assisted computing theories and methods to solve typical complex multi-scale computing problems.

  9. Data-centric machine learning.

  Focusing on factors such as data quality, quantity, and efficiency, develop machine learning methods that provide large amounts of high-quality data for downstream machine learning models. On the AI for Science data side, research and design efficient methods for constructing and preprocessing scientific data (such as protein and drug mapping). On the large model data side, taking data acquisition cost and efficiency into account, establish a scientific and systematic data quality assessment strategy, design efficient data selection methods, construct effective data matching methods, and explore large-model-assisted methods for improving data quality (such as automatic data annotation).

  10. Machine learning algorithm based on quantum computing.

  Study how different types of learning methods map to general quantum processes, and propose new algorithms that use quantum properties to achieve efficient learning; study the advantages of quantum machine learning over classical machine learning methods in expressive and generalization capabilities; explore the interpretability of quantum machine learning; and establish application scenarios for quantum machine learning in quantum physics and chemistry.

  11. Open projects.

  Research related to explainable and generalizable next-generation artificial intelligence methods and to AI for Science, with a focus on supporting innovative topics in algorithms and models.

  (2) Key support projects.

  Guided by the overall scientific goals and focused on the core scientific issues, key support projects will fund applications that have a solid body of prior research results, can advance the overall scientific goals in theory and key technologies, and have a foundation in industry-university-research collaboration and application. The research directions are as follows:

  1. Next-generation artificial intelligence methods.

  Develop artificial intelligence methods that combine logical reasoning, knowledge, and rules, and establish an interpretable and generalizable theoretical framework for artificial intelligence; develop new neural network architectures suited to continuous, dense data (such as images) and unstructured data (such as molecular structures) that effectively capture multi-dimensional contextual information such as space, structure, and semantics and improve data modeling capabilities. Validate the results on real data sets.

  2. A new generation of brain-inspired artificial intelligence models and effective training algorithms.

  Based on the physical form and biophysical diversity of brain neurons, establish a simple and effective mapping between biological neurons and artificial neurons so that artificial neurons have the dendritic nonlinear integration and computing functions of biological neurons, and provide a unified theoretical and algorithmic framework for establishing mappings between other types of biological neurons and artificial neurons. Drawing on the connection structure of the brain's neuronal networks, brain region heterogeneity, and macroscopic gradients, design artificial neural network models constrained by the characteristics of biological neurons to realize advanced cognitive functions such as memory and decision-making. Realize effective mappings between no fewer than 3 types of biological neurons and artificial neurons and 3 important dendritic computing functions, and achieve improvements over existing mappings in accuracy, performance, and parameter interpretability.

  3. Multi-agent collaborative learning theory and methods.

  To address the challenges that distributed data processing faces in multi-agent collaboration, such as the lack of guarantees on generalization performance and weak adaptability and scalability, study efficient multi-agent collaborative learning theories and methods, including: (1) research algorithms that improve the generalization performance of multi-agent collaborative learning systems and analyze their generalization error bounds; (2) study the adaptability and scalability of multi-agent systems in dynamically changing environments and at expanding network scales, so that agents can learn effectively and collaborate efficiently; (3) study methods to process and fuse multi-modal data (such as text, images, and sensor data) in multi-agent systems to enhance learning and improve decision quality; (4) study collaborative learning and decision-making strategies in real-time or near-real-time environments, focusing on emergency response and key decisions in dynamic and uncertain environments; (5) explore personalized learning strategies that let agents learn collectively and share knowledge effectively while maintaining their individual advantages.

  4. Multi-modal fusion and generative foundation models.

  Study foundation models for multi-modal data fusion and generation to solve fusion problems caused by differences in data perspective, dimension, and density and in the difficulty of acquisition and annotation; study modality alignment during fusion to ensure the accuracy and consistency of modal predictions and reduce information loss in the fusion process; study lightweight fusion models to improve the robustness of fusion models under imperfect alignment between modalities; study pre-training and fine-tuning methods that use easy-to-collect, easy-to-label modal data to guide difficult-to-collect, difficult-to-label modal data; study pre-training for large-scale multi-task and multi-modal learning, realize few-shot and zero-shot transfer, and develop methods for diverse cross-modal data generation; study new, unified probabilistic modeling methods for multi-modal large models to solve the probabilistic modeling and generation of mixed discrete and continuous data types and improve the generation efficiency of multi-modal foundation models. Achieve representation learning, alignment, and generation for no fewer than 3 modalities in the multi-modal model, with no fewer than 7B model parameters, and explore application verification in fields such as smart cockpits, autonomous driving, or multi-modal dialogue. Accumulate high-quality annotated data for training multi-modal large models, explore closed data loops, and collect imperfectly annotated or unlabeled data exceeding the annotated samples by at least 2 orders of magnitude to achieve iterative model optimization.

  5. Large model training methods that integrate models and data.

  Explore systematic and adaptive data selection methods to achieve the organic integration of data and models, including: on-the-fly methods that select the data to be used in the next step during model training; a machine learning framework for the organic integration of data and models; and an effective alternative to the two-step approach commonly used in large model training, in which data is processed first and the model is trained afterward.

  6. Video-native self-supervised learning method.

  Video data is both a time series and a collection of images, yet differs from general time series and images. Make full use of the attributes and characteristics of video data to develop a new self-supervised learning framework, analogous to the next-token prediction framework for time series and the fill-in-the-blank framework for image data, and verify it on real video data sets.

  7. A universal, high-quality scientific database that supports the next generation of artificial intelligence.

  Large-scale, high-quality scientific data is a necessary condition for the new paradigm of scientific research driven by artificial intelligence. Research active learning mechanisms and automatic association algorithms for the labeling, extraction, and fusion of knowledge objects such as scientific data and scientific documents; research intelligent coding and machine-readable multi-analysis for knowledge objects to support broad-spectrum association of cross-domain knowledge objects, realize dynamic interoperability with no fewer than 3 international mainstream scientific and technological resource identifiers, and support intelligent integration with external data resources; research methods for aligning multi-modal interdisciplinary knowledge fragments and identifying knowledge objects, as well as algorithms for the automatic production and enhancement of data in multi-disciplinary fields, forming more than 1 PB of high-quality scientific data that complies with international standards or has been peer-reviewed and that covers no fewer than 8 subject areas.

  8. Infrastructure construction and demonstration application of AI for Science.

  Develop infrastructure methods for AI for Science, including: artificial intelligence algorithms for basic physical models; high-efficiency, high-precision experimental characterization algorithms; construction of automated and intelligent experimental platforms; and integration and intelligent application of scientific literature and scientific data. Develop innovative applications of AI for Science, including but not limited to: complex catalytic systems (dynamic structural changes of catalysts, highly complex reaction networks, etc.); core catalytic reactions for carbon peaking and carbon neutrality; electrochemical characterization methods under working conditions; high-efficiency, high-precision imaging technology in biomedicine; automated and intelligent solutions for organic synthesis; and directed-evolution protein engineering. Priority will be given to projects that combine theory and experiment to form a closed loop.

  4. Basic principles for project selection

  (1) Closely focus on core scientific issues, encourage basic and cross-cutting frontier exploration, and give priority to original research.

  (2) Prioritize support for research projects that are geared towards the development of next-generation new artificial intelligence methods or that can promote the application of new artificial intelligence methods in the scientific field.

  (3) Key support projects should have a good research foundation and early accumulation, and have direct contributions and support to the overall scientific goals.

  5. Funding plan for 2024

  Approximately 25 cultivation projects are planned for funding, with direct-cost funding of no more than 800,000 yuan per project and a funding period of 3 years; the research period in the cultivation project application form should be entered as “January 1, 2025 – December 31, 2027”. About 6 key support projects are planned for funding, with direct-cost funding of about 3 million yuan per project and a funding period of 4 years; the research period in the key support project application form should be entered as “January 1, 2025 – December 31, 2028”.

  6. Application requirements and precautions

  (1) Application conditions.

  Applicants for this major research plan project should meet the following conditions:

  1. Have experience in undertaking basic research projects;

  2. Have senior professional and technical positions (professional titles).

  Postdoctoral researchers currently in residence, persons currently pursuing a graduate degree, and persons who have no employer or whose employer is not a host institution may not apply as applicants.

  (2) Regulations on limited applications.

  Implement the relevant requirements of the limited application provisions in the “Application Regulations” of the “2024 National Natural Science Foundation of China Project Guide”.

  (3) Things to note when applying.

  Applicants and host institutions should carefully read and implement the relevant requirements in this project guide, the “2024 National Natural Science Foundation of China Project Guide” and the “Notice on 2024 National Natural Science Foundation of China Project Application and Finalization and Other Related Matters”.

  1. Applications for this major research plan are paperless. The application submission period runs from April 15, 2024 to 16:00 on April 22, 2024.

  (1) Applicants should fill in and submit the electronic application form and attached materials online in accordance with the instructions and writing outline requirements for major research plan projects in the Science Fund Network Information System.

  (2) This major research plan aims to closely focus on core scientific issues, strategically guide and integrate the advantages of multi-disciplinary related research, and form a project cluster. Applicants should formulate their own project name, scientific objectives, research content, technical routes and corresponding research funds based on the core scientific issues to be solved by this major research plan and the research directions to be funded as announced in the project guide.

  (3) In the application, select “Major Research Plan” for the funding category, select “Cultivation Project” or “Key Support Project” for the subcategory description, select “Explainable and Generalizable Next-Generation Artificial Intelligence Methods” for the annotation, select T01 for the acceptance code, and select no more than 5 application codes based on the specific research content of the application. The number of cooperative research units for cultivation projects and key support projects shall not exceed 2.

  (4) At the beginning of the application, the applicant should clearly state which funded research direction in this project guide the application addresses, and how it contributes to solving the core scientific issues and achieving the scientific goals of this major research plan.

  If the applicant has already undertaken other science and technology projects related to this major research plan, the differences and connections between the applied project and other related projects should be discussed in the “Research Basis and Working Conditions” section of the main body of the application.

  2. The host institution shall complete its commitment statement, organize the applications, and review the application materials as required. It shall submit its electronic applications and attached materials through the information system item by item before 16:00 on April 22, 2024, and submit its project application list online before 16:00 on April 23, 2024.

  3. Other precautions.

  (1) To achieve the overall scientific goals and multidisciplinary integration of the major research plan, the leader of a funded project should commit to abiding by the relevant data and information management and sharing regulations. During project implementation, attention should be paid to maintaining a mutually supportive relationship with the other projects of this major research plan.

  (2) To strengthen academic exchange among projects and to promote the formation of project clusters and multidisciplinary integration, this major research plan will hold an annual academic exchange meeting for funded projects and will organize academic seminars in related fields from time to time. The leader of a funded project is obliged to participate in these academic exchange activities organized by the guidance expert group and the management working group of this major research plan.

  (4) Consultation methods.

  Department of Interdisciplinary Sciences

  Contact number: 010-62328382