Photo | Gu Weihao at the China EV100 Forum (2024)
The China EV100 Forum (2024), themed "Consolidating and Expanding the Development Advantages of New Energy Vehicles", was held in Beijing from March 15 to March 17. Wu Hequan, academician of the Chinese Academy of Engineering, Ouyang Minggao, vice chairman of China EV100 and academician of the Chinese Academy of Sciences, and other guests from government departments, research institutions, and enterprises gathered to discuss trends in industrial change, explore new paths for industrial development, and offer suggestions for the prosperity and development of the new energy vehicle industry.
In recent years, global technology competition has become increasingly fierce. The arrival of ChatGPT and Sora has triggered a wave of domestic large models, and intelligent driving is becoming the protagonist of the second half of the automotive revolution. Haomo has taken the lead in preparing for the autonomous driving 3.0 era, which is characterized by large models, large computing power, and big data. It aims to build new productive forces for smart cars and hopes to use technology to advance the industry.
Gu Weihao believes that end-to-end autonomous driving is an important future technological direction, but that it will take several years to arrive. The past few years have therefore been a process of moving from dispersion to aggregation, in which perception models, cognitive models, and control models are gradually brought together.
Since releasing the industry's first generative large model for autonomous driving, DriveGPT Xuehu Hairuo, Haomo has continued to invest in the research, development, and innovation of large-model technology. It has achieved significant breakthroughs in data screening and mining, automatic annotation, generative simulation, and cognitive interpretability.
Gu Weihao explained that compared with the traditional modular framework mainly used in the 2.0 era, the technical framework of the 3.0 era will undergo disruptive changes.
- First, autonomous driving will achieve breakthroughs in the capabilities of large perception models and large cognitive models in the cloud. The many small models on the vehicle side will gradually be unified into a perception model and a cognitive model, and the control module will also be converted into an AI model.
- Second, the vehicle-side intelligent driving system will evolve along two lines: the full pipeline will gradually be replaced by models, and those models will grow in scale, that is, small models will gradually be unified into large models.
- Third, large models in the cloud can gradually improve the vehicle's perception capabilities through techniques such as pruning and distillation. Where communication conditions are good, large models could even control the vehicle remotely through vehicle-cloud collaboration.
- Finally, both the vehicle side and the cloud will eventually run end-to-end autonomous driving models.
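To make the distillation step mentioned above more concrete, here is a minimal sketch of the standard knowledge-distillation loss, in which a small "student" model is trained to match the softened output distribution of a large "teacher" model. This is the generic textbook technique, not Haomo's implementation, which the talk does not describe. The function names, the temperature value, and the example logits are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    Hypothetical helper for illustration: in real training this term is
    usually combined with the ordinary supervised loss on labeled data.
    """
    p = softmax(teacher_logits, temperature)  # teacher (cloud-side large model)
    q = softmax(student_logits, temperature)  # student (vehicle-side small model)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Made-up class scores for one detection (e.g. car / pedestrian / cyclist).
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 1.0]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which is what lets a compact vehicle-side model inherit behavior from a much larger cloud model.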
In the perception stage, DriveGPT first learns the real physical world by building a large visual perception model, which models the world as a three-dimensional space and adds the time dimension to form a 4D vector space. On top of this 4D perception of the physical world, Haomo then introduces an open-source multi-modal image-text model to build a more general semantic perception model that integrates text, image, and video information. This aligns the 4D vector space with the semantic space and gives the system a human-like ability to "recognize everything".
In the cognitive stage, building on the "recognize everything" capability provided by the general semantic perception model, DriveGPT describes the driving environment and driving intentions in a purpose-built driving language (Drive Language). It combines this description with navigation guidance and the vehicle's historical actions, and draws on the knowledge of an external large language model (LLM) to assist in making driving decisions.
Because a large language model has learned and compressed knowledge from across human society, it also contains driving-related knowledge. Haomo has specially trained and fine-tuned the large language model for the autonomous driving task, so that it can understand the driving environment, explain driving behavior, and make driving decisions.
By combining the large cognitive model with the large language model, autonomous driving cognitive decision-making acquires the common sense and reasoning capabilities of human society, that is, world knowledge, thereby improving the interpretability and generalization of the autonomous driving strategy.
During the speech, Gu Weihao shared the mass-production results of Haomo's products. Haomo has so far launched seven highly cost-effective HPilot intelligent driving products for passenger cars, which can meet the mass-production needs of models in the high, medium, and low price ranges. Among them, three assisted-driving products in the thousand-yuan price class, HP170, HP370, and HP570, are gradually being delivered. As of March 2024, HPilot has been fitted to more than 20 vehicle models, and users' assisted-driving mileage has exceeded 130 million kilometers.
Recently, in the latest rankings of nuScenes, an authoritative international autonomous driving benchmark, Haomo took first place in the vision-only 3D object detection task (nuScenes Detection task) on the leaderboard that excludes external data, raising the key metric, the nuScenes Detection Score (NDS), to 68.8%.
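For readers unfamiliar with the metric, NDS as defined by the nuScenes benchmark combines mean average precision (mAP) with five mean true-positive error terms (translation, scale, orientation, velocity, and attribute error). The sketch below shows that combination; the function name is hypothetical and the input values are invented for illustration only, not Haomo's submitted results.

```python
def nuscenes_detection_score(mean_ap, tp_errors):
    """NDS = (5 * mAP + sum(1 - min(1, err) for each of the 5 TP errors)) / 10."""
    if len(tp_errors) != 5:
        raise ValueError("NDS expects exactly five true-positive error metrics")
    # Each error is clipped to [0, 1] and turned into a score where higher is better.
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Illustrative values only (not taken from any real leaderboard entry).
example_errors = {
    "mATE": 0.2,  # translation error
    "mASE": 0.2,  # scale error
    "mAOE": 0.2,  # orientation error
    "mAVE": 0.2,  # velocity error
    "mAAE": 0.2,  # attribute error
}
example_nds = nuscenes_detection_score(0.6, example_errors)
```

Because mAP carries half of the total weight, a detector cannot score well on NDS through accurate box geometry alone; it also has to find the objects in the first place.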
Gu Weihao said: "We hope to join hands with colleagues across the industry and use the most advanced technologies and products to support the development and growth of the Chinese and global automobile industries."