Artificial Intelligence of Multi-modality Group (AIM Group)

Multimodal Artificial Intelligence Laboratory (多模态人工智能实验室)

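Artificial intelligence, a strategic enabling technology that will shape the future, is driving a new round of scientific and technological revolution and industrial transformation. The Artificial Intelligence of Multi-modality Group (AIM Group) is led by Professor Fan Liu (刘凡) of the College of Computer and Information, Hohai University. The group studies key problems in artificial intelligence, including computer vision, machine learning, pattern recognition, and multimodal deep learning. Building on Hohai University's distinctive strengths, the AIM Group also pursues interdisciplinary "AI + smart water conservancy" research that combines computer science with the university's traditionally strong disciplines, such as water conservancy, civil engineering, ocean engineering, and modern agriculture, to help the water sector move from informatization to intelligent operation. Current research directions include, but are not limited to: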

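Large-scale vision-language multimodal pre-training for remote sensing
Few-shot downstream generalization of pre-trained foundation models
UAV multimodal environment perception and autonomous guidance
Domain-knowledge-based intelligent diagnostic question answering systems
Music-driven conducting motion generation
Data-driven hydrological time-series forecasting
Computer-vision-based dam and bridge monitoring
Single-sample multimodal face recognition and analysis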

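The lab admits master's and Ph.D. students every year. Students interested in the AIM Group's research directions are welcome to send a CV to fanliu@hhu.edu.cn, together with a brief description of relevant experience and research interests. The AIM Group also welcomes outstanding Hohai University undergraduates. Past undergraduate members have gone on to graduate study at Carnegie Mellon University, the Hong Kong University of Science and Technology, Zhejiang University, and other universities, or to positions at technology companies such as Huawei, Baidu, JD.com, Alibaba, and Megvii.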

Lab News

2025

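2025-06-12 The Multimodal Artificial Intelligence Studio was approved as a key project among Hohai University's 2025 Outstanding Undergraduate Mentor Studios!
2025-06-03 Professor Fan Liu received the 2025 Outstanding Young Teacher Award of the National Research Association for Computer Education in Higher Education Institutions (全国高等学校计算机教育研究会)!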
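2025-05-26 We proposed a new unified multi-task paradigm centered on referring expression segmentation, which extends efficiently to 8 remote sensing visual perception tasks without additional task heads. Based on this paradigm we built RemoteSAM, which, with 180M parameters, outperforms 7B-parameter models such as LHRS-Bot and GeoChat on a range of remote sensing vision tasks. The arXiv preprint and code are now public.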
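2025-05-21 Professor Fan Liu won a second prize at the 5th Jiangsu Province University Innovation Competition (第五届江苏省高校创新大赛)!
2025-04-23 Our paper “Boost UAV-based Object Detection via Scale-Invariant Feature Disentanglement and Adversarial Learning” has been accepted by IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), a Zone 1 TOP journal. Congratulations to Liang Yao (姚亮)!
2025-03-05 Our paper “Power Line Aerial Image Restoration under Adverse Weather: Datasets and Baselines” has been accepted by J-STARS, a Zone 2 TOP journal.
2025-02-12 We built AirNavigation, a simulation system for autonomous UAV guidance. It provides an algorithm-agnostic UAV flight control interface and integrates a large language model to make algorithm performance evaluation more flexible. An introductory video of the system is now public.
2025-01-27 Our paper on remote sensing visual question answering, “Co-LLaVA: Efficient Remote Sensing Visual Question Answering via Model Collaboration”, has been accepted by Remote Sensing, a Zone 2 journal. Congratulations to Wenwen Dai (戴雯雯) and Jiale Zhu (朱佳乐)!
2025-01-10 Our paper “Collaborative semantic contrastive for all-in-one image restoration” has been accepted by Engineering Applications of Artificial Intelligence, a Zone 2 TOP journal.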

2024

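2024-12-31 Professor Fan Liu won a second prize in the 1st National Smart Teaching Case Competition for University Teachers of Electronic Information Majors (Haopu Cup, 浩埔杯)!
2024-12-28 Professor Fan Liu received the Teaching Rising Star Award of the Jiangsu Computer Society!
2024-12-21 Our paper on lightweight models, “RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification”, has been accepted by the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025), a CCF-B conference. Congratulations to Guangwenjie Zou (邹广文杰, undergraduate) and Liang Yao (姚亮)!
2024-12-21 Our paper on semantic contour extraction, “Prompting DirectSAM for Semantic Contour Extraction in Remote Sensing Images”, has been accepted by the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025), a CCF-B conference. Congratulations to Shiyu Miao (缪师宇, undergraduate) and Delong Chen (陈德龙)!
2024-12-10 Our paper on few-shot learning, “Making Large Vision Language Models to be Good Few-shot Learners”, has been accepted by the 39th Annual AAAI Conference on Artificial Intelligence (AAAI 2025), a CCF-A conference. Congratulations to Wenwen Cai (蔡雯雯) and Jian Huo (霍健)! The preprint is public.
2024-12-09 Our paper on image reconstruction, “IPT-ILR: Image Pyramid Transformer Coupled with Information Loss Regularization for All-in-one Image Restoration”, has been accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), a Zone 1 TOP journal.
2024-11-23 Our paper “RemoteCLIP: A Vision Language Foundation Model for Remote Sensing” was selected as an ESI Highly Cited Paper!
2024-11-06 Our paper on few-shot learning, “Few-shot Image Classification Based on Local Contrastive Learning and Novel Class Feature Generation” (基于局部对比学习与新类特征生成的小样本图像分类), has been accepted by Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), a CCF-B journal. Congratulations to Ning Chen (陈宁) and Chenwei Dong (董晨炜)!
2024-11-04 Our research project “Few-shot Learning Theory and Key Technologies in Complex Environments” (复杂环境下小样本学习理论及关键技术) won a second prize of the Science and Technology Award (Technological Innovation) of the Jiangsu Society of Information Technology Application (江苏省信息技术应用学会)!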
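2024-11-03 Lab conference record: Ning Chen (陈宁) and colleagues traveled to Lianyungang for the Jiangsu Graduate Academic Innovation Forum on Frontiers of Intelligent Control and Marine Information Processing (江苏省研究生智能控制与海洋信息处理前沿学术创新论坛), where they gave an oral presentation on our few-shot image classification paper.
Record of the Jiangsu Graduate Academic Innovation Forum on Frontiers of Intelligent Control and Marine Information Processing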
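2024-10-29 Our paper on knowledge distillation for UAV object detection, “Domain-invariant Progressive Knowledge Distillation for UAV-based Object Detection”, has been accepted by IEEE Geoscience and Remote Sensing Letters (IEEE GRSL). Congratulations to Liang Yao (姚亮) and Zhiquan Ou (欧志权)!
2024-10-18 Several of our papers were submitted to the 2024 Jiangsu Graduate Academic Innovation Forum on Frontiers of Intelligent Control and Marine Information Processing and received an Outstanding Paper First Prize, an Outstanding Paper Third Prize, and an Outstanding Poster Award. Congratulations to Ning Chen (陈宁), Xiaocong Zhou (周晓聪), Jiale Zhu (朱佳乐), and Wenwen Dai (戴雯雯)!
2024-10-10 Our paper on computer education, “Automatic Test Paper Generation Based on Cognitive Diagnosis and a Large-Model-Optimized Genetic Algorithm” (基于认知诊断与大模型优化遗传算法的自动组卷方法), received an Outstanding Paper First Prize at the China Computer Practice Education Conference (CPEC2024)!
2024-10-08 Professor Fan Liu won a second prize in the 2nd National University Artificial Intelligence Teachers' Teaching Creativity Competition (第二届全国高校人工智能教师教学创意竞赛)!
2024-10-04 Our paper “A Encoder-Decoder Framework for Foundation Model-based Remote Sensing Semantic Segmentation” has been accepted by MMAsia, a CCF-C conference. Congratulations to Jiale Zhu (朱佳乐)!
2024-10-04 Our paper “Feature-weighted Multi-stage Bayesian Prototype for Few-shot Classification” has been accepted by MMAsia, a CCF-C conference. Congratulations to Xiaocong Zhou (周晓聪)!
2024-09-26 Our paper “Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models” has been accepted by the Conference and Workshop on Neural Information Processing Systems (NeurIPS 2024), a CCF-A conference!
2024-09-16 Our paper on poultry disease diagnosis, “Unifying Large Language Models and Knowledge Graphs for Poultry Diseases Diagnosis”, has been accepted by the 18th Chinese Conference on Biometric Recognition. Congratulations to Shengxiang Xu (徐圣翔, undergraduate)!
2024-08-22 The Hohai University outstanding undergraduate thesis of Zhiquan Ou (欧志权, undergraduate), “A Hierarchical Progressive Distillation Method for UAV Object Detection” (分级递进的无人机目标检测蒸馏方法), received a third prize for Outstanding Undergraduate Thesis from the Jiangsu Instrument and Control Society (江苏省仪器仪表学会). Congratulations!
2024-08-13 Our paper on few-shot adaptation of multimodal foundation models, “Few-shot Adaptation of Multi-modal Foundation Models: A Survey”, has been accepted by Artificial Intelligence Review, a Zone 2 TOP journal. Congratulations to Tianshu Zhang (张天舒), Wenwen Dai (戴雯雯), Wenwen Cai (蔡雯雯), and Xiaocong Zhou (周晓聪)!
2024-08-11 Congratulations to the undergraduate team of Yijun Shen (沈逸骏), Hao Sun (孙昊), Ziyang Guo (郭子扬), Shuhua Cao (曹书华), and Xiangyu Gao (高翔宇) on winning a national second prize in the 17th Chinese Collegiate Computing Competition (中国大学生计算机设计大赛)!
2024-07-20 Our paper on few-shot image recognition, “Few-shot Classification Model Compression via School Learning”, has been accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), a Zone 1 TOP journal.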
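2024-06-11 We built UEMM-Air, a simulated multimodal UAV object detection dataset. To our knowledge it has the most paired modalities of any existing UAV object detection dataset, covering 5 modalities: visible light, depth, surface normals, segmentation, and UAV IMU parameters. The dataset paper and access instructions are public.
2024-06-04 Congratulations to the undergraduate team of Yijun Shen (沈逸骏), Hao Sun (孙昊), Ziyang Guo (郭子扬), Shuhua Cao (曹书华), and Xiangyu Gao (高翔宇) on winning a national third prize in the 15th China College Students' Service Outsourcing Innovation and Entrepreneurship Competition, with the project “Design and Development of an Intelligent Grading Platform Based on the ERNIE (文心) Large Model”!
2024-05-28 Lab conference record: Liang Yao (姚亮) traveled to Istanbul, Turkey, for the FG2024 conference, where he gave a Spotlight talk on our paper about UAV face recognition.
FG2024 conference record
2024-05-22 We applied scale-invariant feature disentanglement to UAV object detection for the first time and designed SIFDAL, a scale-invariant feature adversarial disentanglement module that can be plugged into any FPN-based detector. Adding the module effectively improves the accuracy of single-stage UAV object detectors. The arXiv preprint is now public.
2024-05-08 Congratulations to Liang Yao (姚亮) on being funded by the Jiangsu Postgraduate Research and Practice Innovation Program!
2024-04-29 The Hohai University outstanding undergraduate thesis of Yanling Pan (潘艳玲, undergraduate), “Research on Cross-lingual Medical Knowledge Graph Construction” (跨语言医学知识图谱构建技术研究), received a second prize for Jiangsu Province Outstanding Graduation Thesis. Congratulations!
2024-04-26 Professor Fan Liu won a second prize in the 4th Jiangsu Province University Teachers' Teaching Innovation Competition (第四届江苏省高校教师教学创新大赛)!
2024-04-03 Our paper on a remote sensing vision-language foundation model, “RemoteCLIP: A Vision Language Foundation Model for Visual Recognition of Earth Observations”, has been accepted by IEEE Transactions on Geoscience and Remote Sensing (CAS Zone 1 TOP journal). Congratulations to Zhangqingyun Guan (管张青云), Xiaocong Zhou (周晓聪), and Jiale Zhu (朱佳乐)! The paper and code are now public.
2024-03-06 Our paper “AerialFace: A Light Weight Framework for Unmanned Aerial Vehicle Face Recognition” has been accepted by the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG2024), a CCF-C conference. Congratulations to Zhiquan Ou (欧志权, undergraduate), Liang Yao (姚亮), and Ting Wu (吴婷)!
2024-02-22 Professor Fan Liu received the 2023 Youth Science and Technology Award of the Jiangsu Association of Automation (江苏省自动化学会)!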

2023

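2023-12-11 Our paper “A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects” was selected as one of Jiangsu Province's Top 100 Outstanding Academic Papers in Natural Science for 2023 (2023年江苏省自然科学百篇优秀学术成果论文)!
2023-11-08 Our work on large-scale multimodal pre-training, “ProtoCLIP: Prototypical Contrastive Language Image Pretraining”, has been accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS), a CAS Zone 1 TOP journal! ProtoCLIP is an efficient vision-language pre-training method based on prototype clustering. It improves accuracy over CLIP by +5.81% on ImageNet linear evaluation and by +2.01% on ImageNet zero-shot prediction. Large-scale experiments on a 14-million-sample dataset show that ProtoCLIP reaches performance close to CLIP with a 3x reduction in training time. The preprint and code are public.
2023-11-05 We won a national third prize and a Jiangsu provincial third prize in the 11th National College Student Digital Media Technology Works and Creativity Competition (第十一届全国大学生数字媒体科技作品及创意竞赛). Congratulations to the team of Haoqian Zhang (张颢骞), Daojie Zhou (周道杰), Jinfeng Cui (崔金凤), Guoxin Jiang (蒋郭鑫), and Jian Huo (霍健), and to the team of Shengxiang Xu (徐圣翔, undergraduate), Zhangqingyun Guan (管张青云), Shuo Gao (高硕, undergraduate), and Xing Gao (高兴, undergraduate)!
2023-11-01 Professor Fan Liu won a first prize in the 2023 Jiangsu Higher Education Micro-lecture Teaching Competition (江苏省高等学校微课教学比赛).
2023-10-01 Our paper on single-sample face recognition, “Single Sample Face Recognition Based on Identity-Attribute Disentanglement and Adversarial Feature Augmentation”, has been accepted by the 17th Chinese Conference on Biometric Recognition (CCBR2023). Congratulations to Liang Yao (姚亮), Zhiquan Ou (欧志权, undergraduate), and Fei Wang (王菲)!
2023-08-29 The paper “Back Point Cloud Registration Based on Structured Light and CT” (基于结构光和CT的背部点云配准算法研究) has been accepted by Laser & Optoelectronics Progress (《激光与光电子学进展》). Congratulations to Chunmei Shen (沈春梅)!
2023-08-25 The paper “Orchestra Conducting Motion Generation Based on Dynamic Frequency-Domain Decomposition” (基于动态频域分解的乐队指挥动作生成) has been accepted by Application Research of Computers (《计算机应用研究》). Congratulations to Xin He (贺鑫) and Ruizhi Zhou (周睿志, undergraduate)!
2023-08-21 Our paper on few-shot image recognition, “Few-shot Classification Guided by Generalization Error Bound”, has been accepted by Pattern Recognition (CAS Zone 1 TOP journal).
2023-07-19 Our paper on few-shot image recognition, “JLCSR: Joint Learning of Compactness and Separability Representation for Few-shot Classification”, has been accepted by IEEE Transactions on Cognitive and Developmental Systems.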
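2023-06-19 We built RemoteCLIP, the first general-purpose vision-language foundation model for remote sensing. By unifying heterogeneous multi-source annotations into natural-language-centered image descriptions, we expanded the pre-training dataset to 12 times the size of existing data. In remote sensing image-text retrieval, RemoteCLIP leads the previous best methods by a large margin on the RSITMD and RSICD datasets (+9.14% and +8.92%, respectively). On zero-shot recognition across 12 downstream datasets, RemoteCLIP exceeds the baseline accuracy by 6.39%. The arXiv preprint is now public.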
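2023-06-04 Congratulations to the team of Zhiquan Ou (欧志权), Minqian Yan (严旻茜), Yuyang Liu (刘宇洋), Yifan Liu (刘亦凡), and Yangyang Ding (丁洋洋) on winning a national third prize in the 14th China College Students' Service Outsourcing Innovation and Entrepreneurship Competition (secure face recognition authentication system track)!
2023-05-29 Our paper “A Survey of Few-shot Transfer Methods for Multimodal Large Models” (多模态大模型小样本迁移方法研究进展综述) has been accepted by the 2023 China Multimedia Conference (ChinaMM). Congratulations to Tianshu Zhang (张天舒), Delong Chen (陈德龙), Zhangqingyun Guan (管张青云), Wenwen Cai (蔡雯雯), and Xiaocong Zhou (周晓聪)!
2023-04-20 Our paper on few-shot image recognition, “Few-shot Classification via Ensemble Learning with Multi-Order Statistics”, has been accepted by IJCAI-23, a CCF-A conference!
2023-03-11 Our paper on a large-scale multimodal e-commerce dataset, “MEP-3M: A Large-scale Multi-modal E-Commerce Products Dataset”, has been accepted by Pattern Recognition, a CAS Zone 1 TOP journal. Congratulations to Delong Chen (陈德龙) and Ruizhuo Gao (高睿琢)!
2023-01-22 Happy Chinese New Year! The AIM Group has officially launched its new homepage at https://multimodality.group!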

2022

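2022-12 This year, AIM Group members shared nearly 800 documents, including close-reading notes on 150+ papers and study notes for 4 online courses.
2022-12 Professor Fan Liu won a third prize in the 2022 Jiangsu Higher Education Micro-lecture Teaching Competition and a first prize of the Youth Science and Technology Award of the Jiangsu Society of Information Technology Application (江苏省信息技术应用学会).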
2022-07 Our survey on convolutional neural networks, “A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects”, was named an ESI Highly Cited Paper and Hot Paper!
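2022-07 Our survey on deep single-sample face recognition, “Deep Learning Based Single Sample Per Person Face Recognition: A Survey”, has been published in the SCI journal Artificial Intelligence Review (JCR Q1, CAS Zone 2, impact factor 9.588). Congratulations to Delong Chen (陈德龙, undergraduate) and Fei Wang (王菲)!
2022-07 Professor Fan Liu was selected as an Outstanding Backbone Teacher of the Jiangsu Qinglan Project (青蓝工程) and for the Young Scientific and Technological Talent Support Program of the Jiangsu Association for Science and Technology.
2022-06 The Hohai University outstanding undergraduate thesis of Delong Chen (陈德龙), “A Conducting Motion Generation System Based on Dynamic Frequency-Domain Decomposition and Cross-Modal Perception” (基于动态频域分解与跨模态感知的乐队指挥动作生成系统), received a first prize for Jiangsu Province Outstanding Graduation Thesis. Congratulations!
2022-03 The paper “Self-Supervised Music Motion Synchronization Learning for Music-Driven Conducting Motion Generation” has been accepted by the Journal of Computer Science and Technology (计算机科学与技术学报), a CCF-B journal! The ConductorMotion100 dataset built in the paper has been released as the task for the motion cognition track of the first international “Yuanjian Cup” (远见杯) Meta-Intelligence Data Challenge, hosted by the Jiangsu Computer Society (competition page).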

2021

2021-09 Introduction to JSP (《JSP基础入门》), written and edited by Professor Fan Liu, has been published by Tsinghua University Press.
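2021-08 MEP-3M, our large-scale multimodal e-commerce product dataset containing three million image-text pairs, was named Best Dataset Paper at the Long-Tailed Distribution Learning (LTDL) workshop of IJCAI'21, a CCF-A top conference!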
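2021-08 Our paper “Crack Detection Based on Parallel Attention UNet” (基于并行注意力UNet的裂缝检测方法) has been published in the Journal of Computer Research and Development (计算机研究与发展; a CCF-recommended Class A Chinese journal, a T1 high-quality journal in computing, and a Peking University Chinese core journal). Congratulations to Junfeng Wang (王君锋) and Zhiyu Chen (陈峙宇)!
2021-07 VirtualConductor, our music-driven conducting video generation system, won Best Demo at ICME'21, a CCF-B conference!
2021-07 Congratulations to Delong Chen (陈德龙, undergraduate), whose oral presentation of the paper “Significant Wave Height Prediction based on Wavelet Graph Neural Network” at the international conference BDAI'21 received the Best Presentation award!
2021-06 Our paper “A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects” has been accepted by IEEE Transactions on Neural Networks and Learning Systems (CAS Zone 1 TOP journal, impact factor 14.255). Congratulations to Zewen Li (李泽文), Wenjie Yang (杨文杰), and Shouheng Peng (彭守恒), all undergraduates!
2021-06 Our paper on cross-modality person re-identification, “Global-Local Multiple Granularity Learning for Cross-Modality Visible-Infrared Person Reidentification”, has been published in IEEE Transactions on Neural Networks and Learning Systems (CAS Zone 1 TOP journal, impact factor 14.255)!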

Research Areas and Outputs

▶ Large-scale vision-language multimodal pre-training

Shiyu Miao (undergraduate), Delong Chen, Fan Liu*, et al. Prompting DirectSAM for Semantic Contour Extraction in Remote Sensing Images. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025: 1-5. (CCF-B)
Fan Liu*, Delong Chen, Zhangqingyun Guan, et al. RemoteCLIP: A Vision Language Foundation Model for Remote Sensing. IEEE Transactions on Geoscience and Remote Sensing, 2024. (SCI, Zone 1 TOP journal)
Fan Liu*, Delong Chen, Xiaoyu Du, Ruizhuo Gao, Feng Xu. MEP-3M: A Large-scale Multi-modal E-Commerce Products Dataset. Pattern Recognition, 2023 (SCI, CAS Zone 1 TOP journal, JCR Q1, IF: 8.518).
Delong Chen (undergraduate), Fan Liu*, Xiaoyu Du, Ruizhuo Gao, Feng Xu. MEP-3M: A Large-scale Multi-modal E-Commerce Products Dataset. IJCAI 2021 Workshop on Long-Tailed Distribution Learning (workshop of a CCF-A conference, Oral). [Dataset homepage][Best Dataset Paper Award]
Delong Chen, Zhao Wu, Fan Liu*, et al. ProtoCLIP: Prototypical Contrastive Language Image Pretraining. ArXiv Pre-print, 2022. [Code]
Yanling Pan (undergraduate), Ruizhi Zhou (undergraduate), Gang Zhao, Weijuan Zhang, Delong Chen, Fan Liu*. MDF-Net: Multimodal Deep Fusion for Large-Scale Product Recognition. The 16th Chinese Conference on Biometric Recognition, Biometric Recognition, CCBR 2022.
Developed ITRA (Image Text Representation Alignment), an open-source codebase for image-text feature alignment. [Docs][Code]
National invention patent: An e-commerce category attribute mining method based on product text classification. Patent No.: 201910599049.6.
National invention patent: An image annotation recommendation algorithm based on frequent itemsets. Patent No.: 201811516054.8.
National invention patent: A multi-scale feature fusion method for remote sensing image semantic segmentation based on the RemoteCLIP image encoder. Patent No.: 202410414668.4.
National invention patent: An underwater object detection method based on vision-language model knowledge distillation. Patent No.: 202411570471.6.
National invention patent: A remote sensing image-text retrieval method based on adaptive local knowledge filling. Patent No.: 202411570529.7.

▶ Aerial and space remote sensing multimodal environment perception

Liang Yao, Fan Liu*, Delong Chen, et al. RemoteSAM: Towards Segment Anything for Earth Observation.
Guangwenjie Zou (undergraduate), Liang Yao, Fan Liu*, et al. RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025: 1-5. (CCF-B)
Jiale Zhu, Liang Yao, Fan Liu*, et al. A Encoder-Decoder Framework for Foundation Model-based Remote Sensing Semantic Segmentation. Proceedings of the 6th ACM International Conference on Multimedia in Asia Workshops. 2024: 1-7. (CCF-C)
Fan Liu*, Liang Yao, et al. Boost UAV-based Object Detection via Scale-Invariant Feature Disentanglement and Adversarial Learning. IEEE Transactions on Geoscience and Remote Sensing, 2025. (SCI, Zone 1 TOP journal, IF: 8.250)
Sai Yang, Bin Hu, Bojun Zhou, Fan Liu*, et al. Power Line Aerial Image Restoration under Adverse Weather: Datasets and Baselines. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2025. (SCI, Zone 2 TOP journal)
Liang Yao, Fan Liu*, Chuanyi Zhang, et al. Domain-Invariant Progressive Knowledge Distillation for UAV-Based Object Detection. IEEE Geoscience and Remote Sensing Letters, 2024.
Xin Li, Feng Xu, Hongmin Gao, Fan Liu, et al. A frequency domain feature-guided network for semantic segmentation of remote sensing images. IEEE Signal Processing Letters, 2024. (SCI, Zone 2 TOP journal)
Xin Li, Feng Xu, Linyang Li, Nan Xu, Fan Liu, et al. AAFormer: Attention-attended transformer for semantic segmentation of remote sensing images. IEEE Geoscience and Remote Sensing Letters, 2024. (SCI, Zone 2 TOP journal)
Xin Li, Feng Xu, Fan Liu, et al. Semantic segmentation of remote sensing images by interactive representation refinement and geometric prior-guided inference. IEEE Transactions on Geoscience and Remote Sensing, 2023, 62: 1-18. (SCI, Zone 1 TOP journal)
Xin Li, Feng Xu, Fan Liu, et al. A synergistical attention model for semantic segmentation of remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 1-16. (SCI, Zone 1 TOP journal)
Xin Li, Feng Xu, Fan Liu, et al. Hybridizing Euclidean and hyperbolic similarities for attentively refining representations in semantic segmentation of remote sensing images. IEEE Geoscience and Remote Sensing Letters, 2022, 19: 1-5. (SCI, Zone 2 TOP journal)
National invention patent: A model pruning method based on channel attention and adaptive learning. Patent No.: 202411570310.7.
National invention patent: A method and apparatus for remote sensing image semantic segmentation. Patent No.: 202210478048.8.
National invention patent: A UAV target visual localization method based on multi-source priors. Patent No.: 202311588919.2.
National invention patent: A UAV object detection distillation method based on hierarchical progression and collective knowledge. Patent No.: 202410870827.1.

▶ Few-shot downstream generalization of pre-trained foundation models

Fan Liu*, Wenwen Cai, Jian Huo, et al. Making Large Vision Language Models to be Good Few-shot Learners. Proceedings of the AAAI Conference on Artificial Intelligence. 2025, 39(5): 5415-5423. (CCF-A)
Ning Chen, Fan Liu*, Chenwei Dong, et al. Few-shot Image Classification Based on Local Contrastive Learning and Novel Class Feature Generation (基于局部对比学习与新类特征生成的小样本图像分类). Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2024: 1. (CCF-B)
Xiaocong Zhou, Fan Liu*, et al. Feature-weighted Multi-stage Bayesian Prototype for Few-shot Classification. Proceedings of the 6th ACM International Conference on Multimedia in Asia. 2024: 1-7. (CCF-C)
Fan Liu*, Tianshu Zhang, Wenwen Dai, Chuanyi Zhang, Wenwen Cai, Xiaocong Zhou, et al. Few-shot adaptation of multi-modal foundation models: A survey. Artificial Intelligence Review, 2024, 57(10): 268. (SCI, Zone 2 TOP journal, IF: 9.588)
Sai Yang, Fan Liu*, Delong Chen, et al. Few-shot Classification Model Compression via School Learning. IEEE Transactions on Circuits and Systems for Video Technology, 2024. (SCI, Zone 1 TOP journal)
Tianshu Zhang, Wenwen Dai, Zhiyu Chen, Sai Yang, Fan Liu*, et al. Few-Shot Image Classification via Mutual Distillation. Applied Sciences, 2023, 13(24): 13284.
National invention patent: A few-shot image classification method based on global and local contrastive learning. Patent No.: 202311588879.1.
National invention patent: A few-shot object detection method based on meta-learning. Patent No.: 202310322905.X.

▶ Single-sample multimodal face recognition and analysis

Zhiquan Ou (undergraduate), Liang Yao, Ting Wu, et al. AerialFace: A Light Weight Framework for Unmanned Aerial Vehicle Face Recognition. 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG). IEEE, 2024: 1-7. (CCF-C)
Liang Yao, Fan Liu*, Zhiquan Ou (undergraduate), et al. Single Sample Face Recognition Based on Identity-Attribute Disentanglement and Adversarial Feature Augmentation. Chinese Conference on Biometric Recognition. Singapore: Springer Nature Singapore, 2023: 212-222.
Fan Liu*, Fei Wang, Yuhua Ding, et al. SOM-based binary coding for single sample face recognition. Journal of Ambient Intelligence and Humanized Computing, 2022, 13(12): 5861-5871.
Fan Liu*, Delong Chen (undergraduate), et al. Deep Learning based Single Sample Face Recognition: A Survey. Artificial Intelligence Review, AIRE, 2022 (SCI, JCR Q1, IF: 9.588). [Preprint]
Fan Liu*, Delong Chen (undergraduate), et al. A Review of Driver Fatigue Detection and Its Advances on the Use of RGB-D Camera and Deep Learning. Engineering Applications of Artificial Intelligence, EAAI, 2022 (SCI, JCR Q1, IF: 7.802).
Zewen Li (undergraduate), Fan Liu*, et al. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Transactions on Neural Networks and Learning Systems, 2021 (SCI, Zone 1 TOP journal, IF: 14.255, 459 citations, ESI Highly Cited Paper and Hot Paper). [Preprint]
Liyan Zhang, Guodong Du, Fan Liu*, et al. Global-Local Multiple Granularity Learning for Cross-Modality Visible-Infrared Person Reidentification. IEEE Transactions on Neural Networks and Learning Systems, 2021 (SCI, Zone 1 TOP journal, IF: 14.255).
Qiaolin Ye, Jian Yang, Fan Liu, Chunxia Zhao, Ning Ye, Tongming Yin. L1-norm Distance Linear Discriminant Analysis Based on An Effective Iterative Algorithm. IEEE Transactions on Circuits and Systems for Video Technology, 2018 (SCI, Zone 1 TOP journal, IF: 5.859, 97 citations).
Fan Liu*, Jinhui Tang, Yan Song, Ye Bi, Sai Yang. Local Structure based Multi-Phase Collaborative Representation for Face Recognition with Single Sample per Person. Information Sciences, 2016 (SCI, Zone 1 TOP journal, IF: 8.233, 101 citations).
Fan Liu, Jinhui Tang, Yan Song, Liyan Zhang, Zhenmin Tang. Local Structure based Sparse Representation for Face Recognition. ACM Transactions on Intelligent Systems and Technology, 2015 (SCI, JCR Q1, IF: 10.489).
Liyan Zhang, Fan Liu, Jinhui Tang. Real-time System for Driver Fatigue Detection by RGB-D Camera. ACM Transactions on Intelligent Systems and Technology, 2015 (SCI, JCR Q1, IF: 10.489).
Yan Song, Jinhui Tang, Fan Liu, Shui Cheng Yan. Body Surface Context: A New Robust Feature for Action Recognition From Depth Videos. IEEE Transactions on Circuits and Systems for Video Technology, 2014 (SCI, Zone 1 TOP journal, IF: 5.859, 65 citations).
Fan Liu, Zhenmin Tang, Jinhui Tang. WLBP: Weber Local Binary Pattern for Local Image Description. Neurocomputing, 2013 (SCI, TOP journal, IF: 5.779, 76 citations).
Fan Liu, Jinhui Tang, Ruizhen Zhao, Zhenmin Tang. Abnormal behavior recognition system for ATM monitoring by RGB-D camera. Proceedings of the 20th ACM international conference on Multimedia, ACM MM, 2012. (CCF-A conference, top venue in multimedia)
National invention patent: A single-sample face recognition method based on local subspace sparse representation. Patent No.: 201310700295.9.
National invention patent: A single-sample face recognition method based on semi-supervised sub-block joint regression. Patent No.: 201611010248.1.
National invention patent: A single-sample face recognition method based on block-sparse structured low-rank representation. Patent No.: 201610701068.1.
National invention patent: A multi-channel eye closure recognition method based on convolutional neural networks. Patent No.: 201711429165.0.
National invention patent: A single-sample face recognition method based on the bag-of-words model. Patent No.: 201711155556.8.
National invention patent: A single-sample face recognition method based on a recurrent autoencoder and block-sparse structured representation. Patent No.: 202011403336.4.
National invention patent: A face feature binary encoding and recognition method based on a two-layer self-organizing neural network. Patent No.: 202010627333.2.
National invention patent: A kinship verification method based on adversarial-learning disentanglement of facial variations. Patent No.: 202111386833.2.

▶ Data-driven hydrological time-series forecasting

Fan Liu, Xiaomin Lu, Dan Xu, Wenwen Dai (undergraduate), Huizhou Li. A Review of Ocean Wave Forecasting Methods (海浪预报方法研究进展). Journal of Hohai University (Natural Sciences) (河海大学学报 (自然科学版)), 2021 (Chinese core journal).
Fan Liu*, Feng Xu, Sai Yang. A Flood Forecasting Model Based on Deep Learning Algorithm via Integrating Stacked Autoencoders with BP Neural Network. 2017 IEEE Third International Conference on Multimedia Big Data (BigMM). [64 citations]
Delong Chen (undergraduate), Fan Liu, Zheqi Zhang, Xiaomin Lu, Zewen Li. Significant Wave Height Prediction based on Wavelet Graph Neural Network. 2021 IEEE 4th International Conference on Big Data and Artificial Intelligence, BDAI (Oral). [Best Presentation Award]
Delong Chen, Ruizhi Zhou (undergraduate), Yanling Pan (undergraduate), Fan Liu. A Simple Baseline for Adversarial Domain Adaptation-based Unsupervised Flood Forecasting. Technical Report, ArXiv (2022). [Code]
Developed HH💦Forecasting, an open-source codebase for data-driven hydrological time-series forecasting. [Code]

▶ Few-shot visual learning

Fan Liu, Sai Yang, Delong Chen, Huaxi Huang, Jun Zhou. Few-shot classification guided by generalization error bound. Pattern Recognition, 2023 (CCF-B).
Sai Yang, Fan Liu*, Shaoqiu Zheng, Ying Tan. JLCSR: Joint Learning of Compactness and Separability Representation for Few-shot Classification. IEEE Transactions on Cognitive and Developmental Systems, 2023.
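Tianshu Zhang, Fan Liu, Delong Chen, Zhangqingyun Guan, Wenwen Cai, Xiaocong Zhou. A Survey of Few-shot Transfer Methods for Multimodal Large Models (多模态大模型小样本迁移方法研究进展综述). China Multimedia Conference (ChinaMM), 2023.
Sai Yang, Fan Liu*, Delong Chen, Jun Zhou. Few-shot Classification via Ensemble Learning with Multi-Order Statistics. IJCAI-23, 2023 (CCF-A, Oral).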
Fan Liu, Feifan Li, Sai Yang. Few‐shot classification using Gaussianisation prototypical classifier. IET Computer Vision, 2022 (SCI).
Sai Yang, Fan Liu*, Zhiyu Chen. Feature hallucination in hypersphere space for few-shot classification. IET Image Process, 2022 (SCI).
Jiaying Wu, Ning Dong, Fan Liu, Sai Yang, Jinglu Hu. Feature hallucination via Maximum A Posteriori for few-shot learning. Knowledge-Based Systems, 2021 (SCI, CAS Zone 1, JCR Q1, CCF-recommended journal, IF: 8.139).
Sai Yang, Fan Liu, Ning Dong, Jiaying Wu. Comparative Analysis on Classical Meta-Metric Models for Few-Shot Learning. IEEE Access, 2020 (SCI).

▶ Computer-vision-based dam and bridge monitoring

Fan Liu, Junfeng Wang, Zhiyu Chen, Feng Xu. Crack Detection Based on Parallel Attention UNet (基于并行注意力UNet的裂缝检测方法). Journal of Computer Research and Development (计算机研究与发展), 2021. [CCF-A journal]
Junfeng Wang, Fan Liu*, Wenjie Yang (undergraduate), Guoyan Xu, Tao Zhang. Pavement Crack Detection Using Attention U-Net with Multiple Sources. Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020.
Fan Liu, Junfeng Wang, Delong Chen, et al. Asymmetric exponential loss function for crack segmentation. Multimedia Systems, 2022: 1-13. [SCI, CCF-recommended journal]
Kun Xie, Dong Lei, Wenkang Du, Pengxiang Bai, Feipeng Zhu, Fan Liu. A new operator based on edge detection for monitoring the cable under different illumination. Mechanical Systems and Signal Processing, 2023 (SCI, JCR Q1, IF: 8.934).
Kun Xie, Dong Lei, Wenkang Du, Pengxiang Bai, Feipeng Zhu, Fan Liu. The monitoring of bridge under complex illumination based on digital image technology. Measurement, 2023 (SCI, JCR Q1, IF: 5.131).
National invention patent: A dam image crack detection method based on transfer learning. Patent No.: 201810972498.6.
National invention patent: A dam crack detection method based on the U-net network and the SC-SAM attention mechanism. Patent No.: 202011049216.9.
National invention patent: A dam crack detection method based on fusion of multiple transfer learning models. Patent No.: 201910845138.4.
National invention patent: A multi-scale diffusion salient object detection method based on background and object priors. Patent No.: 201810243956.2.

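▶ Music-driven orchestra conducting motion generation

Fan Liu, Delong Chen (undergraduate), Ruizhi Zhou (undergraduate), Sai Yang, Feng Xu. Self-Supervised Music-Motion Synchronization Learning for Music-Driven Conducting Motion Generation. Journal of Computer Science and Technology (SCI, CCF-B journal). [Code]
Delong Chen (undergraduate), Fan Liu*, Zewen Li (undergraduate), Feng Xu. VirtualConductor: Music-driven Conducting Video Generation System. IEEE International Conference on Multimedia and Expo, ICME 2021 (CCF-B conference). [Best Demo Award]
Demo video: Virtual Conductor (虚拟指挥; 350k views and 815 comments on bilibili); demo video: AI conductor demo, all-star edition (人工智能指挥 demo（全明星）; 18k views on bilibili); Turing test video: Demo: music-driven conducting motion generation.
First international “Yuanjian Cup” (远见杯) Meta-Intelligence Data Challenge (hosted by the Jiangsu Computer Society), motion cognition track: music-driven conducting motion generation. [Competition page]
National invention patent: A music-driven conducting motion generation method based on dynamic frequency-domain decomposition. Patent No.: 202111090067.5.
National invention patent: An orchestra conducting motion generation method based on a self-supervised cross-modal perceptual loss. Patent No.: 202111090024.7.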

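College of Computer Science and Software Engineering, Hohai University (河海大学计算机与软件学院)

College of Artificial Intelligence and Automation, Hohai University (河海大学人工智能与自动化学院)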
