Logan Zou
I completed my undergraduate studies at the University of International Business and Economics, School of Information Technology and Management, majoring in Data Science and Big Data Technology.
I am a constant learner and explorer, enthusiastic about participating in open-source projects related to NLP and LLMs.
My open-source and academic interests are shown below.
If you are interested in my experience or have any questions to discuss, do not hesitate to contact me.
Email / GitHub
Research Interests
My research interests lie in natural language processing and large language models.
I am currently a master's student at the University of International Business and Economics,
supervised by Professor Dongyuan Lu.
My research focuses on large language models, including LLM-based agents, LLM fine-tuning, and LLM-based comparative analysis of industry and economic texts.
Within this direction, I have a strong interest in fine-tuning LLMs for specific domains,
prompt engineering, dialogue strategies, personalized evaluation, application development, and other aspects of LLMs.
I have been actively involved in various open-source projects related to LLMs and have gained substantial hands-on experience.
Open-Source Projects
LLM Cookbook
project webpage
Principal and main contributor
An introductory tutorial on Large Language Models (LLMs) for developers,
based on the course content from Professor Andrew Ng's series on LLMs.
This tutorial translates the original course content into Chinese,
reproduces its example code, implements Chinese prompts, and explores multilingual contextual prompts for large models.
It aims to guide Chinese developers on how to rapidly and efficiently develop powerful applications based on LLMs.
11.4k stars, 1.4k forks
LLM Universe
project webpage
Principal and main contributor
A concise and comprehensive tutorial on LLM development
that aims to provide a focused introduction to LLM development through a half-day course.
Starting from a personal knowledge assistant project,
it breaks down the general process and steps of LLM development in a clear and easy-to-understand manner.
Additionally, we have planned and encapsulated the project in a clear and comprehensive manner,
integrating different LLM APIs behind a unified interface (see the sketch below).
4.4k stars, 543 forks, featured twice on GitHub Trending
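To make the unified integration concrete, here is a minimal sketch of the idea, assuming the OpenAI Python SDK (v1+) as one backend; the class and function names are illustrative placeholders, not the project's actual code.

from abc import ABC, abstractmethod

class BaseLLM(ABC):
    # Common interface so application code never depends on one vendor's SDK.
    @abstractmethod
    def chat(self, prompt: str, temperature: float = 0.1) -> str: ...

class OpenAILLM(BaseLLM):
    def chat(self, prompt: str, temperature: float = 0.1) -> str:
        from openai import OpenAI  # imported lazily, per backend
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,
        )
        return resp.choices[0].message.content

def get_llm(backend: str) -> BaseLLM:
    # Factory: adding a provider means adding one entry, not changing callers.
    return {"openai": OpenAILLM}[backend]()

Swapping providers then only requires registering another BaseLLM subclass in the factory.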
Self LLM
project webpage
Co-principal and main contributor
A Chinese tutorial on open-source LLMs for beginners in China,
providing a full-process guide for various open-source LLMs,
including environment configuration, local deployment, and efficient fine-tuning.
This project aims to simplify the deployment, use, and application of open-source LLMs,
enabling more students and researchers to make better use of them.
7.9k stars, 946 forks, featured three times on GitHub Trending, showcased at Google I/O 2024
Tiny Universe
project webpage
Co-principal and main contributor
A Chinese tutorial on 'hand-crafting' LLMs,
starting from first principles and oriented towards a 'white-box' approach,
covering the entire LLM pipeline.
This project aims to assist readers with a foundation in traditional deep learning
to build a clear and usable LLM system from the ground up,
'purely by hand'.
This includes the large model itself, the RAG framework,
the Agent system, and the large model evaluation system.
1k stars, 96 forks
Thorough Pytorch
project webpage
Main contributor, responsible for NLP and Transformer-related content.
An open-source Chinese tutorial on PyTorch
that comprehensively covers the usage of PyTorch from theory to practice,
emphasizing practicality, readability, and extensibility.
It has received support from the Computer Department of People's Posts and Telecommunications Press
and the MMYOLO open-source algorithm library from Shanghai Pudong AI Laboratory.
2.4k stars, 408 forks
InternLM-Tutorial
project webpage
Main contributor, responsible for the RAG application based on InternLM.
A full-chain course on large models from the Shanghai Artificial Intelligence Laboratory,
covering an overview of large language models,
introductory examples in the field,
knowledge-base construction, fine-tuning, deployment, and evaluation of large models,
helping developers easily handle every aspect of large-model research,
development, and application, from the simple to the complex.
1.3k stars, 905 forks
Huanhuan-Chat
project webpage
Co-principal and main contributor
Huanhuan-Chat is an LLM-based chatbot that mimics the tone and language style of Zhenhuan,
a character from the TV series "Empresses in the Palace".
It is fine-tuned from ChatGLM2 using LoRA (see the sketch at the end of this entry).
The currently released version 2.0 builds a personalized AI character from novels and scripts,
offering a complete fine-tuning pipeline.
By running the full project pipeline on any novel,
users can create a personalized AI that aligns with their preferred novel or script
and matches the character's personality.
477 stars, 43 forks
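As a taste of how such a fine-tune is wired up, here is a minimal sketch using Hugging Face transformers with the peft library; the hyperparameters are common illustrative defaults, not necessarily the project's exact values.

from transformers import AutoModel
from peft import LoraConfig, TaskType, get_peft_model

# Load the ChatGLM2 base model (its repo requires trust_remote_code).
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)

# LoRA trains only small low-rank matrices injected into the attention
# projection, so the fine-tune fits on a single consumer GPU.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["query_key_value"],  # ChatGLM2's fused QKV projection
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights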
D2l-ai Solutions Manual
project webpage
Main contributor, responsible for Computational Performance and Optimization Algorithms.
"Dive into Deep Learning" by Li Mu is a classic book for beginners in deep learning.
This project provides solutions to the exercises in "Dive into Deep Learning",
including theoretical derivations and code implementations.
It serves as a workbook for the exercises in the book,
helping beginners to quickly understand the content.
333 stars, 63 forks
Tianji
project webpage
Main contributor, responsible for RAG.
Tianji is a free, non-commercial artificial intelligence system.
You can use it for tasks involving worldly wisdom,
such as the art of conversation,
to enhance your emotional intelligence and core competitiveness.
We firmly believe that worldly wisdom is a core competency of future AI,
and we invite you to join us in witnessing the advent of artificial general intelligence.
346 stars, 30 forks
Internship Experience
LLM Algorithm Intern - ByteDance
Worked on TikTok at ByteDance,
responsible for exploring and implementing LLM applications in data privacy and security.
LLM Algorithm Intern - Baidu
Worked in Baidu Search,
responsible for exploring and implementing LLM-based text generation.
Responsible for the image-editing assistant project based on a Language User Interface (LUI);
the project received positive results in low-traffic testing.
- Designed an LLM-based natural-language interaction scheme
on top of more than twenty image-editing operators
- For bad cases from the first phase,
designed an MLLM-based solution using a fine-tuned Qwen-VL, achieving a 34-point increase in usability.
Responsible for the handwritten-newspaper AIGC generation project,
which has launched on the Baidu Search vertical homepage.
- Designed a three-stage solution: user-query intent recognition + RAG retrieval + copy generation,
achieving a 32-point increase in the usability of the generated copy and a 16-point increase in satisfaction rate (see the sketch after this list)
- Used intent recognition to recover the real theme behind user queries and image-search titles,
combined the theme with user requirements to improve relevance,
and used RAG retrieval with credibility screening to mitigate hallucination
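A minimal sketch of the three-stage flow is below; every name here (llm, retriever, the credibility threshold) is an illustrative placeholder rather than Baidu's internal code.

def generate_copy(query: str, llm, retriever) -> str:
    # Stage 1: intent recognition -- recover the real theme behind the query.
    theme = llm(f"Extract the underlying theme of this request: {query}")

    # Stage 2: RAG retrieval plus credibility screening to ground the output.
    docs = retriever.search(theme, top_k=10)
    trusted = [d for d in docs if d.credibility > 0.8]  # drop dubious sources

    # Stage 3: copy generation conditioned on the theme and trusted evidence.
    context = "\n".join(d.text for d in trusted)
    return llm(f"Theme: {theme}\nEvidence:\n{context}\nWrite the copy:")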
Responsible for continued pre-training of the group's self-developed LLM,
achieving performance improvements on multiple business tasks.
- Researched various LLM architectures
and experimentally evaluated migrating a dense LLM to an MoE architecture,
pruning the 7B model to 3B and continuing pre-training.
- Researched and implemented various length-extrapolation schemes;
through experimental comparison,
selected expanded-length pre-training plus the NTK scheme to extend the model's context from 2K to 16K (see the sketch below).
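For reference, a minimal sketch of the NTK idea under standard RoPE: the rotary base is raised by the context-extension factor so that previously seen positions stay in-distribution while longer sequences remain resolvable. The function below is a generic illustration, not the production implementation.

import torch

def ntk_rope_inv_freq(dim: int, base: float = 10000.0, scale: float = 8.0):
    # NTK-aware scaling: grow the base by scale ** (dim / (dim - 2)) so the
    # lowest frequencies stretch to cover an 8x longer context (2K -> 16K).
    ntk_base = base * scale ** (dim / (dim - 2))
    return 1.0 / (ntk_base ** (torch.arange(0, dim, 2).float() / dim))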
LLM Algorithm Intern - Ytell
Ytell is a tech-innovation company centered on AI and LLM technology,
whose core members come from major internet and AI companies such as Baidu, Didi, Alibaba, and Fourth Paradigm.
Responsible for exploring LLM-related solutions, implementing applications, and iterative optimization:
- Domain-specific fine-tuning of open-source LLMs,
involving workflows for multimodal data processing, efficient instruction tuning, and constructing an evaluation metric system for domain-specific LLMs.
- LLM-based solutions to business problems and their implementation,
including automatic order extraction, high-quality manuscript generation, and intelligent assistance for user operations.
- Development of a health-related question-answering assistant based on the Agent mechanism,
primarily responsible for framework design, data construction, model optimization, performance testing, and evaluation.
Algorithm Intern - Dr. Peng
Dr. Peng is a publicly listed group focused on the communication and internet industry,
possessing a nationwide comprehensive business operation license.
Responsible for the design and implementation of quantitative algorithms, and the construction of financial data analysis platforms:
- Implemented quantitative strategies in Python,
translating formal strategy descriptions into programs.
- Developed a module for quantitative stock trading,
incorporating MACD-based price segmentation and quantitative definitions of price trends (see the sketch after this list).
- Built a financial-data visualization platform,
designing the local database schema, writing remote data-migration functions, and creating data-API documentation.
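As an illustration of MACD-based segmentation, here is a minimal pandas sketch using the conventional 12/26/9 parameters; the production module's exact rules may differ.

import pandas as pd

def macd_segments(close: pd.Series, fast=12, slow=26, signal=9) -> pd.Series:
    # MACD line = fast EMA minus slow EMA; histogram = MACD minus its EMA.
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    macd = ema_fast - ema_slow
    hist = macd - macd.ewm(span=signal, adjust=False).mean()
    # Label each bar by histogram sign; a sign flip starts a new segment.
    trend = hist.gt(0)
    return (trend != trend.shift()).cumsum()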
Data Analysis Intern - Erawork
Erawork is a technology-driven shared workspace and office operation platform utilizing AI and big data.
Responsible for user data analysis:
used various web-scraping techniques to obtain user data from the Geek platform,
then built user profiles, filtered out inactive users, and visualized relationship networks.
Publications
A Neural-Ensemble Learning Method for Migration Prediction Based on Culinary Taste Data in China
Accepted by the Journal of Nonlinear and Convex Analysis (SCI-indexed)
Authors: Zou Yuheng, Huang Yicheng, Yan Chengxin, La Lei
Abstract: Population migration is an important problem related to national
economic and social development, and migration data can be applied to research
in many fields. However, because population loss often puts huge pressure on
local governments, migration data are not disclosed in many cases. Most
existing migration prediction models are based on non-open-source data, so when
other researchers want to apply them to their own prediction tasks, they often
find that they cannot obtain the same data sources. This paper proposes a
Neural-Ensemble learning method for migration prediction based on culinary
taste data in China. The method can be divided into three parts. First, we
classify restaurants into cuisines, calculate the taste of each cuisine from
recipe data, and thereby obtain the taste matrix of China; for this step, we
propose a restaurant-classification method called Neural-Ensemble
Classification, which combines BERT with dictionary matching. Then we construct
a Markov chain over historical migration data to predict the migration vector
contemporaneous with the restaurant data. Finally, we build a LightGBM-based
prediction model that takes the taste matrix as input and the migration vector
as output. Compared with existing models, this model can use open data to
achieve prediction accuracy no lower than that of existing models.
A Comparative Study of China and the United States' Digital Economy Policies Based on a Cross-lingual Model
Accepted by SMP 2023 (Chinese core conference)
Authors: Zou Yuheng, Lu Dongyuan
Abstract: In the context of escalating Sino-American strategic competition, a comparative study of Chinese and US
digital economy policies bears significant strategic value. Traditional methods of policy comparison are limited by cost
and cannot solve this problem well. This paper focuses on the contrast between digital economy policies in China and
the USA, proposing a resolution framework based on a cross-lingual model. This framework enables the comparison of
the two countries' digital economy policy environments on massive policy data and further suggests policy
recommendations for the development of the digital economy. The paper offers a solution for comparing policy
environments across different political systems, providing a comprehensive and objective portrayal of the disparities
in digital economy policy environments. Concurrently, it also brings a fresh perspective to policy comparison research.
Why Guests Write Negative Comments for Budget Hotels: Research Based on Aspect Extraction
Accepted by the Journal of Nonlinear and Convex Analysis (SCI-indexed)
Authors: La Juanjuan, La Lei, Zou Yuheng
Abstract: Negative comments reflect customer dissatisfaction. Identifying this
dissatisfaction is of high significance to improve the hotel industry. At present,
the mining of negative comments mainly focuses on luxury hotels. In fact, budget
hotels occupy a large market share. This paper proposes a method for online
comment extraction in the hotel field. The method realizes weakly-supervised
learning based on BiLSTM and CRF and can further improve the extraction
performance by using a labeled open dataset in the hotel domain. Real-world
application of the proposed method reveals that budget-hotel customers'
dissatisfaction mainly concerns price, noise, service, and the cleanliness of facilities.
Experimental results also show that the proposed method has a higher F1 score
in the supervised and weakly-supervised situations than the control methods. It
is a powerful tool for managers and researchers in the hospitality industry and
can support many downstream applications.
Competitions & Awards
Top 3 - Competition of Text Classification and Keyword Extraction Based on Paper Abstracts
Organized by iFLYTEK
For the tasks of text classification and keyword extraction,
we proposed a solution combining a fine-tuned 6B LLM with GPT-4 supervision, achieving Top 3 in the preliminary round and Top 1 in the long-term competition.
Top 3 - Competition of Job-Seeker Position Matching
Organized by iFLYTEK
For the task of matching resumes to job positions,
we proposed two solutions: re-pretraining on desensitized data + full-pipeline BERT fine-tuning + a long-text strategy,
and feature engineering + AutoGluon. Both approaches achieved Top 3.
Top 12 - Pu Yuan LLM Competition of InternLM
Organized by Shanghai AI Lab
Building on the open-source project Huanhuan-Chat
and combining it with InternLM,
we developed a complete, replicable, and fully automatic system
for building role-play LLMs.
The system enables the construction of personalized LLMs for
any novel and any character.
Dozens of excellent role-play projects in the InternLM training camp
have been derived from our project.
Top 6 - National Competition for Innovative Applications of Large Language Models
Organized by dataology
Developed an LLM-based research writing tool,
implementing functions such as paper polishing, automatic abstract generation, and citation creation.
We are also working on fine-tuning our own domain-specific LLM.
Top 50 - 'Spark Cup' Cognitive LLM Scene Innovation Competition
Organized by iFLYTEK
Building on the open-source project Huanhuan-Chat
and combining it with iFlytek's Spark LLM,
we built a personalized AI system that is highly efficient, serviceable, and suitable for commercial use.
We proposed a solution that combines general LLMs with locally fine-tuned LLMs,
overcoming the service limitations of fine-tuned open-source LLMs.
© Logan Zou | Last updated: Feb. 2024