Logan Zou
I completed my undergraduate studies at the University of International Business and Economics, School of Information Technology and Management, majoring in Data Science and Big Data Technology.
I am a constant learner and explorer, enthusiastic about participating in open-source projects related to NLP and LLMs.
My open-source and academic interests are shown below.
If you are interested in my experience or have any questions to discuss, do not hesitate to contact me.
Email / GitHub
Research Interests
My research interests lie in natural language processing and large language models.
I am currently a master's student at the University of International Business and Economics,
supervised by Professor Dongyuan Lu.
My research focuses on large language models, including LLM-based agents, LLM fine-tuning, and LLM-based comparative analysis of industry and economic texts.
Within this direction, I have a strong interest in fine-tuning LLMs for specific domains,
prompt engineering, dialogue strategies, personalized evaluation, application development, and other aspects of LLMs.
I have been actively involved in various open-source projects related to LLMs and have gained substantial hands-on experience.
Open-Source Projects
LLM Cookbook
project webpage
Principal and main contributor
An introductory tutorial on Large Language Models (LLMs) for developers,
based on the course content from Professor Andrew Ng's series on LLMs.
This tutorial translates the original course content into Chinese,
reproduces its example code, implements Chinese prompts, and explores multilingual contextual prompts for large models.
It aims to guide Chinese developers on how to rapidly and efficiently develop powerful applications based on LLMs.
11.4k stars, 1.4k forks
LLM Universe
project webpage
Principal and main contributor
A concise and comprehensive tutorial on LLM development
that aims to provide a focused introduction to LLM development through a half-day course.
Starting from a personal knowledge assistant project,
it breaks down the general process and steps of LLM development in a clear and easy-to-understand manner.
Additionally, we have planned and encapsulated the project in a clear and comprehensive manner,
integrating different LLM APIs behind a unified interface (see the sketch below).
4.4k stars, 543 forks, featured twice on GitHub Trending
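To make the unified integration concrete, here is a minimal sketch of the idea, assuming the OpenAI Python SDK (v1+) as one backend; the class and function names are illustrative placeholders, not the project's actual code.

from abc import ABC, abstractmethod

class BaseLLM(ABC):
    # Common interface so application code never depends on one vendor's SDK.
    @abstractmethod
    def chat(self, prompt: str, temperature: float = 0.1) -> str: ...

class OpenAILLM(BaseLLM):
    def chat(self, prompt: str, temperature: float = 0.1) -> str:
        from openai import OpenAI  # imported lazily, per backend
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,
        )
        return resp.choices[0].message.content

def get_llm(backend: str) -> BaseLLM:
    # Factory: adding a provider means adding one entry, not changing callers.
    return {"openai": OpenAILLM}[backend]()

Swapping providers then only requires registering another BaseLLM subclass in the factory.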
Self LLM
project webpage
Co-principal and main contributor
A Chinese tutorial on open-source LLMs for beginners in China,
providing a full-process guide for various open-source LLMs,
including environment configuration, local deployment, and efficient fine-tuning.
This project aims to simplify the deployment, use, and application of open-source LLMs,
enabling more students and researchers to make better use of them.
7.9k stars, 946 forks, featured three times on GitHub Trending, showcased at Google I/O 2024
Tiny Universe
project webpage
Co-principal and main contributor
A Chinese tutorial on 'hand-crafting' LLMs,
starting from first principles and oriented towards a 'white-box' approach,
covering the entire LLM pipeline.
This project aims to assist readers with a foundation in traditional deep learning
to build a clear and usable LLM system from the ground up,
'purely by hand'.
This includes the large model itself, the RAG framework,
the Agent system, and the large model evaluation system.
1k stars, 96 forks
Thorough Pytorch
project webpage
Main contributor, responsible for NLP and Transformer-related content.
An open-source Chinese tutorial on PyTorch
that comprehensively covers the usage of PyTorch from theory to practice,
emphasizing practicality, readability, and extensibility.
It has received support from the Computer Department of People's Posts and Telecommunications Press
and the MMYOLO open-source algorithm library from Shanghai Pudong AI Laboratory.
2.4k stars, 408 forks
InternLM-Tutorial
project webpage
Main contributor, responsible for the RAG application based on InternLM.
A full-chain course on large models from the Shanghai Artificial Intelligence Laboratory,
covering an overview of large language models,
introductory examples in the field,
knowledge-base construction, fine-tuning, deployment, and evaluation of large models,
helping developers easily handle every aspect of large-model research,
development, and application, from the simple to the complex.
1.3k stars, 905 forks
Huanhuan-Chat
project webpage
Co-principal and main contributor
Huanhuan-Chat is an LLM-based chatbot that mimics the tone and language style of Zhenhuan,
a character from the TV series "Empresses in the Palace".
It is fine-tuned from ChatGLM2 using LoRA (see the sketch at the end of this entry).
The currently released version 2.0 builds a personalized AI character from novels and scripts,
offering a complete fine-tuning pipeline.
By running the full project pipeline on any novel,
users can create a personalized AI that aligns with their preferred novel or script
and matches the character's personality.
477 stars, 43 forks
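As a taste of how such a fine-tune is wired up, here is a minimal sketch using Hugging Face transformers with the peft library; the hyperparameters are common illustrative defaults, not necessarily the project's exact values.

from transformers import AutoModel
from peft import LoraConfig, TaskType, get_peft_model

# Load the ChatGLM2 base model (its repo requires trust_remote_code).
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)

# LoRA trains only small low-rank matrices injected into the attention
# projection, so the fine-tune fits on a single consumer GPU.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["query_key_value"],  # ChatGLM2's fused QKV projection
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights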
D2l-ai Solutions Manual
project webpage
Main contributor, responsible for Computational Performance and Optimization Algorithms.
"Dive into Deep Learning" by Li Mu is a classic book for beginners in deep learning.
This project provides solutions to the exercises in "Dive into Deep Learning",
including theoretical derivations and code implementations.
It serves as a workbook for the exercises in the book,
helping beginners to quickly understand the content.
333 stars, 63 forks
Tianji
project webpage
Main contributor, responsible for RAG.
Tianji is a free, non-commercial artificial intelligence system.
You can use it for tasks involving worldly wisdom,
such as the art of conversation,
to enhance your emotional intelligence and core competitiveness.
We firmly believe that worldly wisdom is a core competency of future AI,
and we invite you to join us in witnessing the advent of artificial general intelligence.
346 stars, 30 forks
Internship Experience
LLM Algorithm Intern - ByteDance
Worked on TikTok at ByteDance,
responsible for exploring and implementing LLM applications in data privacy and security.
LLM Algorithm Intern - Baidu
Worked in Baidu Search,
responsible for exploring and implementing LLM-based text generation.
Responsible for the image-editing assistant project based on a Language User Interface (LUI);
the project received positive results in low-traffic testing.
- Designed an LLM-based natural-language interaction scheme
on top of more than twenty image-editing operators
- For bad cases from the first phase,
designed an MLLM-based solution using a fine-tuned Qwen-VL, achieving a 34-point increase in usability.
Responsible for the handwritten-newspaper AIGC generation project,
which has launched on the Baidu Search vertical homepage.
- Designed a three-stage solution: user-query intent recognition + RAG retrieval + copy generation,
achieving a 32-point increase in the usability of the generated copy and a 16-point increase in satisfaction rate (see the sketch after this list)
- Used intent recognition to recover the real theme behind user queries and image-search titles,
combined the theme with user requirements to improve relevance,
and used RAG retrieval with credibility screening to mitigate hallucination
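A minimal sketch of the three-stage flow is below; every name here (llm, retriever, the credibility threshold) is an illustrative placeholder rather than Baidu's internal code.

def generate_copy(query: str, llm, retriever) -> str:
    # Stage 1: intent recognition -- recover the real theme behind the query.
    theme = llm(f"Extract the underlying theme of this request: {query}")

    # Stage 2: RAG retrieval plus credibility screening to ground the output.
    docs = retriever.search(theme, top_k=10)
    trusted = [d for d in docs if d.credibility > 0.8]  # drop dubious sources

    # Stage 3: copy generation conditioned on the theme and trusted evidence.
    context = "\n".join(d.text for d in trusted)
    return llm(f"Theme: {theme}\nEvidence:\n{context}\nWrite the copy:")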
Responsible for continued pre-training of the group's self-developed LLM,
achieving performance improvements on multiple business tasks.
- Researched various LLM architectures
and experimentally evaluated migrating a dense LLM to an MoE architecture,
pruning the 7B model to 3B and continuing pre-training.
- Researched and implemented various length-extrapolation schemes;
through experimental comparison,
selected expanded-length pre-training plus the NTK scheme to extend the model's context from 2K to 16K (see the sketch below).
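For reference, a minimal sketch of the NTK idea under standard RoPE: the rotary base is raised by the context-extension factor so that previously seen positions stay in-distribution while longer sequences remain resolvable. The function below is a generic illustration, not the production implementation.

import torch

def ntk_rope_inv_freq(dim: int, base: float = 10000.0, scale: float = 8.0):
    # NTK-aware scaling: grow the base by scale ** (dim / (dim - 2)) so the
    # lowest frequencies stretch to cover an 8x longer context (2K -> 16K).
    ntk_base = base * scale ** (dim / (dim - 2))
    return 1.0 / (ntk_base ** (torch.arange(0, dim, 2).float() / dim))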
LLM Algorithm Intern - Ytell
Ytell is a tech-innovation company centered on AI and LLM technology,
whose core members come from major internet and AI companies such as Baidu, Didi, Alibaba, and Fourth Paradigm.
Responsible for exploring LLM-related solutions, implementing applications, and iterative optimization:
- Domain-specific fine-tuning of open-source LLMs,
involving workflows for multimodal data processing, efficient instruction tuning, and constructing an evaluation metric system for domain-specific LLMs.
- LLM-based solutions to business problems and their implementation,
including automatic order extraction, high-quality manuscript generation, and intelligent assistance for user operations.
- Development of a health-related question-answering assistant based on the Agent mechanism,
primarily responsible for framework design, data construction, model optimization, performance testing, and evaluation.
Algorithm Intern - Dr. Peng
Dr. Peng is a publicly listed group focused on the communication and internet industry,
possessing a nationwide comprehensive business operation license.
Responsible for the design and implementation of quantitative algorithms, and the construction of financial data analysis platforms:
- Implemented quantitative strategies in Python,
translating formal strategy descriptions into programs.
- Developed a module for quantitative stock trading,
incorporating MACD-based price segmentation and quantitative definitions of price trends (see the sketch after this list).
- Built a financial-data visualization platform,
designing the local database schema, writing remote data-migration functions, and creating data-API documentation.
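As an illustration of MACD-based segmentation, here is a minimal pandas sketch using the conventional 12/26/9 parameters; the production module's exact rules may differ.

import pandas as pd

def macd_segments(close: pd.Series, fast=12, slow=26, signal=9) -> pd.Series:
    # MACD line = fast EMA minus slow EMA; histogram = MACD minus its EMA.
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    macd = ema_fast - ema_slow
    hist = macd - macd.ewm(span=signal, adjust=False).mean()
    # Label each bar by histogram sign; a sign flip starts a new segment.
    trend = hist.gt(0)
    return (trend != trend.shift()).cumsum()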
Data Analysis Intern - Erawork
Erawork is a technology-driven shared workspace and office operation platform utilizing AI and big data.
Responsible for user data analysis:
used various web-scraping techniques to obtain user data from the Geek platform,
then built user profiles, filtered out inactive users, and visualized relationship networks.
Publications
A Neural-Ensemble Learning Method for Migration Prediction Based on Culinary Taste Data in China
Accepted by the Journal of Nonlinear and Convex Analysis (SCI-indexed)
Authors: Zou Yuheng, Huang Yicheng, Yan Chengxin, La Lei
Abstract: Population migration is an important problem related to national
economic and social development, and migration data can be applied to research
in many fields. However, because population loss often puts huge pressure on
local governments, migration data are not disclosed in many cases. Most
existing migration prediction models are based on non-open-source data, so when
other researchers want to apply them to their own prediction tasks, they often
find that they cannot obtain the same data sources. This paper proposes a
Neural-Ensemble learning method for migration prediction based on culinary
taste data in China. The method can be divided into three parts. First, we
classify restaurants into cuisines, calculate the taste of each cuisine from
recipe data, and thereby obtain the taste matrix of China; for this step, we
propose a restaurant-classification method called Neural-Ensemble
Classification, which combines BERT with dictionary matching. Then we construct
a Markov chain over historical migration data to predict the migration vector
contemporaneous with the restaurant data. Finally, we build a LightGBM-based
prediction model that takes the taste matrix as input and the migration vector
as output. Compared with existing models, this model can use open data to
achieve prediction accuracy no lower than that of existing models.
A Comparative Study of China and the United States' Digital Economy Policies Based on a Cross-lingual Model
Accepted by SMP 2023 (Chinese core conference)
Authors: Zou Yuheng, Lu Dongyuan
Abstract: In the context of escalating Sino-American strategic competition, a comparative study of Chinese and US
digital economy policies bears significant strategic value. Traditional methods of policy comparison are limited by cost
and cannot solve this problem well. This paper focuses on the contrast between digital economy policies in China and
the USA, proposing a resolution framework based on a cross-lingual model. This framework enables the comparison of
the two countries' digital economy policy environments on massive policy data and further suggests policy
recommendations for the development of the digital economy. The paper offers a solution for comparing policy
environments across different political systems, providing a comprehensive and objective portrayal of the disparities
in digital economy policy environments. Concurrently, it also brings a fresh perspective to policy comparison research.
Why Guests Write Negative Comments for Budget Hotels: Research Based on Aspect Extraction
Accepted by the Journal of Nonlinear and Convex Analysis (SCI-indexed)
Authors: La Juanjuan, La Lei, Zou Yuheng
Abstract: Negative comments reflect customer dissatisfaction. Identifying this
dissatisfaction is of high significance to improve the hotel industry. At present,
the mining of negative comments mainly focuses on luxury hotels. In fact, budget
hotels occupy a large market share. This paper proposes a method for online
comment extraction in the hotel field. The method realizes weakly-supervised
learning based on BiLSTM and CRF and can further improve the extraction
performance by using a labeled open dataset in the hotel domain. Real-world
application of the proposed method reveals that budget-hotel customers'
dissatisfaction mainly concerns price, noise, service, and the cleanliness of facilities.
Experimental results also show that the proposed method has a higher F1 score
in the supervised and weakly-supervised situations than the control methods. It
is a powerful tool for managers and researchers in the hospitality industry and
can support many downstream applications.
Competitions & Awards
Top 3 - Competition of Text Classification and Keyword Extraction Based on Paper Abstracts
Organized by iFLYTEK
For the tasks of text classification and keyword extraction,
we proposed a solution combining a fine-tuned 6B LLM with GPT-4 supervision, achieving Top 3 in the preliminary round and Top 1 in the long-term competition.
Top 3 - Competition of Job-Seeker Position Matching
Organized by iFLYTEK
For the task of matching resumes to job positions,
we proposed two solutions: re-pretraining on desensitized data + full-pipeline BERT fine-tuning + a long-text strategy,
and feature engineering + AutoGluon. Both approaches achieved Top 3.
Top 12 - Pu Yuan LLM Competition of InternLM
Organized by Shanghai AI Lab
Building on the open-source project Huanhuan-Chat
and combining it with InternLM,
we developed a complete, replicable, and fully automatic system
for building role-play LLMs.
The system enables the construction of personalized LLMs for
any novel and any character.
Dozens of excellent role-play projects in the InternLM training camp
have been derived from our project.
Top 6 - National Competition for Innovative Applications of Large Language Models
Organized by dataology
Developed an LLM-based research writing tool,
implementing functions such as paper polishing, automatic abstract generation, and citation creation.
We are also working on fine-tuning our own domain-specific LLM.
Top 50 - 'Spark Cup' Cognitive LLM Scene Innovation Competition
Organized by iFLYTEK
Building on the open-source project Huanhuan-Chat
and combining it with iFlytek's Spark LLM,
we built a personalized AI system that is highly efficient, serviceable, and suitable for commercial use.
We proposed a solution that combines general LLMs with locally fine-tuned LLMs,
overcoming the service limitations of fine-tuned open-source LLMs.
© Logan Zou | Last updated: Feb. 2024