Logan Zou

I am an undergraduate student at the University of International Business and Economics, School of Information Technology and Management, majoring in Data Science and Big Data Technology. I am a constant learner and explorer, enthusiastic about participating in open-source projects related to NLP and LLMs. My open-source and academic interests are shown below. If you are interested in my experience or have any questions to discuss, do not hesitate to contact me.

Email  /  Github

Research Interests

My research interests lie in natural language processing and large language models. I am currently a master's student at the University of International Business and Economics, supervised by Professor Dongyuan Lu. My research focuses on large language models, including LLM-based agents, LLM fine-tuning, and comparative analysis of industry economic texts based on LLMs.

Large language models are my prospective research direction. I have a strong interest in fine-tuning LLMs for specific domains, prompt engineering, dialogue strategies, personalized evaluation, application development, and other aspects of LLMs. I have been actively involved in various LLM-related open-source projects and have gained some experience.

Open-Source Experience

LLM Cookbook project webpage
  • Principal and main contributor
  • An introductory tutorial on Large Language Models (LLMs) for developers, based on the course content from Professor Andrew Ng's series on LLMs. This tutorial translates the original course content into Chinese, reproduces its example code, implements Chinese prompts, and explores multilingual contextual prompts for large models. It aims to guide Chinese developers on how to rapidly and efficiently develop powerful applications based on LLMs.
  • 11.4k stars, 1.4k forks

LLM Universe project webpage
  • Principal and main contributor
  • A concise and comprehensive tutorial on LLM development that aims to provide a focused introduction through a half-day course. Starting from a personal knowledge assistant project, the tutorial breaks down the general process and steps of LLM development in a clear and easy-to-understand manner. We have also planned and encapsulated the project clearly and comprehensively, unifying the integration of different LLM APIs into the project.
  • 4.4k stars, 543 forks, appeared 2 times on GitHub Trending

Self LLM project webpage
  • Co-principal and main contributor
  • A Chinese tutorial on open-source LLMs for beginners in China, providing a full-process guide for various open-source LLMs, including environment configuration, local deployment, efficient fine-tuning, and so on. This project aims to simplify the deployment, use, and application of open-source LLMs, allowing more ordinary students and researchers to make better use of them.
  • 7.9k stars, 946 forks, appeared 3 times on GitHub Trending, featured at Google I/O 2024

Tiny Universe project webpage
  • Co-principal and main contributor
  • A Chinese tutorial on 'handcrafted' LLMs that starts from principles, takes a 'white box' approach, and covers the entire LLM process. This project aims to help readers with a foundation in traditional deep learning build a clear and usable LLM system from the ground up, 'purely by hand'. This includes the large model itself, the RAG framework, the Agent system, and the large model evaluation system.
  • 1k stars, 96 forks

Thorough Pytorch project webpage
  • Main contributor, responsible for NLP and Transformer-related content.
  • An open-source Chinese tutorial on PyTorch that comprehensively covers its usage from theory to practice, emphasizing practicality, readability, and extensibility. It has received support from the Computer Department of People's Posts and Telecommunications Press and the MMYOLO open-source algorithm library from Shanghai Pudong AI Laboratory.
  • 2.4k stars, 408 forks

InternLM-Tutorial project webpage
  • Main contributor, responsible for the RAG application based on InternLM.
  • A full-chain course on large models from the Shanghai Artificial Intelligence Laboratory, covering an overview of large language models, introductory examples in the field, building a large model knowledge base, and the fine-tuning, deployment, and evaluation of large models. It helps developers handle all aspects of large model research, development, and application, from the simple to the complex.
  • 1.3k stars, 905 forks

Huanhuan-Chat project webpage
  • One of the principals and a main contributor
  • Huanhuan-Chat is an LLM-based chatbot that mimics the tone and language style of Zhenhuan, a character from the TV series "Empresses in the Palace". It is fine-tuned from ChatGLM2 using LoRA. The currently released version 2.0 creates a personalized AI model based on novels and scripts, offering a complete process for fine-tuning AI models. By running the entire project process and providing any novel, users can create a highly intelligent personalized AI that aligns with their preferred novel or script and matches the character's personality.
  • 477 stars, 43 forks

D2l-ai Solutions Manual project webpage
  • Main contributor, responsible for Computational Performance and Optimization Algorithms.
  • "Dive into Deep Learning" by Li Mu is a classic book for beginners in deep learning. This project provides solutions to the exercises in "Dive into Deep Learning", including theoretical derivations and code implementations. It serves as a workbook for the exercises in the book, helping beginners to quickly understand the content.
  • 333 stars, 63 forks

Tianji project webpage
  • Main contributor, responsible for RAG.
  • Tianji is a free, non-commercial artificial intelligence system. You can use it for tasks involving worldly wisdom, such as the "art of conversation", to enhance your emotional intelligence and core competitiveness. We firmly believe that worldly wisdom is the future core competency of AI. Let us join hands to witness the advent of general artificial intelligence.
  • 346 stars, 30 forks

Work Experience

LLM Algorithm Intern - Bytedance
  • Worked at Bytedance-Tiktok, responsible for the exploration and implementation of LLMs for data privacy and security.

LLM Algorithm Intern - Baidu
  • Worked at Baidu Search, responsible for the exploration and implementation of LLMs for text generation.
  • Responsible for the image editing assistant project based on Language User Interaction (LUI); the project has shown positive results in low-traffic testing.
    • Designed a natural language interaction scheme based on large models, built on more than twenty image editing operators.
    • For the bad cases from the first phase, designed an MLLM-based solution using a fine-tuned Qwen-VL, achieving a 34-point increase in usability.
  • Responsible for the handwritten newspaper AIGC generation project, which has been launched on the Baidu search vertical homepage.
    • Designed a three-stage solution: user query intent recognition + RAG retrieval + copywriting generation, achieving a 32-point increase in the usability of the generated copy and a 16-point increase in satisfaction rate.
    • Used intent recognition to identify the real theme of user queries and image search titles, used the real theme and user requirements to improve relevance, and used RAG retrieval + credibility screening to address the hallucination problem.
  • Responsible for continued pre-training of the group's self-developed LLM, improving the model's performance on multiple business tasks.
    • Researched various LLM architectures and experimentally evaluated optimization plans such as migrating a dense LLM to an MoE architecture, pruning a 7B model to 3B, and continuing pre-training.
    • Researched and implemented various length extrapolation schemes and, through experimental comparison, selected the expanded-length pre-training + NTK scheme to extend the model's context from 2K to 16K.

LLM Algorithm Intern - Ytell
  • Ytell is a tech innovation company centered on AI and LLM technology, whose core members come from major internet and AI tech companies such as Baidu, Didi, Alibaba, and Fourth Paradigm.
  • Responsible for exploring LLM-related solutions, application implementation, and iterative optimization:
    • Domain-specific fine-tuning of open-source LLMs, involving workflows for multimodal data processing, efficient instruction tuning, and constructing an evaluation metric system for domain-specific LLMs.
    • Business problem solutions and implementation based on LLM, including automatic order extraction, high-quality manuscript generation, intelligent assistance for user operations, etc.
    • Development of a health-related question-answering assistant based on the Agent mechanism, primarily responsible for framework ideas, data construction, model optimization, performance testing, and evaluation.

Algorithm Intern - Dr. Peng
  • Dr. Peng is a publicly listed group focused on the communication and internet industry, possessing a nationwide comprehensive business operation license.
  • Responsible for the design and implementation of quantitative algorithms, and the construction of financial data analysis platforms:
    • Implementing quantitative strategies in Python, enabling the transformation from formal-language descriptions to programs.
    • Developing a module for quantitative stock trading, incorporating price segmentation based on MACD, quantitative definitions of price trends, and so on.
    • Establishing a financial data visualization platform, involving the design of the local database structure, writing functions for remote data migration, and creating data API documentation.

Data Analysis Intern - Erawork
  • Erawork is a technology-driven shared workspace and office operation platform utilizing AI and big data.
  • Responsible for user data analysis, including using various web scraping techniques to obtain user data from the Geek platform, and leveraging that data to create user profiles, filter out inactive users, and visualize relationship networks.

Research Experience

A Neural-Ensemble Learning Method for Migration Prediction Based on Culinary Taste Data in China
  • Accepted by the SCI journal Journal of Nonlinear and Convex Analysis
  • Authors: Zou Yuheng, Huang Yicheng, Yan Chengxin, La Lei
  • Abstract: Population migration is an important problem related to national economic and social development, and migration data can be applied to research in many fields. But population loss often puts huge pressure on local governments, so migration data are not disclosed in many cases. Most of the existing migration prediction models are based on non-open-source data, so when other researchers want to apply existing population migration prediction models to their own prediction tasks, they often find that they cannot obtain the same data source. This paper proposes a Neural-Ensemble learning method for migration prediction based on taste data in China. The whole method can be divided into three parts. First, we classify the restaurants into different cuisines, calculate the taste of each cuisine based on the recipe data, and then obtain the taste matrix of China. In this step, we propose a method for restaurant classification called Neural-Ensemble Classification, which combines BERT and dictionary matching. Then we construct a Markov Chain to predict the vector of migration at the same time with restaurant data based on the historical migration data. Finally, we build a prediction model based on LightGBM, which uses the taste matrix as input and the vector of migration as output. Compared with existing models, this model can use open data to achieve prediction accuracy no lower than that of existing models.

A Comparative Study of China And the United States' Digital Economy Policies Based on Cross-lingual Mode
  • Accepted by Chinese Core Conference SMP 2023
  • Authors: Zou Yuheng, Lu Dongyuan
  • Abstract: In the context of escalating Sino-American strategic competition, a comparative study of Chinese and US digital economy policies bears significant strategic value. Traditional methods of policy comparison are limited by cost and cannot solve this problem well. This paper focuses on the contrast between digital economy policies in China and the USA, proposing a resolution framework based on a cross-language model. This framework enables the comparison of digital economy policy environments in both countries based on massive policy data and further suggests policy recommendations for the development of the digital economy. This paper offers a solution for comparing policy environments across different political systems, providing a comprehensive and objective portrayal of the disparities in digital economy policy environments. Concurrently, it also brings a fresh perspective to policy comparison research.

Why Guests Write Negative Comments for Budget Hotels: Research Based on Aspect Extraction
  • Accepted by the SCI journal Journal of Nonlinear and Convex Analysis
  • Authors: La Juanjuan, La Lei, Zou Yuheng
  • Abstract: Negative comments reflect customer dissatisfaction. Identifying this dissatisfaction is of high significance to improve the hotel industry. At present, the mining of negative comments mainly focuses on luxury hotels. In fact, budget hotels occupy a large market share. This paper proposes a method for online comment extraction in the hotel field. The method realizes weakly-supervised learning based on BiLSTM and CRF and can further improve the extraction performance by using a labeled open dataset in the hotel domain. The real-world application of the proposed method reveals the dissatisfaction of economy hotel customers mainly focuses on the price, noise, service, and cleanliness of facilities. Experimental results also show that the proposed method has a higher F1 score in the supervised and weakly-supervised situations than the control methods. It is a powerful tool for managers and researchers in the hospitality industry and can support many downstream applications.

Competition Experience

Top 3 - Competition of Text Classification and Keyword Extraction Based on Paper Abstracts
  • Organized by iFLYTEK
  • For the tasks of text classification and keyword extraction, we proposed a solution combining a fine-tuned 6B LLM with GPT-4 supervision, achieving Top 3 in the preliminary round and Top 1 in the long-term competition.

Top 3 - Competition of Job-Seeker Position Matching
  • Organized by iFLYTEK
  • For the task of matching resumes and job positions, we proposed two solutions: (1) re-pretraining on desensitized data + full-process fine-tuning of BERT + a long-text strategy, and (2) feature engineering + AutoGluon. Both approaches achieved Top 3.

Top 12 - Pu Yuan LLM Competition of InternLM
  • Organized by Shanghai AI Lab
  • Based on the open-source project Chat-Zhenhuan, combined with InternLM, a complete, replicable, and fully automatic system for building a Role-Play LLM (Large Language Model) has been developed. This system enables the construction of personalized LLMs for any novel and any character. On the foundation of our project, dozens of excellent Role-play projects have been derived from the InternLM training camp.

Top 6 - National Competition for Innovative Applications of Large Language Models
  • Organized by dataology
  • Developed a research polishing tool based on LLMs, implementing functions such as paper polishing, automatic abstract generation, and citation creation. We are also trying to fine-tune a domain-specific LLM of our own.

Top 50 - 'Spark Cup' Cognitive LLM Scene Innovation Competition
  • Organized by iFLYTEK
  • Based on the open-source project Chat-Zhenhuan, combined with iFlytek's LLM Spark, we have built a personalized AI system that is highly efficient, serviceable, and suitable for commercial use. We have proposed a solution that combines general LLMs with locally fine-tuned LLMs, overcoming the service limitations of fine-tuning open-source LLMs.


© Logan Zou | Last updated: Feb. 2024