LLM Projects

Dynamic Activation Function for Efficient Inference of LLMs

This project focuses on developing a dynamic activation function for efficient LLM inference, providing hands-on experience in optimizing language models. Beyond activation functions, the research will also cover methods such as quantization to improve model efficiency. The proposed activation function is a dynamic linear combination of ReLU and GELU: during fine-tuning, the weight assigned to ReLU gradually increases relative to GELU until the function converges to pure ReLU.
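The blending idea described above can be illustrated with a minimal sketch in plain Python. The linear schedule, the mixing weight `alpha`, and the function names `relu_weight` and `dynamic_activation` are assumptions made for illustration only; the project description does not specify how the weight evolves during fine-tuning.

```python
import math


def relu(x: float) -> float:
    """Standard ReLU: max(0, x)."""
    return max(0.0, x)


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu_weight(step: int, total_steps: int) -> float:
    """Hypothetical linear schedule for the ReLU weight (an assumption).

    Returns 0.0 at the start of fine-tuning (pure GELU) and 1.0 once
    `total_steps` is reached (pure ReLU), clamped to [0, 1].
    """
    if total_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / total_steps))


def dynamic_activation(x: float, alpha: float) -> float:
    """Linear combination: alpha * ReLU(x) + (1 - alpha) * GELU(x)."""
    return alpha * relu(x) + (1.0 - alpha) * gelu(x)
```

With `alpha = 0.0` the function is exactly GELU, and with `alpha = 1.0` it is exactly ReLU, so a model fine-tuned under such a schedule could in principle be deployed with the cheaper ReLU at inference time. Other schedules (for example, a learnable weight) would fit the same description equally well.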

Pre-Activations Research for Hysteresis Activation Function

This project offers an opportunity to take part in research on HeLU, a novel hysteresis-based activation function that could serve as an efficient alternative to GELU. Through hands-on research, students will collect real-time statistics during neural network training, focusing on key elements such as pre-activation and gradient distributions. The project will investigate how these statistics relate to the optimal functioning of HeLU, with the goal of refining its implementation for both language and vision tasks. By the end of the project, students will have gained valuable experience in evaluating activation functions and optimizing them based on real-time training data. The project builds on previous findings and aims to extend them, with the potential for conference submission upon successful completion.
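As a rough illustration of collecting such statistics in real time, the sketch below accumulates the mean, variance, and fraction of negative values of a stream of pre-activations (or gradients) using Welford's online algorithm, so that no batch needs to be stored. The `RunningStats` class and its fields are hypothetical names, not part of the project; in practice the values would come from the training framework, for example through forward and backward hooks.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RunningStats:
    """Streaming statistics over scalar values (hypothetical helper)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0        # running sum of squared deviations from the mean
    negatives: int = 0     # number of values below zero

    def update(self, values: Iterable[float]) -> None:
        """Fold a batch of values into the running statistics (Welford)."""
        for v in values:
            self.count += 1
            delta = v - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (v - self.mean)
            if v < 0:
                self.negatives += 1

    @property
    def variance(self) -> float:
        """Population variance of all values seen so far."""
        return self.m2 / self.count if self.count else 0.0

    @property
    def negative_fraction(self) -> float:
        """Share of values that were negative."""
        return self.negatives / self.count if self.count else 0.0
```

One instance per layer could be updated at every training step, for example `stats.update(batch_values)`, and logged periodically to track how the distributions shift over training.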

Various Projects in LLM Security

  • Supervisor(s): Amit Levi
  • Requirements: Basic familiarity with LLM frameworks (e.g., Hugging Face) and general deep learning principles
  • LLM_Projects_proposals
  • Status: available

Large Language Models (LLMs) are at the forefront of AI research, enabling advanced capabilities in text generation, understanding, and reasoning. However, they are also susceptible to adversarial attacks and misalignment issues. These projects will allow you to explore cutting-edge methodologies, from building robust defenses and new optimization strategies to crafting innovative applications such as educational platforms and deepfake detection. Each project is designed to enrich both your research acumen and your practical engineering skills.