LLM Projects

Dynamic Activation Function for Efficient Inference of LLMs

This project focuses on developing a dynamic activation function for efficient inference, providing hands-on experience in optimizing language models. In addition to activation functions, the research will cover complementary methods such as quantization to improve model efficiency. The proposed activation function is a dynamic linear combination of ReLU and GELU that evolves during fine-tuning: the weight assigned to ReLU is gradually increased relative to GELU until the function converges to pure ReLU.
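One way to picture the proposal: a mixing coefficient interpolates between GELU and ReLU and is annealed toward ReLU over the course of training. The sketch below is a minimal illustration, not the project's actual method; the module name DynamicReluGelu, the linear step-based schedule, and the tick() helper are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicReluGelu(nn.Module):
    """Linear combination of ReLU and GELU whose mixing weight shifts
    toward ReLU over fine-tuning (hypothetical linear schedule)."""

    def __init__(self, total_steps: int):
        super().__init__()
        self.total_steps = total_steps
        # Scalar step counter kept as a buffer so it moves with the model.
        self.register_buffer("step", torch.zeros((), dtype=torch.long))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # alpha ramps from 0 (pure GELU) to 1 (pure ReLU).
        alpha = torch.clamp(self.step.float() / self.total_steps, 0.0, 1.0)
        return alpha * F.relu(x) + (1.0 - alpha) * F.gelu(x)

    def tick(self) -> None:
        # Advance the schedule; call once per optimizer step.
        self.step += 1
```

Once the weight reaches 1 the function is exactly ReLU, so at inference time the module can be swapped for a plain ReLU, which is cheaper to compute than GELU.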

Pre-Activations Research for Hysteresis Activation Function

This project offers an opportunity to take part in research on HeLU, a novel activation function that could serve as an efficient alternative to GELU. Through hands-on experimentation, students will collect real-time statistics during neural network training, focusing on key quantities such as pre-activation and gradient distributions. The project will investigate how these statistics relate to the optimal behavior of HeLU, with the goal of refining its implementation for both language and vision tasks. By the end of the project, students will have gained valuable experience in evaluating activation functions and optimizing them based on real-time training data. Building on previous findings, this project aims to extend those results, with the potential for a conference submission upon successful completion.
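Collecting pre-activation statistics is typically done with forward hooks. Below is a minimal sketch, assuming PyTorch and treating the outputs of nn.Linear layers as the pre-activations of interest; the helper name attach_preactivation_hooks and the (mean, std) summary are hypothetical choices, not the project's prescribed method.

```python
import torch
import torch.nn as nn

def attach_preactivation_hooks(model: nn.Module, stats: dict) -> list:
    """Record per-layer pre-activation statistics (mean, std) at every
    forward pass, treating nn.Linear outputs as pre-activations."""
    handles = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            def hook(mod, inputs, output, name=name):
                with torch.no_grad():
                    stats.setdefault(name, []).append(
                        (output.mean().item(), output.std().item())
                    )
            handles.append(module.register_forward_hook(hook))
    return handles

# Usage during training (hypothetical):
#   stats = {}
#   handles = attach_preactivation_hooks(model, stats)
#   ... run training steps; `stats` accumulates per-layer distributions ...
#   for h in handles:
#       h.remove()
```

Gradient distributions can be gathered analogously with register_full_backward_hook on the same layers.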

Evaluating Unlearning in Modern Large Language Models

In today’s world, it’s vital for AI models to be able to “forget” specific pieces of data they were trained on, whether for privacy, security, or error correction. However, a major challenge is determining whether a model has truly forgotten the information.

In this project, you will be introduced to a new method for evaluating unlearning. The method generates embedding-proximity perturbations by replacing tokens with their nearest neighbors in embedding space, then analyzes features of the resulting Input Loss Landscape (ILL) as a sensitive indicator of residual memorization.
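To make the perturbation step concrete, here is a minimal sketch under assumed conventions: a Hugging Face causal LM whose input-embedding matrix defines the neighbor space, cosine similarity as the distance, and single-token substitutions at each position. The function name and the choice of k are illustrative, and the ILL feature extraction itself is the project's subject and is not sketched here.

```python
import torch

@torch.no_grad()
def nearest_neighbor_perturbations(model, input_ids: torch.Tensor, k: int = 5):
    """For each position, swap the token with its k nearest neighbors in
    embedding space and record the loss of the perturbed sequence."""
    emb = model.get_input_embeddings().weight                  # (vocab, dim)
    emb_norm = torch.nn.functional.normalize(emb, dim=-1)
    records = []
    for pos in range(input_ids.size(1)):
        token_id = input_ids[0, pos]
        sims = emb_norm @ emb_norm[token_id]                   # cosine similarity
        neighbors = sims.topk(k + 1).indices[1:]               # drop the token itself
        for nb in neighbors:
            perturbed = input_ids.clone()
            perturbed[0, pos] = nb
            loss = model(perturbed, labels=perturbed).loss     # causal-LM loss
            records.append((pos, nb.item(), loss.item()))
    return records
```

How the loss behaves in this neighborhood of the original input (e.g., how sharply it rises around tokens the model was supposed to forget) is what the ILL features would summarize.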

Various Projects in LLM Security

  • Supervisor(s): Amit Levi
  • Requirements: Basic familiarity with LLM frameworks (e.g., Hugging Face) and general deep learning principles
  • LLM_Projects_proposals
  • Status: available

Large Language Models (LLMs) are at the forefront of AI research, enabling advanced capabilities in text generation, understanding, and reasoning. However, they are also susceptible to adversarial attacks and misalignment issues. These projects will allow you to explore cutting-edge methodologies, from building robust defenses and new optimization strategies to crafting innovative applications such as educational platforms and deepfake detection. Each project is designed to enrich both your research acumen and your practical engineering skills.